Re: Idea: Authority-declared sub-syntax for URL paths

Randall,

Just two questions to clarify:

1. How do you plan to classify path formats?
2. Wouldn't it require changing RFC 3986?

(Broadly speaking, any additional information about URI 
processing can be placed in the path in the form ";param=value", 
as in 'ftp' URIs 
(http://tools.ietf.org/html/draft-yevstifeyev-ftp-uri-scheme-05#section-3.1).  
But unless you have an answer to question 1, the idea doesn't seem 
developed enough to be employed this way.)
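
To illustrate the ";param=value" form, here is a minimal Python sketch 
that pulls such parameters off the last path segment.  The function 
name split_path_params and the example host are made up for 
illustration; ";type=a" follows the typecode convention used by 'ftp' 
URIs.  It is only a sketch and does not handle parameters on earlier 
segments.

```python
from urllib.parse import urlsplit, unquote


def split_path_params(uri):
    """Return (path_without_params, params) for the last path segment.

    Hypothetical helper: parameters are everything after the first ';'
    in the final segment, written as name=value pairs.
    """
    # urlsplit (unlike urlparse) leaves ";params" inside the path.
    path = urlsplit(uri).path
    head, sep, last = path.rpartition("/")
    segment, *raw_params = last.split(";")
    params = {}
    for item in raw_params:
        name, _, value = item.partition("=")
        params[unquote(name)] = unquote(value)
    return head + sep + segment, params


path, params = split_path_params("ftp://ftp.example.org/pub/readme.txt;type=a")
print(path, params)
```

Any declaration about path syntax could in principle be carried the 
same way, but the receiver would still need to know which parameter 
names to expect.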

Mykyta

08.08.2011 7:56, Randall Sawyer wrote:
>
> Hello, All!
>
> Only recently have I stumbled upon the need to parse and normalize 
> URLs for a couple of projects I'm working on.  In doing my research - 
> including reading all of RFC 3986 and part of A. Barth's "Parsing URLs 
> for Fun and Profit" - I am frustrated by the amount of effort 
> required to anticipate and correct malformed URLs.  I have a 
> suggestion as to how content-providers and client-developers may 
> voluntarily make their services and products work better together.  [I 
> have searched the archives for something like this, and have not found 
> any so far.]
>
> What I have in mind is something comparable to SGML/XML validation.  
> Just as a *ML document may contain a declaration at the top stating 
> that it is compliant with a specific template, what if we made it 
> possible for an organization to declare that every existing path on 
> their site is compliant with a specific path-syntax template?
>
> Imagine going to visit a city - and instead of just running in 
> headlong, hoping you'll be able to catch on to the local customs - you 
> first pause at the gates long enough to read the placard listing the 
> local customs.
>
> The former case is very much like the status quo of parsing and 
> correcting each path segment, hoping for success.  If a browser - on 
> the other hand - were provided with a set of guidelines as to the 
> characteristics of a normalized path on that site, then computation 
> time would decrease, and access to content would be easier.
>
> I already anticipate some issues:
> 1)  Where to put the placard, and what to name it.  These need to be 
> the same for every site - or perhaps some universally named meta-data 
> pointing TO the placard. [By 'placard', I mean path-syntax-template]
>
> 2)  Declared compliance is not the same as actual compliance - same 
> goes for an *ML file, though.  That is the responsibility of the 
> author(ity).
>
> 3)  What if a content-provider decides to opt for a path syntax which 
> covers MOST, but NOT ALL, of its existing paths?  The template then 
> would need to also include a list of exceptional paths (perhaps using 
> a wildcard if the offending path is an upper level directory).
>
> Any thoughts?  Is this desirable?  Would it potentially interfere with 
> existing protocols or standards?
>
> Randall
>
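
To make the 'placard' idea from the quoted message concrete, here is a 
rough Python sketch of a path-syntax template with a list of 
exceptional paths, as in issue 3 above.  Everything in it is an 
assumption made for illustration: the PLACARD structure, the segment 
pattern, the exception entries and the function name classify are not 
taken from any existing standard or proposal.

```python
import re
from fnmatch import fnmatchcase

# Hypothetical placard: one regular expression that every path segment
# must match, plus wildcard patterns for paths exempt from the template.
PLACARD = {
    "segment": r"[a-z0-9-]+",
    "exceptions": ["/legacy/*", "/Index.html"],
}


def classify(path, placard=PLACARD):
    """Return 'exempt', 'conforms' or 'violates' for a URL path."""
    # Exceptions are checked first, so a listed path is never judged
    # against the template.  '*' also matches '/' here, which covers
    # everything below an upper-level directory.
    if any(fnmatchcase(path, pattern) for pattern in placard["exceptions"]):
        return "exempt"
    segment_re = re.compile(placard["segment"])
    segments = [s for s in path.split("/") if s]
    if all(segment_re.fullmatch(s) for s in segments):
        return "conforms"
    return "violates"


for p in ("/docs/url-parsing", "/Docs/URL_Parsing", "/legacy/Old_Page.HTM"):
    print(p, classify(p))
```

A client could only skip its own corrections for paths reported as 
conforming, and, as issue 2 notes, only to the extent that the declared 
template matches what the site actually serves.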

Received on Monday, 8 August 2011 08:52:27 UTC