Idea: Authority-declared sub-syntax for URL paths

From: Randall Sawyer <srandallsawyer@gmail.com>
Date: Sun, 7 Aug 2011 21:56:55 -0700
Message-ID: <CACJYzX32b5VeLhULSj-eLnLn94ZZu-nmXgAfkasd1ZDMSDc5UA@mail.gmail.com>
To: public-iri@w3.org

Hello, All!

Only recently have I stumbled upon the need to parse and normalize URLs for
a couple of projects I'm working on.  In doing my research - including
reading all of RFC 3986 and part of A. Barth's "Parsing URLs for Fun and
Profit" - I have been frustrated by the amount of effort required to anticipate
and correct malformed URLs.  I have a suggestion as to how content-providers
and client-developers may voluntarily make their services and products work
better together.  [I have searched the archives for something like this, and
have not found any so far.]

What I have in mind is something comparable to SGML/XML validation.  Just as
a *ML document may contain a declaration at the top stating that it is
compliant with a specific template, what if we made it possible for an
organization to declare that every existent path on their site is compliant
with a specific path-syntax template?

Imagine going to visit a city - and instead of just running in headlong,
hoping you'll be able to catch on to the local customs - you first pause at
the gates long enough to read the placard listing the local customs.

The former case is very much like the status quo of parsing and correcting
each path segment, hoping for success.  If a browser - on the other hand -
were provided with a set of guidelines as to the characteristics of a normalized
path on that site, then computation time decreases, and access to content is

I already anticipate some issues:
1)  Where to put the placard, and what to name it.  These need to be the
same for every site - or perhaps some universally named meta-data pointing
TO the placard. [By 'placard', I mean path-syntax-template]

2)  Declared compliance is not the same as actual compliance - same goes for
an *ML file, though.  That is the responsibility of the author(ity).

3)  What if a content-provider decides to opt for a path syntax which covers
MOST, but NOT ALL, of its existing paths?  The template then would need to
also include a list of exceptional paths (perhaps using a wildcard if the
offending path is an upper level directory).
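
To make issues 2 and 3 a little more concrete, here is a rough sketch in Python of how a client might consult a placard.  Everything in it is an assumption for illustration only: the Placard class, the idea of expressing the template as one regular expression per path segment, and the '*' wildcard in the exception list are hypothetical, and nothing like this is defined by RFC 3986 or any existing standard.

```python
import re
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Placard:
    """Hypothetical path-syntax template a site might publish."""
    segment_pattern: str    # regex that every path segment must fully match
    exceptions: tuple = ()  # exempt paths; '*' matches any run of characters

    def classify(self, path: str) -> str:
        # Exceptional paths (issue 3) are checked first. Note that fnmatch
        # also gives '?' and '[...]' special meaning; a real format would
        # have to define its own wildcard rules.
        if any(fnmatchcase(path, pat) for pat in self.exceptions):
            return "exception"
        trimmed = path.strip("/")
        segments = trimmed.split("/") if trimmed else []
        seg_re = re.compile(self.segment_pattern)
        # Declared compliance is not actual compliance (issue 2), so a
        # client may still want to flag paths that break the declared syntax.
        if all(seg_re.fullmatch(seg) for seg in segments):
            return "conforms"
        return "violates"


# Example placard: lower-case segments with an optional extension, and
# everything under /legacy/ declared as exceptional.
placard = Placard(
    segment_pattern=r"[a-z0-9-]+(\.[a-z0-9]+)?",
    exceptions=("/legacy/*",),
)

print(placard.classify("/docs/getting-started.html"))  # conforms
print(placard.classify("/Docs/Getting Started.html"))  # violates
print(placard.classify("/legacy/Old Page.html"))       # exception
```

A client that sees "conforms" could skip its usual guess-and-correct steps, while "exception" or "violates" would send it back to the status-quo handling.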

Any thoughts?  Is this desirable?  Would it potentially interfere with
existing protocols or standards?

Received on Monday, 8 August 2011 04:57:23 UTC
