Re: TAG Issue proposal: URIs should not be hierarchical

From: Fernando Franco <avoid.spam.account@gmail.com>
Date: Mon, 28 Aug 2006 03:17:04 -0300
Message-ID: <001f01c6ca69$986090c0$eb8c31c8@enterprise>
To: "W3C-TAG" <www-tag@w3.org>
Cc: "Karl Dubost" <karl@w3.org>


----- Original Message ----- 
From: "Karl Dubost" <karl@w3.org>
To: "Fernando Franco" <avoid.spam.account@gmail.com>
Cc: "W3C-TAG" <www-tag@w3.org>
Sent: Monday, August 28, 2006 12:12 AM
Subject: Re: TAG Issue proposal: URIs should not be hierarchical



>>On 28 Aug 06, at 10:21, Fernando Franco wrote:
>> Today URIs can be something like
>> http://example.org/something/something/something
>> I reckon they should only be
>> http://example.org/something

>Karl replies:
>Why do you think this is hierarchical? Or more exactly, which kind of
>hierarchy are you thinking of?

The thread was originally to be called "URIs should not have a hierarchical
part", but I thought a shorter title would do. You are correct, though: it
needs disambiguation, my mistake. I was thinking specifically about the path
(more below).

Btw, "path" is a curious choice of word, isn't it?
I don't know whether there is a solid rationale for it or whether it is due
to historical reasons. I suspect an association with (hierarchical)
filesystems. I should research it a bit more, but my worry is whether it
compromises the location-independence aspect of things in end users'
perception.


>First, it seems the issue described above is about "HTTP URIs" and
>not URIs.

If all URIs are names, I think "URIs should not have a path part" would
apply to all URIs.

>Let's take an example
>http://example.org/life/is/beautiful/
>can perfectly be equivalent to
>http://example.org/is/beautiful/life/
>or even
>http://example/pathhasnomeaning/

>How? for example in Apache
>RewriteEngine On
>RewriteRule ^life/is/beautiful(.*) /is/beautiful/life$1
>RewriteRule ^life/is/beautiful(.*) /lifehasnomeaning$1

>The fact that you can read a URI and that you can put slash into it
>doesn't have more meaning than anything else.
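
To make the quoted rewriting idea concrete, here is a minimal Python sketch of a path being mapped onto a differently shaped one. It is only an illustration of the idea, not of how Apache's mod_rewrite actually processes rules; the REWRITES table and the rewrite() function are hypothetical names invented for this example.

```python
import re

# Hypothetical rewrite table, loosely modelled on the first quoted RewriteRule.
# Each entry is (compiled pattern, replacement); the first match wins.
REWRITES = [
    (re.compile(r"^/life/is/beautiful(.*)$"), r"/is/beautiful/life\1"),
]

def rewrite(path):
    """Return the internal path that a requested path is mapped to."""
    for pattern, replacement in REWRITES:
        if pattern.match(path):
            return pattern.sub(replacement, path, count=1)
    # Paths that match no rule are served unchanged.
    return path

print(rewrite("/life/is/beautiful/"))
print(rewrite("/something/else"))
```

Under these assumptions, both /life/is/beautiful/ and /is/beautiful/life/ end up at the same internal resource, which is the equivalence the quoted example describes.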

Unfortunately I cannot recall in which spec, but I think I saw something
specifically declaring slashes as the method for denoting hierarchy. If that
is correct, then slashes *do* have a special characteristic (i.e.,
life-is-beautiful is not quite the same as life/is/beautiful).
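
One way to see the difference is to ask a generic URI library how it treats the two forms. The sketch below uses Python's standard urllib.parse; the segments() helper and the relative reference "ugly" are made up for the illustration.

```python
from urllib.parse import urljoin, urlsplit

def segments(uri):
    """Return the non-empty path segments of a URI, split on slashes."""
    return [s for s in urlsplit(uri).path.split("/") if s]

slashed = "http://example.org/life/is/beautiful"
hyphenated = "http://example.org/life-is-beautiful"

print(segments(slashed))             # ['life', 'is', 'beautiful']
print(segments(hyphenated))          # ['life-is-beautiful']

# A relative reference replaces only the last slash-delimited segment,
# so the two forms resolve differently.
print(urljoin(slashed, "ugly"))      # http://example.org/life/is/ugly
print(urljoin(hyphenated, "ugly"))   # http://example.org/ugly
```

The hyphens are opaque to the library, while the slashes change how relative references resolve.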

I understand the rewriting mechanism, thanks. And I have nothing against
a webmaster who wishes to keep thinking in terms of hierarchies
*internally*, as long as all URIs exposed to the world do not contain a
path part.
Unfortunately, the existence of the rewriting mechanism does not, by itself,
guarantee that. And hence end users get URIs (names) with unnecessary
structure.

>> Please see:
>> http://lists.w3.org/Archives/Public/www-tag/2006Aug/0066.html
>> (which contains the true rationale)
>>
>> I did check the specs, btw, and a lot of them mention the
>> hierarchical thing in no uncertain terms.
>>
>> This contains an extra paragraph about hierarchies-classifications
>> and their problems in xml:
>> http://lists.w3.org/Archives/Public/www-tag/2006Aug/0086.html


>It is explained in "1.2.3.  Hierarchical Identifiers". The hierarchy
>in HTTP URIs is not the "sequences of slashes" but the sequence of  parts.

Yes, I was not referring to the scheme, authority, path, query, fragment
hierarchy, but just to the path.

>Then you could argue that the hierarchy is explicitly described in
>RFC 3986,

>[[[
>The path segments "." and "..", also known as dot-segments, are
>defined for relative reference within the path name hierarchy. They
>are intended for use at the beginning of a relative-path reference
>(Section 4.2) to indicate relative position within the hierarchical
>tree of names. This is similar to their role within some operating
>systems' file directory structures to indicate the current directory
>and parent directory, respectively. However, unlike in a file system,
>these dot-segments are only interpreted within the URI path hierarchy
>and are removed as part of the resolution process (Section 5.2).
>]]] -- http://www.ietf.org/rfc/rfc3986.txt
>        Mon, 24 Jan 2005 20:28:56 GMT
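
The dot-segment behaviour described in the quoted passage can be observed with Python's standard urllib.parse.urljoin, which resolves relative references against a base URI. This is a small sketch; the base URI and the "wonderful" segment are invented for the example.

```python
from urllib.parse import urljoin

# Hypothetical base URI; the trailing slash makes "beautiful" the current level.
base = "http://example.org/life/is/beautiful/"

current = urljoin(base, ".")               # stay at the current level
parent = urljoin(base, "..")               # step up one level
sibling = urljoin(base, "../wonderful/")   # step up, then down again

print(current)   # http://example.org/life/is/beautiful/
print(parent)    # http://example.org/life/is/
print(sibling)   # http://example.org/life/is/wonderful/
```

None of the resolved URIs contains "." or "..": the dot-segments are consumed during resolution, as the quoted text says.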

>The slash in a path gives possibility to map a "hierarchical tree of
>names", it is a convenience it doesn't have a specific meaning more
>than the one of organizing information space for the URIs owner.

But I do seem to recall that slashes do have a specific meaning, associated
with hierarchies. And since names are not hierarchical, this brings confusion.
Please note that among my arguments are:
+ Names are not hierarchical (and that is a reason, per se, regardless of
anything else)
+ Same as in XML, adding *one* organization means either leaving out all
other possible ones or creating a lot of URIs for the same resource, one
for each possible organization. Is this desirable?
+ I suggest that, other than the word "locator" in "URL", the other main
reason people think of URIs as locations instead of names is the existence
of paths (regardless of whether they are correct in doing so or not).
+ With the advent of the semantic web, we can have all organizations and
relationships via metadata. No need to push one (and only one) into URIs (or
to create tons of URIs for the same resource).


>For example, an analogy with alphabet, there is no specific meaning of
>having A, B, C, D, E, F, etc.
>Though it's quite practical to create a dictionary of words, and be
>able to access the information. The hierarchy of the alphabet has no
>specific meaning, it doesn't follow a rule of an encyclopedic
>organization of the world.

This example doesn't quite hold, imho, because it is an ordering, not a
hierarchy. No things-within-things.

>Chinese characters will have different classification orders to make
>them easier to find by number of strokes or by radical characters.

This one does, though.

>Again it has no meaning, it is just a convenience.

But is it really "convenient"?
Names are not trees. And we do not represent ontologies via trees, but via
graphs.
Graphs can be navigated any way the user prefers, not just one given by a
webmaster.
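
As a rough illustration of that point, the Python sketch below keeps the names flat and puts the organization into metadata, so the same resources can be grouped along whichever property the user picks. Everything in it is hypothetical: the URIs, the "topic" and "mood" properties, and the index_by() helper are made up, and a real semantic web setup would use RDF tooling rather than a plain list.

```python
from collections import defaultdict

# Hypothetical path-less names with hypothetical metadata,
# stored as (subject, property, value) triples.
TRIPLES = [
    ("http://example.org/a1", "topic", "life"),
    ("http://example.org/a1", "mood", "beautiful"),
    ("http://example.org/b2", "topic", "life"),
    ("http://example.org/b2", "mood", "hard"),
]

def index_by(prop):
    """Group resource names by the value of one metadata property."""
    groups = defaultdict(list)
    for subject, p, value in TRIPLES:
        if p == prop:
            groups[value].append(subject)
    return dict(groups)

print(index_by("topic"))
print(index_by("mood"))
```

Neither grouping is privileged, and neither one has to be encoded in the names themselves.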

>the ".", ".." and "/" is a mechanism to "navigate" an information
>space that the URIs owner has chosen to organize.

See above.

>At least the way I interpret it.

Thanks, Karl. :)
Received on Monday, 28 August 2006 06:17:36 GMT