Re: Problems I cannot get past with using relative URIs for identity. from John Cowan on 2000-05-19 (xml-uri@w3.org from May 2000)

From: John Cowan <jcowan@reutershealth.com>
Date: Fri, 19 May 2000 10:01:32 -0400
To: Ray Whitmer <ray@xmission.com>, "xml-uri@w3.org" <xml-uri@w3.org>
Message-ID: <3925493C.DC847D@reutershealth.com>

Ray Whitmer wrote (slightly reformatted):

> For example, from a user's point of view, and probably most developers as well,
>
> 1)	http://first.com/second/third/fourth.html
>
> with a relative reference to
>
> 2)	../fifth/sixth/seventh.html
>
> is the same as a reference to
>
> 3)	http://first.com/second/fifth/sixth.com.

I assume you mean:

3)	http://first.com/second/fifth/sixth/seventh.html

> In this case, to use the argument of
> the day, the relative reference is a legal URI, but the RFC would resolve it,
> I suspect, as
>
> 4)	http://first.com/second/third/../fifth/sixth/seventh.html,
>
> which is not at all the same identity.

No, that is false.  The RFC (which I do wish people would read, particularly
clause 5 and Appendix C) prescribes the process in detail.  After #4 is generated,
the rule specified in 5.2.6.e is applied, which causes the string "third/../"
to be removed from #4, leaving #3. (Any process giving equivalent results
is acceptable, of course.)  Therefore, it is #3 that is sent to the server.
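
The resolution described above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: merge_path is a
hypothetical helper that handles the relative-path case with a segment
stack rather than the RFC's literal string removal, and the standard
library's urllib.parse.urljoin is used as a cross-check, not as the
RFC's reference implementation.

```python
from urllib.parse import urljoin

BASE = "http://first.com/second/third/fourth.html"   # URI #1
REF = "../fifth/sixth/seventh.html"                  # relative reference #2


def merge_path(base_path: str, rel_path: str) -> str:
    """Hypothetical sketch of the path-merging step for a relative path."""
    # Keep the base path up to and including its last "/", then append the
    # reference. For the example this yields the path of #4.
    buffer = base_path[: base_path.rfind("/") + 1] + rel_path
    segments = buffer.split("/")
    out = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg == ".":
            # "." segments are dropped; a trailing one leaves a trailing "/".
            if is_last:
                out.append("")
        elif seg == ".." and len(out) > 1 and out[-1] != "..":
            # "<segment>/.." cancels out, which is how "third/../" disappears.
            out.pop()
            if is_last:
                out.append("")
        else:
            out.append(seg)
    return "/".join(out)


print(merge_path("/second/third/fourth.html", REF))
print(urljoin(BASE, REF))
```

Both calls arrive at the path of #3, so the string sent to the server
never contains "..".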

If #4 *were* to be sent to the server, the results would be technically
unpredictable, since ".." and "." have no magic meaning in absolute URIs.
Of course, since most servers are either Unix or Windows NT, they will be
given the usual effect in practice, but servers based on other operating
systems may or may not give effect to the "..".

> I think it would be improper for a
> theoretical modified namespace spec which dictated absolutization of relative
> URI references to mandate that these be treated as distinct names, when the
> server is allowed and expected by all to return the same resource.

The principle is that absolutization is done without reference to the
net or other parts of the Real World.  There may and will be URIs
that would retrieve the same content, by reason of being ftp: rather than
http:, or using a different hostname, or whatever, that will not be
recognized as the same as #1.
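
A small Python sketch of that principle, assuming names are compared as
plain strings after absolutization. The function same_name and the ftp:
URI are hypothetical and exist only for illustration; nothing here
touches the network.

```python
from urllib.parse import urljoin


def same_name(a: str, b: str) -> bool:
    """Hypothetical identity test: plain string comparison, no retrieval."""
    return a == b


# Absolutize the relative reference against its base without any lookup.
resolved = urljoin("http://first.com/second/third/fourth.html",
                   "../fifth/sixth/seventh.html")

via_http = "http://first.com/second/fifth/sixth/seventh.html"
# Assumed for illustration: the same file is also reachable over ftp.
via_ftp = "ftp://first.com/second/fifth/sixth/seventh.html"

print(same_name(resolved, via_http))  # True: identical strings
print(same_name(resolved, via_ftp))   # False, whatever the servers return
```

The comparison is cheap and stable precisely because it ignores what any
server would return.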

> If these
> are, indeed references, then the server should be permitted to have the last
> say on identity.

If that were true, it would indeed be too expensive to use URIs as namespace
names, and in fact unstable.

> 2.  The flexibility of URIs does not seem to extend to absolutization.
> RFC-specified absolutization favors legacy unix file path syntax, throwing
> away CGI parameters, for example.

This is also false.  If a "?parameters" part is present in either the base URI
or the relative URI reference, it will appear in the result as well.
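
As a quick check, here is a Python sketch using the standard library's
urllib.parse.urljoin. It covers only the case where the query is on the
relative reference, and the "id=42" parameter is made up for the
example.

```python
from urllib.parse import urljoin, urlsplit

base = "http://first.com/second/third/fourth.html"
# Hypothetical relative reference carrying a CGI-style query.
ref = "../fifth/sixth/seventh.html?id=42"

result = urljoin(base, ref)
print(result)
print(urlsplit(result).query)  # the query part survives absolutization
```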

> 5.  Once you have convinced everyone that namespaces should access resources,
> how do you get consensus about what the function of the resource should be?

Again, distinguish between an entity body (a byte sequence with associated
media type) and a resource.  The URI locates or names the resource, which
may be associated with varying entity bodies or with none.  As such,
the URI can be looked up in a database to determine associated resources.

> If I were doing this, I would get the same functionality from a
> separate PI, and not destroy the abstractness / generality of the data.

(The WG tried that.  Didn't fly.)

-- 

Schlingt dreifach einen Kreis um dies! || John Cowan <jcowan@reutershealth.com>
Schliesst euer Aug vor heiliger Schau,  || http://www.reutershealth.com
Denn er genoss vom Honig-Tau,           || http://www.ccil.org/~cowan
Und trank die Milch vom Paradies.            -- Coleridge (tr. Politzer)
Received on Friday, 19 May 2000 10:01:50 UTC