- From: David G. Durand <david@dynamicdiagrams.com>
- Date: Wed, 14 Jun 2000 18:31:10 -0400
- To: xml-uri@w3.org
> > You get http://www.w3c.org/pop_empty_stack
>
>You might get that
>
>> The URI specification handles that. You can not pop off a part between //
>> and the next /
>
>True, but you can raise an error, or you might keep the ../ at least
>that's what the rfc says:
>
> g) If the resulting buffer string still begins with one or more
> complete path segments of "..", then the reference is
> considered to be in error. Implementations may handle this
> error by retaining these components in the resolved path (i.e.,
> treating them as part of the final URI), by removing them from
> the resolved path (i.e., discarding relative levels above the
> root), or by avoiding traversal of the reference.
>
In other words, even under absolutization, identity is not well
defined for all BASE/URI-reference pairs, as the result of some
absolutizations is undefined.
Given the fact that the base URI is mutable, this makes equality of
absolutized URIs a rather shaky basis for universal identification of
element types.
This is interesting, as this indeterminacy in the RFC had not leaped
out at me previously.
This seems like a knock-down, drag-out argument that URI
absolutization is not well enough defined to enable namespace
recognition, or even document correctness checking, since a parser
must have knowledge of the correct URI for a document to validate it.
For instance, I might want to make sure that my namespace
declarations are correct, but kick off my parser in my public_html
directory. The document would be valid there, as the relative URI
"../../../foo.namespace" is legal against the file's BASE URI,
file:///usr/users/dgd/public_html/.
Unfortunately, the relevant server BASE URI,
http://example.server/~dgd/, leads to an error, and no well-defined
namespace.
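To make the three outcomes concrete, here is a minimal sketch of the
permitted behaviours from step (g). It is an illustration, not a full
RFC 2396 resolver: the function name absolutize and the policy labels
"retain", "discard" and "refuse" are hypothetical, it handles only
simple relative-path references, and mapping "avoiding traversal" to
raising an exception is my own reading.

```python
from urllib.parse import urlsplit

REF = "../../../foo.namespace"


def absolutize(base, ref, policy):
    """Resolve a simple relative-path reference against base.

    policy picks one of the behaviours step (g) permits when ".."
    segments climb above the root: "retain", "discard" or "refuse"
    (hypothetical labels, not taken from the RFC).
    """
    b = urlsplit(base)
    # Base path minus its last segment, followed by the reference segments.
    segments = b.path.split("/")[1:-1] + ref.split("/")
    out, excess = [], 0
    for seg in segments:
        if seg == ".":
            continue
        if seg == "..":
            if out:
                out.pop()      # ordinary case: go up one level
            else:
                excess += 1    # nothing left to pop: above the root
        else:
            out.append(seg)
    if excess:
        if policy == "refuse":
            raise ValueError(f"{ref!r} climbs above the root of {base!r}")
        if policy == "retain":
            out = [".."] * excess + out
        # policy == "discard": silently drop the excess ".." segments
    return f"{b.scheme}://{b.netloc}/" + "/".join(out)


if __name__ == "__main__":
    for base in ("file:///usr/users/dgd/public_html/",
                 "http://example.server/~dgd/"):
        for policy in ("retain", "discard", "refuse"):
            try:
                print(base, policy, absolutize(base, REF, policy))
            except ValueError as err:
                print(base, policy, "ERROR:", err)
```

Against the file: base all three policies agree, since the reference
never leaves the root. Against the server base each policy gives a
different answer, so two conforming processors need not agree on the
namespace name.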
This is just a variation on the old problem of relative URIs being
breakable, but it's worse because this is a problem that can't be
reliably checked for, since the URI of a document is a variable thing.
However, in this case it affects the validity of the document, and
not just the results of processing it. This problem is fixable, but
only by adding additional verbiage to the RFC, explaining how (for
namespace purposes _ONLY_) URI processing must work deterministically
in this case, overriding:
>
> g) If the resulting buffer string still begins with one or more
> complete path segments of "..", then the reference is
> considered to be in error. Implementations may handle this
> error by retaining these components in the resolved path (i.e.,
> treating them as part of the final URI), by removing them from
> the resolved path (i.e., discarding relative levels above the
> root), or by avoiding traversal of the reference.
>
And of course, from a practical point of view, this is a nightmare.
Going back to the example, the real intended base URI is in fact the
server alias:
http://example.server/namespaces/user-defined/dgd/
which is also mapped to the same directory in the file system.
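As a rough illustration of how the base changes the name, the sketch
below resolves the same reference against the three base URIs from
this message using Python's standard urllib.parse.urljoin. Whichever
step (g) policy urljoin applies depends on the installed Python
version, so only the base-dependence is being shown here.

```python
from urllib.parse import urljoin

REF = "../../../foo.namespace"

# The same file on disk, reachable under three different base URIs.
BASES = [
    "file:///usr/users/dgd/public_html/",
    "http://example.server/~dgd/",
    "http://example.server/namespaces/user-defined/dgd/",
]

# Map each base to the absolute namespace name it would produce.
names = {base: urljoin(base, REF) for base in BASES}

for base, name in names.items():
    print(f"{base} -> {name}")
```

The document's bytes are identical in every case; only the base under
which it was fetched differs, yet the file: base and the alias base
yield different absolute names.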
This context stuff is capable of burning one in many different ways.
Mixing it with a unique-identification function is just asking for
trouble.
-- David
--
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
http://cs-people.bu.edu//dgd/ \ Chief Technical Officer
Graduate Student no more! \ Dynamic Diagrams
--------------------------------------------\ http://www.dynamicDiagrams.com/
\__________________________
Received on Wednesday, 14 June 2000 18:39:37 UTC