- From: David G. Durand <david@dynamicdiagrams.com>
- Date: Wed, 14 Jun 2000 18:31:10 -0400
- To: xml-uri@w3.org
> > You get http://www.w3c.org/pop_empty_stack
>
>You might get that
>
>> The URI specification handles that. You can not pop off a part between //
>> and the next /
>
>True, but you can raise an error, or you might keep the ../ at least
>that's what the rfc says:
>
>   g) If the resulting buffer string still begins with one or more
>      complete path segments of "..", then the reference is
>      considered to be in error. Implementations may handle this
>      error by retaining these components in the resolved path (i.e.,
>      treating them as part of the final URI), by removing them from
>      the resolved path (i.e., discarding relative levels above the
>      root), or by avoiding traversal of the reference.
>

In other words, even under absolutization, identity is not well defined for all BASE/URI-reference pairs, as the result of some absolutizations is undefined. Given the fact that the base URI is mutable, this makes equality of absolutized URIs a rather shaky basis for universal identification of element types.

This is interesting, as this indeterminacy in the RFC had not leaped out at me previously. This seems like a knock-down, drag-out argument that URI absolutization is not well enough defined to enable namespace recognition, or even document correctness checking, since a parser must have knowledge of the correct URI for a document to validate it.

For instance, I might want to make sure that my namespace declarations are correct, but kick off my parser in my public_html directory. It would be valid, as the relative URI "../../../foo.namespace" is legal for the file:///usr/users/dgd/public_html/ BASE URI for the file. Unfortunately, the relevant server BASE URI http://example.server/~dgd/ leads to an error, and no well-defined namespace.

This is just a variation on the old problem of relative URIs being breakable, but it's worse because this is a problem that can't be reliably checked for, since the URI of a document is a variable thing.
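To make the two cases concrete, here is a minimal sketch in Python of the merge and dot-segment removal from the RFC's resolution algorithm, with step (g) made an explicit choice. The function name `resolve` and the policy names "retain", "discard" and "error" are hypothetical labels for the three behaviours the RFC permits, and the sketch only handles plain relative paths (no scheme, query or fragment in the reference).

```python
from urllib.parse import urlsplit


def resolve(base, ref, policy):
    """Resolve the relative path `ref` against the absolute URI `base`.

    `policy` (hypothetical name) selects what to do with ".." segments
    that would climb above the root: "retain", "discard" or "error".
    """
    parts = urlsplit(base)
    # Drop everything after the last "/" of the base path, then append ref.
    merged = parts.path[: parts.path.rfind("/") + 1] + ref
    segments = merged.split("/")[1:]

    out = []      # resolved path segments
    excess = 0    # ".." segments left over once the root was reached
    for seg in segments:
        if seg == ".":
            continue
        if seg == "..":
            if out:
                out.pop()
            else:
                excess += 1
            continue
        out.append(seg)
    # A trailing "." or ".." names a directory, so keep the final slash.
    if segments and segments[-1] in (".", ".."):
        out.append("")

    if excess:
        if policy == "error":
            raise ValueError(f"{ref!r} climbs above the root of {base!r}")
        if policy == "retain":
            out = [".."] * excess + out
        # "discard": the excess ".." segments are simply dropped.

    return f"{parts.scheme}://{parts.netloc}/" + "/".join(out)


REF = "../../../foo.namespace"

# Same document, same reference, parsed from the file system: well defined.
print(resolve("file:///usr/users/dgd/public_html/", REF, "error"))
# file:///usr/foo.namespace

# Same document seen through the server: the answer depends on the policy.
print(resolve("http://example.server/~dgd/", REF, "retain"))
# http://example.server/../../foo.namespace
print(resolve("http://example.server/~dgd/", REF, "discard"))
# http://example.server/foo.namespace
```

With the "error" policy the second base raises instead of returning anything, so three conforming implementations can disagree on whether the namespace name exists at all, and on what it is when it does.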
However, in this case, it affects the validity of the document, and not just the results of processing it.

This problem is fixable, but only by adding additional verbiage to the RFC, explaining how (for namespace purposes _ONLY_) URI processing must work deterministically in this case, overriding:

>
>   g) If the resulting buffer string still begins with one or more
>      complete path segments of "..", then the reference is
>      considered to be in error. Implementations may handle this
>      error by retaining these components in the resolved path (i.e.,
>      treating them as part of the final URI), by removing them from
>      the resolved path (i.e., discarding relative levels above the
>      root), or by avoiding traversal of the reference.
>

And of course, from a practical point of view, this is a nightmare. Going back to the example, the real intended base URI is in fact the server-alias:

http://example.server/namespaces/user-defined/dgd/

which is also mapped to the same directory in the file system.

This context stuff is capable of burning one in many different ways. Mixing it with a unique-identification function is just asking for trouble.

  -- David
--
_________________________________________
David Durand              dgd@cs.bu.edu  \  david@dynamicDiagrams.com
http://cs-people.bu.edu//dgd/             \  Chief Technical Officer
    Graduate Student no more!              \  Dynamic Diagrams
--------------------------------------------\  http://www.dynamicDiagrams.com/
                                             \__________________________
Received on Wednesday, 14 June 2000 18:39:37 UTC