- From: Kay, Michael <Michael.Kay@softwareag.com>
- Date: Tue, 5 Aug 2003 17:04:00 +0200
- To: uri@w3.org
- Message-ID: <DFF2AC9E3583D511A21F0008C7E62106073DD02F@daemsg02.software-ag.de>
This is clearly a great improvement on RFC 2396. It is disappointing, but not really surprising, that the document still contains so many words like "should", "recommended", "unwise", "generally counterproductive", "discouraged", and "abnormal", which all tend to give the impression that handling URIs is a black art rather than a precise science.

RELATIVE URI REFERENCES

The document retains one ambiguity from RFC 2396: is the zero-length string a valid relative URI reference? The ABNF syntax seems to suggest that it isn't, but sections 4.4 and 5.4.1 assign semantics to this case, saying this is an "abnormal" case which URI parsers "should" be capable of handling. I think that the use of "" as a relative self-reference should be treated as being wholly respectable. (What does "abnormal" actually mean?)

I'm disappointed to see that the term "current document" still appears in section 4.4, and is nowhere defined. In 5.4.2 it appears in residual form as "current base URI". Are "current document" and "current base URI" the same thing as "the resource identified by the base URI"? If so, say so.

The section heading of 4.2 is "Relative URI", but in fact a relative URI reference is not a URI, so this term should not be used.

In 4.4 the statement "the dereference should not result in a new retrieval" seems to contradict section 1.2.2, which strongly suggests that the semantics of the dereferencing operation are outside the scope of the RFC.

ESCAPING

It's much clearer now that a string is not a URI unless all the special characters have been properly escaped. Nevertheless, there is still some residual language that hints that the input to the escaping algorithm might also be referred to as a URI. 2.4.2 says "characters within a URI string are escaped". What exactly is a URI string? Similarly, "Once generated, a URI is always in an escaped form" hints that there are other circumstances in which a URI might not be in escaped form.
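
To make the distinction concrete, here is a minimal Python sketch. It is not taken from the draft: the sample path is hypothetical, and Python's urllib.parse is used only as one convenient percent-encoding implementation. It shows that the input to escaping is a different string from the escaped result, and that a processor has to know which of the two it holds, because escaping twice does not give the same answer as escaping once.

```python
from urllib.parse import quote, unquote

# Hypothetical unescaped input. It is not a URI as it stands, because it
# contains a space and the non-ASCII character "é".
unescaped = "/docs/my café.html"

# Percent-encode everything outside the unreserved set, keeping "/" as the
# path delimiter. Non-ASCII characters are encoded as UTF-8 octets.
escaped = quote(unescaped, safe="/")

# Unescaping the escaped form recovers the original string.
round_trip = unquote(escaped)

# Escaping is not idempotent: applied to an already escaped string, the "%"
# of each existing escape is itself escaped as "%25".
double_escaped = quote(escaped, safe="/")

print(escaped)         # /docs/my%20caf%C3%A9.html
print(double_escaped)  # /docs/my%2520caf%25C3%25A9.html
```

Only the string in `escaped` could properly be called a URI (or part of one) in the sense the draft now intends.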
It might be useful to define some formal term for the unescaped representation of a URI, for example a "URI-rendition", so that we can talk about this string without referring to it as a URI.

URI EQUIVALENCE

The discussion is useful, but it would also be useful to define a preferred (and named) default algorithm for comparing URIs that other specifications can refer to.

Michael Kay
Received on Tuesday, 5 August 2003 11:04:11 UTC