- From: Biron,Paul V <Paul.V.Biron@kp.org>
- Date: Tue, 19 Mar 2002 14:54:42 -0800
- To: "'www-xml-schema-comments'" <www-xml-schema-comments@w3.org>
The message below was posted to the TAG list and points out that we fail to say how equivalence comparisons are performed on anyURI (e.g., for checking a literal against an enumeration). I'd also note that we don't say anything of this kind about a lot of types (string, etc.). We rely on phrases like "if the {value} is in the value space...".

I suggest we do what we can in this regard as errata, and that a more formal approach to this be added to our candidate requirements for 1.1.

pvb

> -----Original Message-----
> From: David Orchard [SMTP:david.orchard@bea.com]
> Sent: Tuesday, March 19, 2002 10:05 AM
> To: www-tag@w3.org
> Subject: RE: "canonical" URIs
>
> TAG members,
>
> I don't see URI comparison officially listed as a TAG issue. I'd like
> Joseph/Stephen's issue added to the TAG issues list.
>
> Equivalence rules for URIs are defined by the URI scheme. HTTP has a
> section on URI comparison.
>
> However, XML does not have a default comparison function for the XML Schema
> anyURI data type. I think a reasonable approach would be to say that the
> default comparison function for anyURI is to use the HTTP URI comparison
> algorithm, but that it is overridable by any scheme.
>
> Cheers,
> Dave
>
> > -----Original Message-----
> > From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf Of Joseph Reagle
> > Sent: Tuesday, February 19, 2002 11:40 AM
> > To: www-tag@w3.org
> > Cc: PhillipHallam-Baker; xme; Merlin Hughes; duerst@w3.org
> > Subject: Re: "canonical" URIs
> >
> > Stephen has asked an interesting question below that I expect will be
> > important to any activity that uses URIs as identifiers in the context of
> > a semantic/security application: when are two URI variants considered
> > identical?
> >
> > My first impulse was to check the XML namespace spec, "[Definition:] URI
> > references which identify namespaces are considered identical when they are
> > exactly the same character-for-character." [a]
> >
> > [a] http://www.w3.org/TR/REC-xml-names/
> >
> > However, this could benefit from further specificity. What about the
> > following sort of issues?
> >
> >    The URI attribute identifies a data object using a URI-Reference,
> >    as specified by RFC2396 [URI]. The set of allowed characters for
> >    URI attributes is the same as for XML, namely [Unicode]. However,
> >    some Unicode characters are disallowed from URI references
> >    including all non-ASCII characters and the excluded characters
> >    listed in RFC2396 [URI, section 2.4]. However, the number sign (#),
> >    percent sign (%), and square bracket characters re-allowed in RFC 2732
> >    [URI-Literal] are permitted. Disallowed characters must be escaped as
> >    follows: ...
> >    http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/#sec-URI
> >
> > I spoke to TimBL briefly about the question; he enumerated many of the
> > places one might look for equivalence in the "URI stack" *while* stating
> > that clearly one wouldn't want to address all these layers for the
> > complexity and processing required:
> >
> >    URI spec
> >       string = string
> >    HTTP DNS
> >       W3.org = w3.org
> >    DNS LOOKUP
> >       www.w3.org <-- CNAME -- w3.org
> >    HTTP REDIRECT
> >       /foo --REDIRECT--> /foo/
> >    RDF
> >       /foo = /bar
> >
> > Consequently, character by character comparison is probably the most
> > straightforward approach -- assuming one addresses the character encoding
> > issues well.
> >
> > Stephen is presently using "absolute URIs" with RFC2396 equivalence (see
> > below). This seems fairly straightforward as well -- though it says, "if
> > the URI is case insensitive ..." I think it might be useful to specify
> > whether case *is* relevant or not for that app. Any thoughts?
> >
> > Also, my broader question to the TAG is, does this seem like a worthwhile
> > issue to address for all of our specifications? I also expect the
> > validation/augmentation of URIs of type anyURI in schema might also be
> > relevant to this question but haven't thought about it too carefully.
> >
> > [1] On Thursday 14 February 2002 06:01, Stephen Farrell wrote:
> > > ...
> > > The OASIS security committee's [1] SAML spec [2] is about access
> > > control. One of its messages is of the form "can fred see
> > > http://foo.com/stuff" with a minimal answer being "yes/no".
> > >
> > > Now, we're trying to figure a good way to tell implementors not
> > > to fall for the following scenario:
> > >
> > > Q: "can fred see http://foo.com/stuff" A: no
> > > Q: "can fred see HTTP://Foo.COM:80/stuff" A: no
> > > Q: "can fred see http://foo.com/otherstuff/../stuff" A: yes
> > >
> > > Which involves us in giving some guidance for a "canonical
> > > form" or URI, at least for the de-referencable via HTTP
> > > URLs.
> > >
> > > My best bet so far is the following:
> > >
> > > By the "canonical form" of a URI we mean an absolute URI (i.e. no
> > > relative URIs) which is the shortest of all the equivalent URI
> > > strings, where URI equivalence is defined according to [RFC2396].
> > > For example, the URI "http://foo.com:80/go/../go/to/" is not in
> > > canonical form, but "http://foo.com/go/to" is in canonical form.
> > > Note that if a URI is partly or entirely case-insensitive, then
> > > there will be more than one "canonical form" for that URI such
> > > that a case sensitive matching rule would consider that the
> > > strings differ (e.g. "HTTP://Foo.cOm/go/to" is "another" canonical
> > > form of the URL above).
> > >
> > > Ta,
> > > Stephen.
> > >
> > > [1] http://www.oasis-open.org/committees/security/
> > > [2] http://www.oasis-open.org/committees/security/docs/draft-sstc-core-25.pdf
> > > [RFC2396] ftp://ftp.isi.edu/in-notes/rfc2396.txt
> >
> > --
> > Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
> > W3C Policy Analyst                mailto:reagle@w3.org
> > IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature/
> > W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
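
The three queries in Stephen's scenario show how character-by-character comparison and scheme-aware equivalence can disagree. What follows is only a rough Python sketch of that difference, not an algorithm taken from any specification cited in this thread. The helpers normalize and remove_dot_segments are hypothetical names; the sketch assumes http/https URIs only, drops any userinfo, treats an empty path as "/", and does not touch percent-escapes or character encoding.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: only these schemes and their default ports are recognised.
DEFAULT_PORTS = {"http": 80, "https": 443}


def remove_dot_segments(path: str) -> str:
    """Resolve "." and ".." segments in an absolute path (simplified)."""
    segments = path.split("/")
    # A trailing "." or ".." leaves a directory, so keep the final slash.
    if segments[-1] in (".", ".."):
        segments.append("")
    out = []
    for seg in segments:
        if seg == "..":
            # Never pop the leading empty segment that stands for the root.
            if len(out) > 1:
                out.pop()
        elif seg != ".":
            out.append(seg)
    return "/".join(out)


def normalize(uri: str) -> str:
    """Hypothetical normaliser: lower-case scheme and host, drop the
    default port, and resolve dot segments. Userinfo is discarded."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port
    if port is None or port == DEFAULT_PORTS.get(scheme):
        netloc = host
    else:
        netloc = f"{host}:{port}"
    path = remove_dot_segments(parts.path) or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


queries = [
    "http://foo.com/stuff",
    "HTTP://Foo.COM:80/stuff",
    "http://foo.com/otherstuff/../stuff",
]

# Character-for-character comparison sees three distinct strings ...
distinct_as_strings = len(set(queries))
# ... while the normalised forms collapse to a single identifier.
distinct_normalized = len({normalize(q) for q in queries})
```

Under these assumptions the three query strings are pairwise different as strings but share one normalised form, which is the gap an access-control check would have to close before comparing. Whether such normalisation belongs in the anyURI comparison itself or in the application is the open question raised above.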
Received on Tuesday, 19 March 2002 18:15:30 UTC