- From: Joseph Reagle <reagle@w3.org>
- Date: Tue, 19 Feb 2002 14:39:59 -0500
- To: www-tag@w3.org
- Cc: PhillipHallam-Baker <pbaker@verisign.com>, xme <stephen.farrell@baltimore.ie>, Merlin Hughes <merlin@baltimore.ie>, duerst@w3.org
Stephen has asked an interesting question below that I expect will be
important to any activity that uses URIs as identifiers in the context of
a semantic/security application: when are two URI variants considered
identical?

My first impulse was to check the XML namespace spec:

  "[Definition:] URI references which identify namespaces are considered
  identical when they are exactly the same character-for-character." [a]

  [a] http://www.w3.org/TR/REC-xml-names/

However, this could benefit from further specificity. What about the
following sorts of issues?

  The URI attribute identifies a data object using a URI-Reference, as
  specified by RFC2396 [URI]. The set of allowed characters for URI
  attributes is the same as for XML, namely [Unicode]. However, some
  Unicode characters are disallowed from URI references including all
  non-ASCII characters and the excluded characters listed in RFC2396
  [URI, section 2.4]. However, the number sign (#), percent sign (%),
  and square bracket characters re-allowed in RFC 2732 [URI-Literal]
  are permitted. Disallowed characters must be escaped as follows: ...
  http://www.w3.org/TR/2002/REC-xmldsig-core-20020212/#sec-URI

I spoke to TimBL briefly about the question. He enumerated many of the
places one might look for equivalence in the "URI stack" *while* stating
that clearly one wouldn't want to address all these layers, given the
complexity and processing required:

  URI spec        string = string
  HTTP DNS        W3.org = w3.org
  DNS LOOKUP      www.w3.org <-- CNAME -- w3.org
  HTTP REDIRECT   /foo --REDIRECT--> /foo/
  RDF             /foo = /bar

Consequently, character-by-character comparison is probably the most
straightforward approach -- assuming one addresses the character encoding
issues well. Stephen is presently using "absolute URIs" with RFC2396
equivalence (see below). This seems fairly straightforward as well --
though it says, "if the URI is case insensitive ..." I think it might be
useful to specify whether case *is* relevant or not for that app. Any
thoughts?
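
To make the difference between the two approaches concrete, here is a
minimal Python sketch. It is only an illustration under stated
assumptions, not something taken from RFC2396 or the SAML draft: the
function names and the default-port table are hypothetical, and the
rules applied (lowercase the scheme and host, drop a default port,
resolve "." and ".." path segments) are just one plausible reading of
"equivalence" for http URIs. It ignores userinfo, IPv6 literals, and
%-escapes entirely.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the only default ports this sketch knows about.
DEFAULT_PORTS = {"http": 80, "https": 443}


def remove_dot_segments(path):
    """Resolve "." and ".." segments in an absolute path (starts with "/")."""
    segments = path.split("/")
    out = []
    for i, seg in enumerate(segments):
        is_last = i == len(segments) - 1
        if seg in (".", ".."):
            # ".." drops the previous segment, but never the leading "".
            if seg == ".." and len(out) > 1:
                out.pop()
            # A trailing "." or ".." leaves a directory, so keep the slash.
            if is_last:
                out.append("")
            continue
        out.append(seg)
    return "/".join(out)


def normalize(uri):
    """Return one normalized spelling of an absolute http(s) URI (sketch)."""
    parts = urlsplit(uri)            # urlsplit lowercases the scheme
    scheme = parts.scheme.lower()
    host = parts.hostname or ""      # .hostname is returned lowercased
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"
    path = remove_dot_segments(parts.path) or "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))


def same_by_characters(a, b):
    """Namespace-spec style comparison: character for character."""
    return a == b


def same_after_normalizing(a, b):
    """Comparison after the (assumed) normalization rules above."""
    return normalize(a) == normalize(b)
```

With the three variants from Stephen's message below,
same_by_characters() treats them as three different resources, while
same_after_normalizing() treats them as one. Note that this sketch keeps
a trailing slash and leaves the case of the path alone, so it does not
settle whether case is relevant for a given application.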

Also, my broader question to the TAG is, does this seem like a worthwhile
issue to address for all of our specifications? I also expect the
validation/augmentation of URIs of type anyURI in schema might also be
relevant to this question but haven't thought about it too carefully.

[1]
On Thursday 14 February 2002 06:01, Stephen Farrell wrote:
> ...
> The OASIS security committes's [1] SAML spec [2] is about access
> control. One of its messages is of the form "can fred see
> http://foo.com/stuff" with a minimal answer being "yes/no".
>
> Now, we're trying to figure a good way to tell implementors not
> to fall for the following scenario:
>
> Q: "can fred see http://foo.com/stuff" A: no
> Q: "can fred see HTTP://Foo.COM:80/stuff" A: no
> Q: "can fred see http://foo.com/otherstuff/../stuff" A: yes
>
> Which involves us in giving some guidance for a "canonical
> form" or URI, at least for the de-referencable via HTTP
> URLs.
>
> My best bet so far is the following:
>
> By the "canonical form" of a URI we mean an absolute URI (i.e. no
> relative URIs) which is the shortest of all the equivalent URI
> strings, where URI equivalence is defined according to [RFC2396].
> For example, the URI "http://foo.com:80/go/../go/to/" is not in
> canonical form, but "http://foo.com/go/to" is in canonical form.
> Note that if a URI is partly or entirely case-insensitive, then
> there will be more than one "canonical form" for that URI such
> that a case sensitive matching rule would consider that the
> strings differ (e.g. "HTTP://Foo.cOm/go/to" is "another" canonical
> form of the URL above).
>
>
> Ta,
> Stephen.
>
> [1] http://www.oasis-open.org/committees/security/
> [2]
> http://www.oasis-open.org/committees/security/docs/draft-sstc-core-25.pdf
> [RFC2396] ftp://ftp.isi.edu/in-notes/rfc2396.txt

--
Joseph Reagle Jr.                 http://www.w3.org/People/Reagle/
W3C Policy Analyst                mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair   http://www.w3.org/Signature/
W3C XML Encryption Chair          http://www.w3.org/Encryption/2001/
Received on Tuesday, 19 February 2002 14:40:06 UTC