- From: Tim Bray <tbray@textuality.com>
- Date: Mon, 14 Apr 2003 16:30:42 -0700
- To: WWW-Tag <www-tag@w3.org>, uri@w3.org
The TAG has two issues on its plate, http://www.w3.org/2001/tag/ilist#URIEquivalence-15 (essentially: when are two URIs considered equivalent?) and http://www.w3.org/2001/tag/ilist#IRIEverywhere-27 (essentially: what should we do about IRIs?).

I, with lots of TAG input, drafted some text on the comparison issue, most of which has now made it into the latest draft of the RFC2396 revision at http://www.apache.org/~fielding/uri/rev-2002/rfc2396bis.html.

We've spun our wheels on this quite a bit and failed to record much progress. However, I feel that we do have quite a bit of consensus lurking and we can move forward on these issues. I suspect that we agree on the following:

1. In response to the basic question asked by Jonathan Marsh et al in Issue 27, the TAG answers, first of all, "Yes". That is, we believe that it is important that Web identifiers be able to use non-ASCII characters natively and straightforwardly, and that the IRI work (see http://www.w3.org/International/iri-edit/) is sound and is making good progress. That said, the draft is not yet stable or finalized, and we agree with the concern addressed by Marsh et al about the risk involved when referencing unstable standards. As of now, both XML 1.* and XML Schema's "AnyURI" work define a construct where IRIs may be used, and the benefit seems to justify the risk.

2. The TAG notes that, with the blessing of the XML Namespaces recommendation, some software is observed to take decisions about URI equivalence on the basis of strcmp() or its equivalent. This is widespread enough that it's not going to go away.

3. The TAG urges both spec and software authors to familiarize themselves with the issues around URI comparison as laid out in RFC2396bis, specifically http://www.apache.org/~fielding/uri/rev-2002/rfc2396bis.html#rfc.section.6

4. Because of the prevalence of simple string comparison of URIs, and because of the Web Architectural principle that consistency in naming is important, the TAG urges creators of URIs to create them in as canonical a form as possible. Section 6.3 of the RFC2396bis draft provides rules for this that are applicable both to URIs and IRIs. This will have the beneficial effect that strcmp() will be an accurate (and very cheap) equivalence test.

=============================================================

Following on from this, TimBL keeps raising the importance of coherent round-tripping IRI->URI->IRI, but I've not quite managed to grasp the core of that issue fully enough to tell whether we really have consensus; Tim, any chance of outlining that one in writing or have you already?

==============================================================

We can't close our issue 15 until the RFC2396 redrafting is finished, but given the above, I think we can close #27.

--
Cheers, Tim Bray
(ongoing fragmented essay: http://www.tbray.org/ongoing/)
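
To make points 2 and 4 concrete, here is a minimal Python sketch of why plain string comparison can miss equivalent URIs and why creating them in canonical form helps. It is only an illustration under stated assumptions: the function name canonicalize and the UNRESERVED set are hypothetical, and the handful of normalizations shown (lowercasing the scheme and host, uppercasing percent-escape hex digits, decoding escapes of unreserved characters, using "/" for an empty path) are commonly cited examples, not a transcription of the rules in Section 6.3 of the draft.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumption: a conservative set of characters that never need escaping.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

def _fix_escape(match):
    """Decode %XX if it encodes an unreserved character, else uppercase the hex."""
    ch = chr(int(match.group(1), 16))
    if ch in UNRESERVED:
        return ch
    return "%" + match.group(1).upper()

def canonicalize(uri):
    """Illustrative canonical form; not the draft's complete rule set."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    # Lowercase only the host[:port] part, leaving any userinfo untouched.
    userinfo, sep, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + sep + hostport.lower()
    path = re.sub(r"%([0-9A-Fa-f]{2})", _fix_escape, parts.path)
    if netloc and not path:
        path = "/"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))

a = "HTTP://Example.COM/%7euser"
b = "http://example.com/~user"

# The strcmp()-style test sees two different strings...
print(a == b)
# ...but agrees once both are written in the same canonical form.
print(canonicalize(a) == canonicalize(b))
```

If URIs are minted in canonical form to begin with, consumers never need the canonicalize step and the cheap string comparison is enough.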
Received on Monday, 14 April 2003 19:30:49 UTC