Re: URIEquivalence-15 and IRIs

From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Tue, 9 Jul 2002 11:53:58 -0400
Message-Id: <p04330100b950b5b998a1@[]>
To: Tim Bray <tbray@textuality.com>
Cc: w3c-i18n-ig@w3.org, www-tag@w3.org

At 8:34 AM -0700 7/9/02, Tim Bray wrote:
>The namespaces spec says "character by character" largely because 
>this issue hadn't occurred to us at that time.  I think that we 
>could choose to reinterpret the term right now with generally 
>beneficial effect and with minimal software breakage.

I'm not so sure. Every namespace testing function I've ever written 
(and there've been a lot) has done the test as a straight string 
comparison using something like Java's String.equals() method. I 
didn't even use equalsIgnoreCase() because I knew that, at least in 
the path part of the string, case was significant.
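
As a minimal sketch of the kind of test described above (the class 
name, method name, and example.org URIs are illustrative assumptions, 
not code from any real parser):

```java
public class NamespaceNameCheck {
    // Character-by-character comparison of two namespace names,
    // with no URI normalization and no case folding.
    static boolean sameNamespace(String a, String b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        // Hypothetical namespace names that differ only in the case of the path.
        String mixed = "http://www.example.org/Schema/Order";
        String lower = "http://www.example.org/schema/order";

        System.out.println(sameNamespace(mixed, mixed));   // true
        System.out.println(sameNamespace(mixed, lower));   // false
        // equalsIgnoreCase() would wrongly merge the two names.
        System.out.println(mixed.equalsIgnoreCase(lower)); // true
    }
}
```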

I could be wrong, but I tend to doubt that most implementers have 
been any more careful than I was. You'd have to know more than 
average about namespaces and URLs to even realize there's a problem 
with this, and then when you go looking in the specs to figure out 
what to do, what you see is character-by-character comparison. The 
namespaces spec is explicit that not all equivalent URIs are 
equivalent namespace names.

I am in complete agreement that the namespaces spec should have done 
a better job of this originally. However, barring a time machine, I 
doubt it can be changed now. Most major namespace URIs from the W3C 
stick to unproblematic ASCII, but I have seen other namespace URIs in 
the wild that use escapable characters, particularly spaces. I also 
suspect that, in reverse, this would open up flaws in and attacks on 
existing systems by feeding them documents in which some characters 
in the namespace URIs had been deliberately escaped.
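
To make the concern concrete, here is a rough sketch contrasting the 
literal comparison with one possible looser comparison that decodes 
percent-escapes first. Everything in it is an assumption for 
illustration: the class and method names, the example.org URIs, and 
the choice to compare only scheme, authority, and decoded path. It 
uses an escaped tilde rather than a space, since java.net.URI does 
not accept a raw space.

```java
import java.net.URI;
import java.util.Objects;

public class EscapedNamespaceDemo {
    // Literal comparison: what existing code using String.equals() does.
    static boolean sameLiteral(String a, String b) {
        return a.equals(b);
    }

    // Hypothetical looser rule: compare scheme, authority, and the
    // percent-decoded path. Query and fragment are ignored in this sketch.
    static boolean sameAfterUnescaping(String a, String b) {
        URI ua = URI.create(a);
        URI ub = URI.create(b);
        return Objects.equals(ua.getScheme(), ub.getScheme())
            && Objects.equals(ua.getAuthority(), ub.getAuthority())
            && Objects.equals(ua.getPath(), ub.getPath()); // getPath() decodes %XX
    }

    public static void main(String[] args) {
        String plain   = "http://example.org/~user/ns";
        String escaped = "http://example.org/%7Euser/ns";

        System.out.println(sameLiteral(plain, escaped));         // false
        System.out.println(sameAfterUnescaping(plain, escaped)); // true
    }
}
```

Two components that disagree about which rule is in force would 
disagree about whether a document with the escaped name belongs to 
the namespace.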

This is a breaking change. If it's to be added it needs a new 
version, and a means of determining when the new version is in use. 
It is not something that should be forced into the existing spec 
through an erratum.

| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
|          XML in a  Nutshell, 2nd Edition (O'Reilly, 2002)          |
|              http://www.cafeconleche.org/books/xian2/              |
|  http://www.amazon.com/exec/obidos/ISBN%3D0596002920/cafeaulaitA/  |
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
Received on Tuesday, 9 July 2002 11:56:00 UTC
