- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Fri, 22 Aug 2008 16:57:58 +0900
- To: John Cowan <cowan@ccil.org>, www-international@w3.org
At 08:18 08/08/14, John Cowan wrote:

>However, we are considering backporting features of XML Namespaces 1.1
>(which is used exclusively with XML 1.1 documents) to XML Namespaces 1.0
>(which is used exclusively with XML 1.0 documents). The relevant feature
>is allowing XML namespace names to be IRIs rather than URIs.
>
>Point in favor: allowing an IRI permits the namespace name (which is used
>only for naming, not for retrieval) to be at least partly meaningful in
>languages other than English.

Another point in favor: When I last looked, the majority of XML 1.0 implementations I tested (mostly by using a namespace with non-ASCII characters in XSLT) just "did the right thing". For the tests, see http://www.w3.org/2003/02/uriEquivTest/ and http://lists.w3.org/Archives/Public/www-international/2003JanMar/0025.html. Of course, this was "years ago".

>Point against: supporting full Unicode allows both visual spoofing and
>composed-vs.-decomposed character spoofing of namespace names, possibly
>causing a document which appears to be in one namespace to be validated
>against the schema for another namespace. Namespace names are compared
>using codepoint-by-codepoint equality only, and this will not be changed.

We had extensive discussions about similar problems (mostly for element/attribute names) during some work on the normalization part of the character model.

I think the schema validation case isn't terribly serious. The way I understand it, a recipient will be validating against a known schema, and if the sender assumed a (normalization-wise) different one, then there will be an error, and that error will in due time be corrected.

When we thought about it, we came up mainly with some cases of e.g. some XSLT application selecting e.g. the 7th occurrence of an element 'foo' for some payment amount, and somebody trying to fool a human into thinking that an element with a differently normalized name was the 7th while the processor would pick what would look to the user as the 8th (or some such). Not totally impossible, but rather far-fetched.

It would be possible with namespaces, too, but only if two separate namespaces are used, which might already raise suspicion. Come to think about it, similar tricks are already possible by using two prefixes differing only in normalization, because namespace prefixes already allow Unicode (http://www.w3.org/TR/2006/REC-xml-names-20060816/#NT-Prefix).

The people using namespaces (as opposed to the people using domain names in web addresses and email addresses) are few and far between, in general with a certain technical expertise.

>What do you think? Should we allow IRIs?

Yes, very much so. My guess is that the number of usages won't be that high, but there might be some interesting use cases e.g. in the RDF area or in education in particular. Also the cost of allowing it (treating namespace IRIs similar to any other XML data) is actually lower than the cost of not allowing it (special-casing against non-ASCII).

Regards, Martin.

#-#-# Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-# http://www.sw.it.aoyama.ac.jp mailto:duerst@it.aoyama.ac.jp
Received on Friday, 22 August 2008 08:00:14 UTC
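
To illustrate the composed-vs.-decomposed point raised in the message, here is a minimal Python sketch. It assumes a made-up namespace name (http://example.org/café, not taken from the thread) and a hypothetical helper `same_namespace`. It only shows that two strings which render identically can differ under codepoint-by-codepoint comparison; it is not a model of how any particular XML processor is implemented.

```python
import unicodedata

# Hypothetical namespace IRI, chosen only for illustration.
# Composed form: the final letter is the single code point U+00E9.
composed = "http://example.org/caf\u00e9"

# Decomposed form (NFD): 'e' followed by U+0301 COMBINING ACUTE ACCENT.
decomposed = unicodedata.normalize("NFD", composed)


def same_namespace(a: str, b: str) -> bool:
    """Compare two namespace names code point by code point.

    Python's str equality compares sequences of code points, so no
    normalization is applied here.
    """
    return a == b


# The two names look the same when displayed but are not equal.
print(same_namespace(composed, decomposed))

# After normalizing both to NFC, they would compare equal; namespace
# comparison as described in the thread does not perform this step.
print(unicodedata.normalize("NFC", decomposed) == composed)
```

Under this comparison a document using the decomposed name simply would not match a schema written for the composed one, which is the kind of error described above as one that gets noticed and corrected.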