- From: Martin Duerst <duerst@w3.org>
- Date: Mon, 17 Feb 2003 15:56:46 -0500
- To: Chris Lilley <chris@w3.org>
- Cc: www-international@w3.org, www-tag@w3.org, Michel Suignard <michelsu@microsoft.com>
Hello Chris,

Continuing our discussion from a while ago.

At 03:57 03/02/06 +0100, Chris Lilley wrote:
>On Wednesday, February 5, 2003, 2:20:53 AM, Martin wrote:
>MD> XML is perfectly capable of representing FOO under
>MD> all circumstances without %-escapes, by using the characters directly,
>MD> or using numeric character references.
>
>Yes, it is perfectly capable. My point was that if IRI to URI was a
>one way trip then people would ignore the fact that the IRI could
>easily be represented in XML; *for portability* and * to be robust*
>to guard against any hex escaping that might happen later down the
>road they would always fully escape the IRI to start with.

I'm not sure at all that this will happen. In particular, for namespaces, I don't expect people to rush for IRIs in the first place (although that shouldn't be an argument for not thinking things through carefully). Also, I don't expect people to go for %ab%cd%ef just for the satisfaction of having, in theory, used character FOO in their namespace. They would rather use an ASCII transcription, let's call it 'phoo' in our example (but that again isn't an argument, because it still means they don't use their own character FOO).

But what I think is most important is to point out the difference between namespaces and 'your arbitrary Web address'. Let's look at an example.

The XSLT namespace is 'http://www.w3.org/1999/XSL/Transform'. This can also be used, in a browser, to get to a page that says "Someday a schema for XSL Transforms will live here" (or "This is an XML namespace defined in the XSL Transformations (XSLT) Version 1.0 specification." or "This is the XSLT namespace."). Which one you get depends on your browser's settings.

In addition, I can also use http://wWw.w3.org/1999/XSL/Transform or Http://www.w3.org/1999/XSL/Transform or any number of other variants to get to the same page.
I can even use http://www.w3.org/1999/XSL/TRansform to get a 'multiple choices' response (and a while ago even this led directly to a page). I can also use http://www.w3.org/1999/XSL/Transform.html or http://www.w3.org/1999/XSL/Transform.xsd or http://www.w3.org/1999/XSL/Transform.txt and will get a particular version of the text claiming that it's the XSLT namespace. Of course, all these would be lies; only http://www.w3.org/1999/XSL/Transform identifies the XSLT namespace.

However, I have never heard any major complaints from people about their XSL stylesheets not working because the above variants didn't work. (I'm not saying nobody ever made such a mistake, but people in the XML area are already pretty used to being careful about upper/lower case.)

Translating the above experiences to IRIs and our example, I think it's not too difficult to expect that people will understand that a namespace 'http://example.org/FOO' could also be written in XML as 'http://example.org/&#xFoo;', if necessary, but not otherwise, and that when they describe the namespace, they would say:

   The namespace for these elements/attributes is
   http://example.org/FOO
   If you are unable to enter FOO in your XML document,
   you can always use http://example.org/&#xFoo; instead.

   In a browser, you could try 'http://example.org/FOO' or
   'Http://example.org/FOO' or 'http://exaMple.org/FOO' or
   'http://example.org/%ab%cd%ef' or 'http://example.org/%AB%CD%EF'.
   Very old browsers (Netscape/IE 4, older versions of Opera and
   Amaya) would only work with the latter two, and somewhat newer
   browsers would work with all of them but display one of the
   latter two. Even newer browsers would work with all of them
   but display the first one.

Please note that, analogously, you can get to the document(s) at http://www.w3.org/1999/XSL/Transform by using http://www.w3.org/%31%39%39%39/XSL/Transform, and on some browsers even with http://www.w%33.org/1999/XSL/Transform.
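To make the two kinds of comparison concrete, here is a minimal sketch in Python. It is only an illustration of the variants listed above, not text from any spec: the function names are made up, and the set of characters whose escapes get decoded (letters, digits, '-', '.', '_', '~') is an assumption of the sketch.

```python
import re
from urllib.parse import urlsplit

# Assumption of this sketch: these are the characters whose %-escapes
# may be decoded without changing what the URI resolves to.
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _normalize_escapes(text):
    """Decode escapes of unreserved characters; uppercase the hex of all others."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char
        return "%" + match.group(1).upper()
    return _PCT.sub(repl, text)


def same_namespace(a, b):
    """Namespace comparison: plain codepoint-by-codepoint string equality."""
    return a == b


def same_for_resolution(a, b):
    """Looser comparison: scheme and host ignore case, %-escapes are normalized."""
    def norm(uri):
        parts = urlsplit(uri)
        return (
            parts.scheme.lower(),
            _normalize_escapes(parts.netloc).lower(),
            _normalize_escapes(parts.path),   # the path stays case-sensitive
            _normalize_escapes(parts.query),
        )
    return norm(a) == norm(b)


if __name__ == "__main__":
    xslt = "http://www.w3.org/1999/XSL/Transform"
    for variant in (
        "http://wWw.w3.org/1999/XSL/Transform",
        "Http://www.w3.org/1999/XSL/Transform",
        "http://www.w3.org/%31%39%39%39/XSL/Transform",
        "http://www.w%33.org/1999/XSL/Transform",
        "http://www.w3.org/1999/XSL/TRansform",
    ):
        print(variant, same_namespace(xslt, variant), same_for_resolution(xslt, variant))
```

Only the exact string counts as the namespace. The first four variants compare equal for resolution, while the 'TRansform' one does not, because the path is compared case-sensitively.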
I very much think that the TAG finding on URI Equivalence should state that resolution mechanisms have to guarantee the equivalence of things such as ~ == %7e == %7E. This is definitely true for http and other mechanisms I know of, and another good example that is resolution-related is http://www.oasis-open.org/committees/entity/specs/cs-entity-xml-catalogs-1.0.html#sysid-norm (which defines %-equivalence for characters outside the URI repertoire, but not (it probably should) for non-reserved characters in the URI repertoire).

So the overall conclusion would be:

- Codepoint-by-codepoint equivalence for namespaces and things with similar requirements (identifiers, fast processing).
- %-equivalence (i.e. ~ == %7e == %7E and similar) as a minimal requirement for resolution.

>MD> But even
>MD> in those cases, e.g. to tell somebody with an old email client
>MD> (like myself) how to use a particular namespace, it's always
>MD> possible to say: "well, just use xmlns:foo='http://example.org/&#xFoo;'."
>
>As long as i told you in an XML message and not, for example, plain
>text.

The important thing is not whether the message is XML or not; even plain text can carry http://example.org/&#xFoo;, as this message shows. The important thing is whether that namespace will be put into XML source or not.

Regards, Martin.
Received on Monday, 17 February 2003 19:54:30 UTC