- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Fri, 22 Jun 2007 19:38:27 +0900
- To: John Cowan <cowan@ccil.org>, Addison Phillips <addison@yahoo-inc.com>
- Cc: "Grosso, Paul" <pgrosso@ptc.com>, public-iri@w3.org, www-xml-linking-comments@w3.org, public-xml-core-wg@w3.org, public-i18n-core@w3.org
Hello John,

At 06:42 07/06/21, John Cowan wrote:
>Addison Phillips scripsit:
>
>> I'm concerned about this discussion. I note that it has been a long
>> standing (perhaps mythological) belief by many of us in the
>> internationalization activity that XLink, XML Base, et al, represented
>> an instance of IRI.
>
>It's always been true that random ASCII characters that are forbidden
>in URI/IRIs have "worked" in XML system identifiers, as well as the
>other things derived from it. That didn't turn out to be what IRIs
>are -- they have the same restrictions within the ASCII repertoire
>as IRIs.

I guess you meant "URIs" in the last line. This is true, and is
also true for HTML. There are several ways to explain this:

- Implementers carefully implemented the spec.
- Implementers did what worked with the least effort.
- Implementers understood that it's a well-held principle for URIs
  and IRIs that there shouldn't (or can't) be any detailed syntax
  checks. About the only thing you can check reliably without going
  down the scheme-specific road is that if it contains a ':', then
  the characters before the first ':' need to match the scheme
  production:

    scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )

I think the XML Schema WG tried to come up with a regexp, but they
gave up. Please also see the following note at
http://www.w3.org/TR/xmlschema-2/#anyURI

   Note: Each URI scheme imposes specialized syntax rules for URIs in
   that scheme, including restrictions on the syntax of allowed
   fragment identifiers. Because it is impractical for processors to
   check that a value is a context-appropriate URI reference, this
   specification follows the lead of [RFC 2396] (as amended by
   [RFC 2732]) in this matter: such rules and restrictions are not
   part of type validity and are not checked by ·minimally conforming·
   processors. Thus in practice the above definition imposes only very
   modest obligations on ·minimally conforming· processors.

>This is quite independent of the status of SPACE.

Can you explain how this is independent? Isn't space just one of
these characters?

Regards,    Martin.


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
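
P.S. For illustration only, here is a minimal Python sketch of the one
generic check described above: if the string contains a ':', the
characters before the first ':' must match the scheme production. The
names SCHEME_RE and scheme_check are hypothetical and come from no
specification or library. The sketch applies the rule literally, so a
relative reference whose first ':' comes only after a '/', '?' or '#'
(for example "./a:b") would need extra handling that is not shown here.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
# SCHEME_RE is a hypothetical name chosen for this sketch.
SCHEME_RE = re.compile(r"[A-Za-z][A-Za-z0-9+\-.]*")


def scheme_check(reference: str) -> bool:
    """Apply the generic check literally.

    Returns False only when the string contains a ':' and the text
    before the first ':' does not match the scheme production.
    Everything else (spaces, other forbidden ASCII, scheme-specific
    syntax) is deliberately left unchecked.
    """
    prefix, colon, _rest = reference.partition(":")
    if not colon:
        # No ':' at all, so there is no scheme to check.
        return True
    return SCHEME_RE.fullmatch(prefix) is not None


if __name__ == "__main__":
    for candidate in ("http://www.w3.org/TR/xmlschema-2/#anyURI",
                      "mailto:duerst@it.aoyama.ac.jp",
                      "a file with spaces.xml",
                      "1http://example.org/",
                      "my scheme:foo"):
        print(candidate, scheme_check(candidate))
```

Note that "a file with spaces.xml" passes, because this check says
nothing about SPACE or any other character outside the scheme part.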
Received on Friday, 22 June 2007 10:40:28 UTC