- From: Misha Wolf <Misha.Wolf@reuters.com>
- Date: Thu, 05 Dec 2002 12:49:30 +0000
- To: www-international@w3.org
fyi.

Misha

-----Original Message-----
From: jeremy carroll [mailto:jjc@hpl.hp.com]
Sent: 04 December 2002 11:09
To: www-tag@w3.org
Subject: IRI-Everywhere (was RE: last call comments, usage of IRI rather than URI)

Julian Reschke wrote:
> This issue could
> *probably* be solved by explicitly forbidding those ASCII characters in
> namespace names which have been forbidden in URIs as well

I am increasingly convinced by the case Julian has been making that IRIs should treat 7-bit ASCII characters according to the URI specs.

I had some test data connected with the treatment of the 'excluded' characters. That all of the following are relative IRIs is, at least a little, surprising:

(I use XML attribute notation, CDATA attribute value normalization)

"<b>b" "	" " " " " "{" "\"

These characters are excluded in RFC 2396 because other systems use them. While this is perhaps not an issue within XML, it becomes difficult as soon as XML has to interoperate with other systems.

An example that came up yesterday was mapping such relative IRIs out of RDF/XML into an RDF graph (in memory) and then out into N3. The N3 grammar expects to be able to use < and > as delimiters, as indicated in RFC 2396. I found myself unable to defend this treatment to a colleague whose N3 parser was having difficulty.

Is there a case as to why IRIs differ from URIs on the ASCII subset? Or is it, essentially, an historical accident?

Jeremy Carroll

Appendix - Sample normative text:

[[
The characters to be escaped are the control characters #x0 to #x1F and #x7F (most of which cannot appear in XML), space #x20, the delimiters '<' #x3C, '>' #x3E and '"' #x22, the unwise characters '{' #x7B, '}' #x7D, '|' #x7C, '\' #x5C, '^' #x5E and '`' #x60, as well as all characters above #x7F.
]]
http://www.w3.org/XML/xml-V10-2e-errata#E26

equivalently

[[
the disallowed characters include all non-ASCII characters, plus the excluded characters listed in Section 2.4 of [IETF RFC 2396], except for the number sign (#) and percent sign (%) and the square bracket characters re-allowed in [IETF RFC 2732]. Disallowed characters must be escaped
]]
http://www.w3.org/TR/xlink/#link-locators

--
alternative text, URI compatible

[[
The characters to be escaped are all characters above #x80.
]]

or

[[
the disallowed characters are all non-ASCII characters. Disallowed characters must be escaped
]]
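
To make the difference between the two sample texts concrete, here is a minimal Python sketch of both escaping rules. The function names and the `uri_compatible` flag are hypothetical, introduced only for illustration. The sketch also assumes the escaping mechanism described in the cited XML erratum and XLink text (convert the character to UTF-8 and write each byte as %HH), which the message itself does not spell out.

```python
# Hypothetical helpers for illustration; not taken from any spec or library.

# ASCII characters named in the first sample text: space, the delimiters
# < > ", and the "unwise" characters { } | \ ^ `
EXCLUDED_ASCII = set(' <>"{}|\\^`')


def needs_escape(ch: str, uri_compatible: bool = False) -> bool:
    """Return True if ch must be escaped under the chosen sample text."""
    cp = ord(ch)
    if cp > 0x7F:
        # Non-ASCII characters are escaped under both variants.
        return True
    if uri_compatible:
        # Alternative text: ASCII is left untouched, so excluded ASCII
        # characters stay visible and remain invalid under RFC 2396.
        return False
    # First sample text: control characters plus the listed ASCII set.
    return cp <= 0x1F or cp == 0x7F or ch in EXCLUDED_ASCII


def escape_reference(ref: str, uri_compatible: bool = False) -> str:
    """Escape disallowed characters as %HH over their UTF-8 bytes (assumed)."""
    out = []
    for ch in ref:
        if needs_escape(ch, uri_compatible):
            out.extend(f"%{b:02X}" for b in ch.encode("utf-8"))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    for sample in ["<b>b", "{", "\\", "caf\u00e9"]:
        print(repr(sample),
              escape_reference(sample),
              escape_reference(sample, uri_compatible=True))
```

Under the first rule a value such as "<b>b" is silently turned into an escaped reference, whereas under the URI-compatible rule it passes through unchanged and a downstream consumer (for example an N3 writer that uses < and > as delimiters) can reject it as not being a valid URI reference.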
Received on Thursday, 5 December 2002 07:57:41 UTC