
Re: IRI-Everywhere (was RE: last call comments, usage of IRI rather than URI)

From: Martin Duerst <duerst@w3.org>
Date: Fri, 06 Dec 2002 07:29:18 +0900
Message-Id: <4.2.0.58.J.20021206071819.0523b780@localhost>
To: "jeremy carroll" <jjc@hpl.hp.com>, <www-tag@w3.org>
Cc: Michel Suignard <michelsu@microsoft.com>, www-international@w3.org

Hello Jeremy,

The main reason that spaces and similar characters were included in
IRIs was XPointer. Please note that "&#9;" and "&#10;" are not legal
IRIs, at least not according to the current IRI spec. And we definitely
have no plans to add them.

Regards,   Martin.

At 12:09 02/12/04 +0100, jeremy carroll wrote:

>Julian Reschke wrote:
> > This issue could
> > *probably* be solved by explicitly forbidding those ASCII characters in
> > namespace names which have been forbidden in URIs as well
>
>I am increasingly convinced by the case Julian has been making that IRIs
>should treat 7-bit ASCII characters according to the URI specs.
>
>I had some test data connected with the treatment of the 'excluded'
>characters.
>
>That all of the following are relative IRIs is, at least a little,
>surprising:
>(I use XML attribute notation, CDATA attribute value normalization)
>
>"&lt;b&gt;b"
>"&#9;"
>"&#10;"
>"   "
>"{"
>"\"
>
>These characters are excluded in RFC 2396 because other systems use them.
>While in XML this perhaps is not an issue, with any interoperation between
>XML and other systems, it becomes difficult.
>
>An example that came up yesterday was mapping such relative IRIs out of
>RDF/XML into an RDF graph (in memory) and then out into N3.
>The N3 grammar expects to be able to use < and > as delimiters, as indicated
>in RFC 2396.
>
>I found myself unable to defend this treatment to a colleague whose N3
>parser was having difficulty.
>
>Is there a case as to why IRIs differ from URIs on the ASCII subset?
>Or is it, essentially, an historical accident?
>
>Jeremy Carroll
>
>Appendix - Sample normative text:
>[[
>The characters to be escaped are the control characters #x0 to #x1F and #x7F
>(most of which cannot appear in XML), space #x20, the delimiters '<' #x3C,
>'>' #x3E and '"' #x22, the unwise characters '{' #x7B, '}' #x7D, '|' #x7C,
>'\' #x5C, '^' #x5E and '`' #x60, as well as all characters above #x7F.
>]]
>http://www.w3.org/XML/xml-V10-2e-errata#E26
>
>equivalently
>[[
>the disallowed characters include all non-ASCII characters, plus the
>excluded characters listed in Section 2.4 of [IETF RFC 2396], except for the
>number sign (#) and percent sign (%) and the square bracket characters
>re-allowed in [IETF RFC 2732]. Disallowed characters must be escaped
>]]
>http://www.w3.org/TR/xlink/#link-locators
>
>-- alternative text, URI compatible
>
>[[
>The characters to be escaped are all characters above #x7F.
>]]
>or
>[[
>the disallowed characters are all non-ASCII characters. Disallowed
>characters must be escaped
>]]
>
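
The two readings of the sample normative text can be compared with a
small Python sketch. It is only an illustration, not text from either
spec: the names ASCII_EXCLUDED and escape are hypothetical, and it
assumes the usual escaping procedure of converting each disallowed
character to UTF-8 and writing each byte as %HH. With
escape_ascii_excluded=True it follows the errata wording quoted above;
with False it follows the alternative wording, which escapes only
non-ASCII characters.

```python
# Character classes taken from the quoted errata text (RFC 2396 "excluded"
# characters, minus '#', '%', '[' and ']').
CONTROLS = {chr(c) for c in range(0x00, 0x20)} | {"\x7f"}
SPACE = {" "}
DELIMS = set('<>"')
UNWISE = set("{}|\\^`")
ASCII_EXCLUDED = CONTROLS | SPACE | DELIMS | UNWISE  # hypothetical name


def escape(text, escape_ascii_excluded=True):
    """Percent-escape disallowed characters (hypothetical helper).

    Non-ASCII characters are always escaped. The ASCII excluded characters
    are escaped only when escape_ascii_excluded is True.
    Assumption: each escaped character is encoded as UTF-8, one %HH per byte.
    """
    out = []
    for ch in text:
        if ord(ch) > 0x7F or (escape_ascii_excluded and ch in ASCII_EXCLUDED):
            out.append("".join("%{:02X}".format(b) for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    # The test data from the message, after attribute value normalization.
    for sample in ["<b>b", "\t", "\n", "   ", "{", "\\"]:
        print(repr(sample), escape(sample), escape(sample, False))
```

Under the first reading, "<b>b" becomes "%3Cb%3Eb" and can be written
between < and > delimiters. Under the alternative reading it is left
untouched, so the ASCII excluded characters remain visible to any
consumer that applies the RFC 2396 rules.
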
Received on Thursday, 5 December 2002 18:03:14 GMT
