- From: Tim Bray <tbray@textuality.com>
- Date: Tue, 15 Apr 2003 14:55:03 -0700
- To: Chris Lilley <chris@w3.org>
- Cc: "Ian B. Jacobs" <ij@w3.org>, www-tag@w3.org
Chris Lilley wrote:

> I would not like people to get the impression from reading these
> minutes that i am in favour of 'canonicalizing' IRIs by hexifying
> them. Like Martin says and like the IRI spec says, only do this as a
> last resort when using antiquated transport protocols. Better is to
> use whatever method (quoted-unreadable, base64, ncr, \u) the
> environment provides to preserve the original characters.

I think I agree, but that last sentence is potentially very misleading.

If the IRI is embedded in an XML document, the IRI's Unicode characters should appear in the infoset as themselves and *only* as themselves unless they are IRI/URI-special characters like '#' or '%', in which case they should appear *only* as %-escapes. In the XML instance, this may be accomplished by having them appear as themselves (if you're using an encoding that supports them) or via NCRs. In the XML context, other mechanisms such as \u or base64 should not be used. Right?

--
Cheers, Tim Bray
(ongoing fragmented essay: http://www.tbray.org/ongoing/)
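To make the infoset point concrete, here is a minimal sketch in Python. It assumes the standard-library xml.etree.ElementTree parser and a made-up example IRI (http://example.org/café#top); neither comes from the original message. It shows that a literal character and an NCR yield the same attribute value after parsing, while a \u escape is not interpreted by the XML parser and so ends up as six ordinary characters.

```python
import xml.etree.ElementTree as ET

# Hypothetical IRI used only for illustration; U+00E9 is "e with acute".
IRI = "http://example.org/caf\u00e9#top"

# 1. The character appears as itself, in an encoding that supports it (UTF-8).
literal_doc = f'<link href="{IRI}"/>'.encode("utf-8")

# 2. The character appears as a numeric character reference (NCR),
#    so the instance can be pure US-ASCII.
ncr_doc = (
    b'<?xml version="1.0" encoding="US-ASCII"?>'
    b'<link href="http://example.org/caf&#xE9;#top"/>'
)

# 3. A \u escape: XML assigns it no meaning, so the parser leaves
#    the backslash, "u", and four hex digits in the value untouched.
backslash_doc = rb'<link href="http://example.org/caf\u00e9#top"/>'


def href_after_parsing(doc: bytes) -> str:
    """Return the href attribute value as the application sees it after parsing."""
    return ET.fromstring(doc).get("href")


if __name__ == "__main__":
    for name, doc in [("literal", literal_doc),
                      ("NCR", ncr_doc),
                      ("backslash-u", backslash_doc)]:
        value = href_after_parsing(doc)
        print(f"{name:12} same as IRI: {value == IRI}")
```

The first two documents differ only at the byte level; the third carries a different attribute value altogether, which is why \u and base64 do not preserve the IRI in the XML context.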
Received on Tuesday, 15 April 2003 17:55:18 UTC