Re: [Minutes] 14 Apr 2003 TAG teleconf (URIEquivalence-15, IRIEverywhere-27, xmlIDSemantics-32, abstractComponentRefs-37, namespaceDocument-8)

From: Tim Bray <tbray@textuality.com>
Date: Tue, 15 Apr 2003 14:55:03 -0700
Message-ID: <3E9C7FB7.4050601@textuality.com>
To: Chris Lilley <chris@w3.org>
Cc: "Ian B. Jacobs" <ij@w3.org>, www-tag@w3.org

Chris Lilley wrote:

> I would not like people to get the impression from reading these
> minutes that i am in favour of 'canonicalizing' IRIs by hexifying
> them. Like Martin says and like the IRI spec says, only do this as a
> last resort when using antiquated transport protocols. Better is to
> use whatever method (quoted-unreadable, base64, ncr, \u) the
> environment provides to preserve the original characters.

I think I agree, but that last sentence is potentially very misleading.
If the IRI is embedded in an XML document, the IRI's Unicode
characters should appear in the infoset as themselves and *only* as
themselves unless they are IRI/URI-special characters like '#' or '%', in
which case they should appear *only* as %-escapes. In the XML
instance, this may be accomplished by having them appear as themselves
(if you're using an encoding that supports them) or via NCRs.
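
As a rough illustration of the point above, here is a minimal Python sketch using the standard library's xml.etree.ElementTree parser. The example.org IRI, the element and attribute names, and the escape_special helper are all hypothetical, chosen only for illustration. It shows that a literal character and an NCR give the same value after parsing, that a \u sequence is not interpreted by XML at all, and one possible way to %-escape '#' and '%' when they occur as data.

```python
import xml.etree.ElementTree as ET

# Hypothetical IRI containing a non-ASCII character (e-acute, U+00E9).
# The same IRI is written three ways in the XML instance.
doc_literal = '<a href="http://example.org/caf\u00e9"/>'     # the character itself
doc_ncr = '<a href="http://example.org/caf&#xE9;"/>'         # numeric character reference
doc_backslash = '<a href="http://example.org/caf\\u00e9"/>'  # \u escape, which XML does not know

href_literal = ET.fromstring(doc_literal).get("href")
href_ncr = ET.fromstring(doc_ncr).get("href")
href_backslash = ET.fromstring(doc_backslash).get("href")

# After parsing, the literal and NCR forms are indistinguishable.
same_after_parsing = href_literal == href_ncr

# The \u form survives as six ordinary characters: backslash, u, 0, 0, e, 9.
backslash_untouched = href_backslash.endswith("\\u00e9")


def escape_special(segment: str) -> str:
    """Hypothetical helper: %-escape '%' and '#' used as data in a segment.

    '%' is handled first so the '%' introduced by '%23' is not escaped again.
    Other characters, including non-ASCII ones, are left as themselves.
    """
    return segment.replace("%", "%25").replace("#", "%23")


escaped = escape_special("50%#caf\u00e9")

print(same_after_parsing, backslash_untouched, escaped)
```

This is only a sketch of the distinction between the XML instance and the parsed value, not a full IRI processor; a real implementation would need to follow the IRI draft's rules for each IRI component.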

In the XML context, other mechanisms such as \u or base64 should not be 


Cheers, Tim Bray
         (ongoing fragmented essay: http://www.tbray.org/ongoing/)
Received on Tuesday, 15 April 2003 17:55:18 UTC
