Re: [Minutes] 14 Apr 2003 TAG teleconf (URIEquivalence-15, IRIEverywhere-27, xmlIDSemantics-32, abstractComponentRefs-37, namespaceDocument-8) from Tim Bray on 2003-04-15 (www-tag@w3.org from April 2003)

From: Tim Bray <tbray@textuality.com>
Date: Tue, 15 Apr 2003 14:55:03 -0700
To: Chris Lilley <chris@w3.org>
Cc: "Ian B. Jacobs" <ij@w3.org>, www-tag@w3.org
Message-ID: <3E9C7FB7.4050601@textuality.com>

Chris Lilley wrote:

> I would not like people to get the impression from reading these
> minutes that i am in favour of 'canonicalizing' IRIs by hexifying
> them. Like Martin says and like the IRI spec says, only do this as a
> last resort when using antiquated transport protocols. Better is to
> use whatever method (quoted-unreadable, base64, ncr, \u) the
> environment provides to preserve the original characters.

I think I agree, but that last sentence is potentially very misleading.
If the IRI is embedded in an XML document, the IRI's Unicode
characters should appear in the infoset as themselves and *only* as
themselves unless they are IRI/URI-special characters like '#' or '%',
in which case they should appear *only* as %-escapes. In the XML
instance, this may be accomplished by having them appear as themselves
(if you're using an encoding that supports them) or via NCRs.

In the XML context, other mechanisms such as \u or base64 should not be 
used.
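
To make the distinction concrete, here is a minimal sketch in Python.
The example.org IRI and the helper name escape_special are made up for
illustration, and the standard-library parser stands in for "the
infoset". It shows that a literal character and an NCR in the instance
yield the same attribute value, that IRI-special characters used as
data become %-escapes, and what the last-resort hexified form looks
like by contrast.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical IRI containing a non-ASCII character (U+00E9).
# The \u escape here is Python source syntax only; the XML bytes
# built below contain the real character or an NCR, never "\u".
IRI = "http://example.org/caf\u00e9"

# The same IRI written two ways in the XML instance.
literal_doc = ('<a href="%s"/>' % IRI).encode("utf-8")  # character as itself
ncr_doc = b'<a href="http://example.org/caf&#xE9;"/>'   # numeric character reference

# After parsing, both instances carry the identical attribute value.
literal_value = ET.fromstring(literal_doc).get("href")
ncr_value = ET.fromstring(ncr_doc).get("href")


def escape_special(segment: str) -> str:
    """Percent-escape '%' and '#' used as data; leave other characters alone.

    '%' is replaced first so the escapes produced for '#' are not
    escaped a second time.
    """
    return segment.replace("%", "%25").replace("#", "%23")


# Last-resort form for transports that cannot carry the characters:
# every non-ASCII character is hexified as UTF-8 percent-escapes.
hexified = quote("caf\u00e9", safe="")

if __name__ == "__main__":
    print(literal_value == ncr_value == IRI)
    print(escape_special("100%#1"))
    print(hexified)
```

Only the first two forms belong in the XML document; the hexified form
is what you fall back to when the IRI has to travel somewhere that
cannot carry the original characters.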

Right?

-- 
Cheers, Tim Bray
         (ongoing fragmented essay: http://www.tbray.org/ongoing/)

Received on Tuesday, 15 April 2003 17:55:18 UTC