Re: IRIEverywhere-27 from Bjoern Hoehrmann on 2005-12-13 (www-tag@w3.org from December 2005)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Tue, 13 Dec 2005 19:10:03 +0100
To: ht@inf.ed.ac.uk (Henry S. Thompson)
Cc: www-tag@w3.org
Message-ID: <btutp1pa3msmgtnq43111fp5v195s6ka24@hive.bjoern.hoehrmann.de>

* Henry S. Thompson wrote:
>Precisely.  IRI-to-URI processing for XML is only coherently
>understood as a process that is defined in the terms provided by the
>Infoset spec., where _all_ values are sequences of Unicode code
>points.

Well, then you have this step,

  If the IRI is written on paper, read aloud, or otherwise represented
  as a sequence of characters independent of any character encoding,
  represent the IRI as a sequence of characters from the UCS normalized
  according to Normalization Form C (NFC, [UTR15]).

I.e., you always normalize. Could you elaborate on why, e.g., the XML Core
Working Group did not adopt this step in the various specifications that
define string-to-URI conversion (XML 1.0, XML 1.1, XInclude, XLink, ...)?
The normalization step has been in the various IRI drafts for more than
7 years now, and

  The XML Core WG would also like TAG input on the wisdom of early
  adoption given the "Internet Draft" status of the IRI draft [10]. So
  far adoption has relied on "copy and paste", but there is potential
  for these definitions to get out of sync.

out-of-sync specifications were a concern when the issue was raised.
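
To make the difference concrete, here is a rough sketch of what the
normalization step changes. It is not taken from any of the specifications
above. The function name `to_uri` and its `normalize` flag are made up for
illustration, and the conversion is simplified to "percent-encode the UTF-8
octets of every non-ASCII character", which ignores the other details of the
IRI draft's mapping.

```python
import unicodedata


def to_uri(iri: str, normalize: bool = True) -> str:
    """Simplified, hypothetical IRI-to-URI conversion.

    With normalize=True the input is first put into Normalization Form C,
    as in the step quoted above. With normalize=False the code points are
    converted as they are.
    """
    if normalize:
        iri = unicodedata.normalize("NFC", iri)

    out = []
    for ch in iri:
        if ord(ch) < 128:
            # ASCII characters are passed through unchanged in this sketch.
            out.append(ch)
        else:
            # Non-ASCII characters become percent-encoded UTF-8 octets.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
    return "".join(out)


# "o" followed by U+0308 COMBINING DIAERESIS (decomposed form)
decomposed = "http://example.org/Bjo\u0308rn"

print(to_uri(decomposed))                   # http://example.org/Bj%C3%B6rn
print(to_uri(decomposed, normalize=False))  # http://example.org/Bjo%CC%88rn
```

The same sequence of code points yields two different URIs depending on
whether the step is applied, so specifications that copy the conversion
with and without it can disagree on such input.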
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Received on Tuesday, 13 December 2005 18:10:09 UTC