- From: Bjoern Hoehrmann <derhoermi@gmx.net>
- Date: Thu, 30 Sep 2010 23:48:53 +0200
- To: "Phillips, Addison" <addison@lab126.com>
- Cc: "public-iri@w3.org" <public-iri@w3.org>
* Phillips, Addison wrote:
>1. Change the text above to read:
>
>   If the IRI or IRI reference is an octet stream in some known
>   non-Unicode character encoding, convert the IRI to a sequence of
>   characters from the UCS.
>
>   In other cases (written on paper, read aloud, or otherwise
>   represented independent of any character encoding) represent the IRI
>   as a sequence of characters from the UCS.

IRIs are by definition a sequence of characters from the UCS. With the
requirement gone, I do not think there is a point in having this section
in the document.

>2. Add the following text just after the second paragraph above:
>
>NOTE: Some character encodings or transcriptions can be converted to or
>represented by more than one sequence of Unicode characters. Ideally the
>resulting IRI would use a normalized form, such as Unicode Normalization
>Form C (NFC, [UTR15]), since that ensures a stable, consistent
>representation that is most likely to produce the intended results.
>Implementers and users are cautioned that, while denormalized character
>sequences are valid, they might be difficult for other users or
>processes to guess and might produce unexpected results.

Normalization is already discussed in 5.3.2.2 "Character Normalization",
any discussion of it should be moved there if it's not already covered.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Thursday, 30 September 2010 23:36:09 UTC