- From: <bugzilla@jessica.w3.org>
- Date: Wed, 26 Mar 2014 23:19:57 +0000
- To: public-qt-comments@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=25149

--- Comment #4 from C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> ---

Note that the normative reference to "Character Model for the World Wide Web 1.0: Normalization," ed. Addison Phillips, Tex Texin, Richard Ishida, et al., also goes to a non-stable document (in this case a non-last-call draft).

The text of the spec refers to this document three times:

1 In section 4 "Phases of Serialization" [1], item 3.d of the list contains a definition of the term "Unicode normalization" which reads in part:

    For specific recommendations for character normalization on the World Wide Web, see [Character Model for the World Wide Web 1.0: Normalization].]

[1] http://www.w3.org/TR/xslt-xquery-serialization-30/#serphases

I think this sentence could if necessary be moved to a note; it expresses no normative requirement. (I am divided in my mind whether it SHOULD be moved to a note; after some thought, I lean toward saying it should stay where it is, since it's in text clearly marked as the definition of a term. But we can make it a note if the WG chooses.)

2 In section 5.1.9 XML Output Method: the normalization-form Parameter [2], one of the bullet items in normative text reads:

    NFC specifies the serialized result will be in Normalization Form C, using the rules specified in [Character Model for the World Wide Web 1.0: Normalization].

[2] http://www.w3.org/TR/xslt-xquery-serialization-30/#XML_NORMALIZATION-FORM

I may be missing something, but I don't see any special rules for normalization form C in the Character Model spec that apply to our situation. At first glance, what the Character Model spec provides that the Unicode definition of NFC does not provide is a set of rules for getting there from legacy encodings. That can be relevant for a parser, but not for a serializer.
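
To illustrate the parser/serializer distinction, here is a minimal Python sketch. It assumes the standard-library unicodedata module as the implementation of the UAX #15 normalization forms; the function name serialize_nfc is hypothetical and is not taken from any of the specs under discussion. A serializer already holds Unicode text, so plain NFC is all it needs; dealing with a legacy encoding is a decoding step that happens on the input side.

```python
import unicodedata


def serialize_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C, as defined by UAX #15.

    Hypothetical helper for illustration; a real serializer would apply
    this as one phase among several.
    """
    return unicodedata.normalize("NFC", text)


# Serializer side: the input is already a Unicode string.
decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = serialize_nfc(decomposed)
assert composed == "\u00e9"  # single precomposed character

# Parser side: bytes in a legacy encoding must be decoded first.
# That decoding step is where rules about legacy encodings come in.
legacy_bytes = b"\xe9"  # e-acute in ISO 8859-1
decoded = legacy_bytes.decode("latin-1")
assert serialize_nfc(decoded) == composed
```

Nothing in the serializer half of the sketch depends on where the text originally came from, which is the reason a reference to UAX #15 alone would cover the NFC case.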

I think the thing to do here is (a) replace the reference to Character Model with a reference to UAX #15, and (b) add a note pointing to Character Model for further information and rules for dealing with legacy encodings.

3 Again in section 5.1.9, another bullet item reads:

    fully-normalized specifies the serialized result will be in fully normalized text, as specified in [Character Model for the World Wide Web 1.0: Normalization].

The term 'fully normalized' is (as far as I can tell) not standard Unicode terminology, but fortunately (as the note immediately below this passage suggests) it is defined not only by the Character Model spec but also by the XML 1.1 spec. What we should do is (a) replace the reference to Character Model with a reference to XML 1.1 and (b) add a note pointing to Character Model for further information and motivation.

--
You are receiving this mail because:
You are the QA Contact for the bug.
Received on Wednesday, 26 March 2014 23:19:58 UTC