- From: <bugzilla@jessica.w3.org>
- Date: Wed, 26 Mar 2014 23:19:57 +0000
- To: public-qt-comments@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=25149
--- Comment #4 from C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> ---
Note that the normative reference to "Character Model for the World Wide Web
1.0: Normalization," ed. Addison Phillips, Tex Texin, Richard Ishida, et al.,
also goes to a non-stable document (in this case a non-last-call draft). The
text of the spec refers to this document three times:
1. In section 4 "Phases of Serialization" [1], item 3.d of the list contains a
definition of the term "Unicode normalization" which reads in part:
    For specific recommendations for character normalization on the
    World Wide Web, see [Character Model for the World Wide Web 1.0:
    Normalization].]
[1] http://www.w3.org/TR/xslt-xquery-serialization-30/#serphases
I think this sentence could if necessary be moved to a note; it expresses no
normative requirement. (I am divided in my mind whether it SHOULD be moved to
a note; after some thought, I lean toward saying it should stay where it is,
since it's in text clearly marked as the definition of a term. But we can make
it a note if the WG chooses.)
2. In section 5.1.9 XML Output Method: the normalization-form Parameter [2], one
of the bullet items in normative text reads:
    NFC specifies the serialized result will be in Normalization
    Form C, using the rules specified in [Character Model for the
    World Wide Web 1.0: Normalization].
[2] http://www.w3.org/TR/xslt-xquery-serialization-30/#XML_NORMALIZATION-FORM
I may be missing something, but I don't see any special rules for normalization
form C in the Character Model spec that apply to our situation. At first
glance, what the Character Model spec provides that the Unicode definition of
NFC does not provide is a set of rules for getting there from legacy encodings.
That can be relevant for a parser, but not for a serializer.
I think the thing to do here is (a) replace the reference to Character Model
with a reference to UAX #15, and (b) add a note pointing to Character Model for
further information and rules for dealing with legacy encodings.
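To illustrate why UAX #15 alone seems sufficient for a serializer, here is a minimal sketch, assuming Python's standard unicodedata module as the implementation of NFC. The function name serialize_nfc is hypothetical and is not taken from the Serialization spec. Nothing here depends on a legacy encoding, because the input is already a sequence of Unicode characters.

```python
import unicodedata


def serialize_nfc(text: str) -> str:
    """Return text in Unicode Normalization Form C as defined by UAX #15.

    Hypothetical helper for illustration only; the input is assumed to be
    a sequence of Unicode characters, as it would be in a serializer.
    """
    return unicodedata.normalize("NFC", text)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT (a decomposed sequence)
decomposed = "e\u0301"
composed = serialize_nfc(decomposed)

# NFC composes the pair into the single code point U+00E9
assert composed == "\u00e9"
# Normalizing again changes nothing: the result is already in NFC
assert unicodedata.normalize("NFC", composed) == composed
```

Rules for transcoding from legacy encodings would only come into play before this step, when bytes are turned into characters, which is the parser's side of the process.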
3. Again in section 5.1.9, another bullet item reads:
    fully-normalized specifies the serialized result will be in
    fully normalized text, as specified in [Character Model for
    the World Wide Web 1.0: Normalization].
The term 'fully normalized' is (as far as I can tell) not standard Unicode
terminology, but fortunately (as the note immediately below this passage
suggests) it is defined not only by the Character Model spec but also by the
XML 1.1 spec. What we should do is (a) replace the reference to Character
Model with a reference to XML 1.1 and (b) add a note pointing to Character
Model for further information and motivation.
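As a rough sketch of what "fully normalized" adds beyond NFC, the check below assumes the XML 1.1 reading that text is fully normalized when it is in NFC and no relevant construct begins with a composing character. Two simplifications are assumptions of this sketch, not part of either spec: the relevant constructs are modeled as a plain list of strings, and "composing character" is approximated as any character with a non-zero canonical combining class. The function names are hypothetical.

```python
import unicodedata
from typing import Iterable


def starts_with_combining(text: str) -> bool:
    """True if the first character has a non-zero canonical combining class.

    This approximates "begins with a composing character"; the full
    definition covers more characters than this test does.
    """
    return bool(text) and unicodedata.combining(text[0]) != 0


def is_fully_normalized(constructs: Iterable[str]) -> bool:
    """Approximate check: every construct is in NFC and none of them
    begins with a combining character."""
    return all(
        unicodedata.normalize("NFC", c) == c and not starts_with_combining(c)
        for c in constructs
    )


# Already composed and starting with a base character
assert is_fully_normalized(["caf\u00e9"])
# In NFC, but the construct begins with U+0301 COMBINING ACUTE ACCENT
assert not is_fully_normalized(["\u0301abc"])
# Not in NFC at all: 'e' + U+0301 would compose to U+00E9
assert not is_fully_normalized(["e\u0301"])
```

The second case is the one plain NFC cannot catch: the string is unchanged by NFC normalization, yet concatenating it after other text could produce a sequence that is no longer normalized.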
Received on Wednesday, 26 March 2014 23:19:58 UTC