- From: <bugzilla@jessica.w3.org>
- Date: Wed, 26 Mar 2014 23:19:57 +0000
- To: public-qt-comments@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=25149
--- Comment #4 from C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com> ---
Note that the normative reference to "Character Model for the World Wide Web
1.0: Normalization," ed. Addison Phillips, Tex Texin, Richard Ishida, et al.,
also goes to a non-stable document (in this case a non-last-call draft).  The
text of the spec refers to this document three times:
1 In section 4 "Phases of Serialization" [1], item 3.d of the list contains a
definition of the term "Unicode normalization" which reads in part:
    For specific recommendations for character normalization on the 
    World Wide Web, see [Character Model for the World Wide Web 1.0: 
    Normalization].]
[1] http://www.w3.org/TR/xslt-xquery-serialization-30/#serphases
I think this sentence could if necessary be moved to a note; it expresses no
normative requirement.  (I am divided in my mind whether it SHOULD be moved to
a note; after some thought, I lean toward saying it should stay where it is,
since it's in text clearly marked as the definition of a term.  But we can make
it a note if the WG chooses.)
2 In section 5.1.9 XML Output Method: the normalization-form Parameter [2], one
of the bullet items in normative text reads:
    NFC specifies the serialized result will be in Normalization 
    Form C, using the rules specified in [Character Model for the 
    World Wide Web 1.0: Normalization].
[2] http://www.w3.org/TR/xslt-xquery-serialization-30/#XML_NORMALIZATION-FORM
I may be missing something, but I don't see any special rules for normalization
form C in the Character Model spec that apply to our situation.  At first
glance, what the Character Model spec provides that the Unicode definition of
NFC does not provide is a set of rules for getting there from legacy encodings.
That can be relevant for a parser, but not for a serializer.
I think the thing to do here is (a) replace the reference to Character Model
with a reference to UAX #15, and (b) add a note pointing to Character Model for
further information and rules for dealing with legacy encodings. 
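To illustrate why a reference to UAX #15 alone seems sufficient for a serializer, here is a minimal sketch in Python. It assumes the serializer already holds its output as Unicode text, so no legacy-encoding rules come into play. The function name serialize_nfc is hypothetical and is not taken from any spec; unicodedata.normalize is the Python standard library's implementation of the UAX #15 normalization forms.

```python
import unicodedata


def serialize_nfc(text: str) -> str:
    """Return text in Normalization Form C, as defined by UAX #15.

    Hypothetical helper: the input is assumed to be Unicode text already,
    so no transcoding from a legacy encoding is involved.
    """
    return unicodedata.normalize("NFC", text)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT (a decomposed sequence)
decomposed = "e\u0301"
composed = serialize_nfc(decomposed)
```

Here the two-character sequence is composed into the single character U+00E9, and applying the function a second time leaves the result unchanged.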
3 Again in section 5.1.9, another bullet item reads:
    fully-normalized specifies the serialized result will be in 
    fully normalized text, as specified in [Character Model for 
    the World Wide Web 1.0: Normalization].
The term 'fully normalized' is (as far as I can tell) not standard Unicode
terminology, but fortunately (as the note immediately below this passage
suggests) it is defined not only by the Character Model spec but also by the
XML 1.1 spec.  What we should do is (a) replace the reference to Character
Model with a reference to XML 1.1 and (b) add a note pointing to Character
Model for further information and motivation.
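For readers unfamiliar with the term, the following Python sketch shows roughly how 'fully normalized' differs from plain NFC. It is an approximation under stated assumptions, not the definition from XML 1.1 or Character Model: it treats only characters with a non-zero canonical combining class as composing characters, and it checks only the start of a single string rather than every relevant construct. All function names are hypothetical.

```python
import unicodedata


def is_nfc(text: str) -> bool:
    """True if the text is already in Normalization Form C."""
    return unicodedata.normalize("NFC", text) == text


def starts_with_combining_mark(text: str) -> bool:
    """True if the first character has a non-zero canonical combining class.

    Assumption: this covers only part of what the specs call a
    'composing character'.
    """
    return bool(text) and unicodedata.combining(text[0]) != 0


def looks_fully_normalized(text: str) -> bool:
    """Approximate check: in NFC and not starting with a combining mark."""
    return is_nfc(text) and not starts_with_combining_mark(text)
```

A string such as "\u0301abc" is in NFC, because the leading accent has nothing to compose with, but this check rejects it because it begins with a combining mark.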
Received on Wednesday, 26 March 2014 23:19:58 UTC