- From: Felix Sasaki <fsasaki@w3.org>
- Date: Wed, 21 Dec 2005 15:02:22 +0900
- To: "Jim Melton" <jim.melton@acm.org>, www-i18n-comments@w3.org
- Cc: w3c-xsl-query@w3.org, "member-i18n-core@w3.org" <member-i18n-core@w3.org>
Jim,

Thank you very much for your comments. The i18n core wg is already in
"holiday", but we will discuss your comments at the beginning of next
year. Have a merry Christmas and a good new year.

Regards, Felix.

On Wed, 21 Dec 2005 08:57:44 +0900, Jim Melton <jim.melton@acm.org> wrote:

> Gentlepeople,
>
> I have found a few cycles to review the Working Draft of the Character
> Model for the World Wide Web 1.0: Normalization, (hereinafter
> "Normalization") dated 27 October, 2005. These comments are personal,
> and do not necessarily represent the opinions of the XML Query Working
> Group, the XSL Working Group, or Oracle Corp. If some or all of these
> comments are endorsed by any of those organizations, then you will
> receive them separately as comments from the appropriate organization.
>
> (1) In section 2, Conformance, the list of specification conformance
> criteria include: "make it a conformance requirement for
> implementations to conform to this document", and "make it a
> conformance requirement for content to conform to this document".
> Would you clarify (perhaps only as a response to this message) whether
> or not the XQuery 1.0, XPath 2.0, and XSLT 2.0 suite of specifications
> would be cited as non-conforming to this specification if (as I believe
> to be the case) they do not contain an explicit statement of those two
> criteria?
>
> (2) In section 3.2.3, Include-normalized text, bullet 2 uses the phrase
> "clause 1 above". I believe that most readers will better understand
> your meaning if you replace that with "bullet 1 above" or "list item 1
> above". To many readers, the word "clause" refers either to a major
> subdivision of a document (e.g., a chapter) or to a relatively short
> phrase such as a portion of a sentence (e.g., the noun clause).
>
> (3) In section 3.2.4, Fully-normalized text, first numbered list,
> bullet 1 says that a composing character is "the second character in
> the canonical decomposition mapping of some character". If there are
> characters in Unicode that are made of a "base character" plus two or
> more composing characters (I cannot claim to be positive that such
> characters exist, but I think that Hangul characters are often
> decomposed into three or more Jamo; there may be other examples), then
> surely "a composing character" would be "each character after the first
> in the canonical decomposition mapping of some character".
>
> (4) In section 3.2.4, Fully-normalized text, first numbered list,
> bullet 1 refers to "some character that is not listed in the
> Composition Exclusion Table defined in [UTR #15]". However, following
> the link to the most recent version of UTR #15, the section of that
> document whose title is "Composition Exclusion Table" contains neither
> a table nor a list of characters. While this is an apparent failure of
> UTR #15, the dependence on that section of UTR #15 cascades that
> failure into Normalization. However, there is (in section 6 of UTR #15)
> a (not terribly obvious) reference to "the Composition Exclusion Table
> [Exclusions]". The References entry with that name (Exclusions)
> contains pointers to several versions of such a table, the latest of
> which is available at
> <http://www.unicode.org/Public/UNIDATA/CompositionExclusions.txt>.
> It would have seemed a Very Good Idea for Normalization to point
> directly to this file, perhaps in addition to the reference directly to
> UTR #15 section 6.
>
> (5) In section 3.2.4, Fully-normalized text, second numbered list,
> bullet 2 uses the phrase "clause 1 above". I believe that most readers
> will better understand your meaning if you replace that with "bullet 1
> above" or "list item 1 above". To many readers, the word "clause"
> refers either to a major subdivision of a document (e.g., a chapter) or
> to a relatively short phrase such as a portion of a sentence (e.g., the
> noun clause).
>
> (6) In section 3.2.4, Fully-normalized text, the paragraph beginning
> "Identification of the constructs..." includes the statement that "it
> is the responsibility of the specification for a language to specify
> exactly what constitutes a relevant construct". Could you please
> clarify whether or not the XQuery 1.0, XPath 2.0, and XSLT 2.0 suite of
> specifications would be cited as non-conforming to this specification
> if (as I believe to be the case) they do not contain any such explicit
> specification?
>
> (7) In section 3.2.7, Certified and suspect text, the NOTE begins with
> the statement "To normalize text, it is in general sufficient to store
> the last seen character...". Perhaps I've missed something important
> earlier in this specification, but I have no idea what that statement
> means. One way of explaining it is to use the example of text
> "C combining-cedilla". When processing that text, I store the last seen
> character (combining-cedilla). And, voilà, the text is normalized. But
> that obviously is not the case. So what does that statement mean? Could
> it be expressed in a less ambiguous manner?
>
> (8) In section 3.4, Responsibility for normalization, item C303
> includes an Example that uses the notations "xf:concat" and
> "xf:substring". In both cases (because this document does not define
> any namespace prefixes associated with the namespace name associated
> with XPath/XQuery functions), the "xf" should be replaced with "fn",
> which is the conventional prefix used for that namespace.
>
> (9) In section 4, String identity matching, item C312, list item 1
> includes the statement "In accordance with section
> <http://www.w3.org/TR/2005/WD-charmod-norm-20051027/#sec-Normalization>3
> Normalization, this step MUST be performed by the producers of the
> strings to be compared." But section 3 does not make such a requirement
> (it did so in earlier drafts, but has been changed in this draft). At
> the very least, that use of "MUST" must (pun intended) be replaced by
> "SHOULD". Furthermore, the requirement to use "Early uniform
> normalization" might be correct because of the use of "as if" in the
> preceding paragraph, but (as section 3 makes clear) late normalization
> will produce identical results.
>
> (10) In appendix A, the reference to XQuery Operators includes an
> outdated list of editors. Jonathan Robie is no longer cited as an
> editor of that specification. Furthermore, the most recent edition is
> now dated 4 November, 2005, and is a Candidate Recommendation. (Of
> course, because Normalization was published earlier than that date, you
> could not have known this fact; the next publication of Normalization
> should make this change.)
>
> (11) In Appendix B, the final NOTE: says that certain characters may be
> displayed as a blank or as a blank rectangle. In some situations (e.g.,
> Firefox 1.0.4 on my system without any font that covers Sinhala), a
> question mark ("?") is displayed. It might be appropriate to include
> that possibility in this NOTE.
>
>
> Hope this helps,
> Jim
>
> ========================================================================
> Jim Melton --- Editor of ISO/IEC 9075-* (SQL)     Phone: +1.801.942.0144
> Co-Chair, W3C XML Query WG; F&O (etc.) editor     Fax  : +1.801.942.3345
> Oracle Corporation        Oracle Email: jim dot melton at oracle dot com
> 1930 Viscounti Drive      Standards email: jim dot melton at acm dot org
> Sandy, UT 84093-1063 USA  Personal email: jim at melton dot name
> ========================================================================
> = Facts are facts.  But any opinions expressed are the opinions      =
> = only of myself and may or may not reflect the opinions of anybody  =
> = else with whom I may or may not have discussed the issues at hand. =
> ========================================================================
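
The decomposition questions raised in comments (3) and (7) can be checked against an existing implementation of Unicode normalization. What follows is a minimal sketch, assuming Python's standard `unicodedata` module as that implementation. The helper `code_points` and the sample characters U+1E08 and U+D55C are chosen here for illustration and do not appear in the message above.

```python
import unicodedata


def code_points(text):
    """Return the code points of text as a readable string (illustrative helper)."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


# Comment (7): "C" followed by COMBINING CEDILLA (U+0327).
# Under NFC the pair composes into the single character U+00C7, so the
# base character and the combining mark must be considered together.
c_cedilla = "C\u0327"
nfc_c_cedilla = unicodedata.normalize("NFC", c_cedilla)

# Comment (3): characters whose full canonical decomposition has more
# than two characters.
# U+1E08 is LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE.
nfd_latin = unicodedata.normalize("NFD", "\u1E08")
# U+D55C is a Hangul syllable made of three Jamo.
nfd_hangul = unicodedata.normalize("NFD", "\uD55C")

print("NFC of C + cedilla:", code_points(nfc_c_cedilla))
print("NFD of U+1E08:     ", code_points(nfd_latin))
print("NFD of U+D55C:     ", code_points(nfd_hangul))
```

In this sketch both sample characters decompose into three code points under NFD, while "C" plus the combining cedilla composes into one code point under NFC.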
Received on Wednesday, 21 December 2005 06:02:36 UTC