Re: Latest CR-ready draft for publication from Rick Jelliffe on 2002-09-19 (www-xml-blueberry-comments@w3.org from September 2002)

From: Rick Jelliffe <ricko@topologi.com>
Date: Thu, 19 Sep 2002 09:56:03 -0500
To: www-xml-blueberry-comments@w3.org
Message-Id: <4.3.2.7.2.20020919095515.02756248@172.27.10.30>

[Forwarding this comment against
http://www.w3.org/XML/Group/2002/09/CR-xml11-20020912
to the comments list. paul]

> Nit: in the Introduction, next-to-last paragraph, there is:
>  "In order to improve the robustness of character detection,..."
> I believe this should be "...character *encoding* detection..."

Or, better, "in order to improve the detection of mislabelled
character encodings."    Robustness seems the wrong word
somehow.

Also, the Introduction does not mention normalization.

As a side issue, it would be consistent if the non-letter code points in the
Unicode Latin-1 block (which are occupied by the multiplication and division
signs in ISO 8859-1 and won't be allocated over) were treated the same
as all the other non-name characters in C1.  These code points
are often the only things that allow detection of incorrect encodings
between ISO 8859-n sets.  It only involves blocking #xD7 and #xF7.

[4]     NameStartChar := ":" | [A-Z] | "_" | [a-z] |
         [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF]
         | [#x370-#x37D] | [#x37F-#x1FFF] |
         [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
         [#x3001-#xD7FF] | [#xF900-#xEFFFF]
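
To illustrate why blocking those two code points helps, here is a minimal
Python sketch. It assumes the ranges of production [4] exactly as quoted
above, reading the third range as [#xF8-#x2FF]. The names
NAME_START_RANGES, is_name_start_char and mislabelled_as_latin1 are
hypothetical helpers for this illustration, not part of any XML library.
It shows a Greek chi, encoded as ISO 8859-7 but wrongly labelled as
ISO 8859-1, turning into the division sign, which the production rejects.

```python
# Ranges transcribed from production [4] as quoted in this message.
# Assumption: the third high range is #xF8-#x2FF, so #xD7 and #xF7 are blocked.
NAME_START_RANGES = [
    (0x3A, 0x3A),      # ":"
    (0x41, 0x5A),      # A-Z
    (0x5F, 0x5F),      # "_"
    (0x61, 0x7A),      # a-z
    (0xC0, 0xD6),
    (0xD8, 0xF6),
    (0xF8, 0x2FF),
    (0x370, 0x37D),
    (0x37F, 0x1FFF),
    (0x200C, 0x200D),
    (0x2070, 0x218F),
    (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF),
    (0xF900, 0xEFFFF),
]


def is_name_start_char(ch: str) -> bool:
    """Return True if the single character ch falls in one of the ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NAME_START_RANGES)


def mislabelled_as_latin1(text: str, real_encoding: str) -> str:
    """Encode text in its real encoding, then decode it as ISO 8859-1,
    which is what a parser does when the encoding label is wrong."""
    return text.encode(real_encoding).decode("latin_1")


if __name__ == "__main__":
    chi = "\u03c7"  # GREEK SMALL LETTER CHI, byte 0xF7 in ISO 8859-7
    seen = mislabelled_as_latin1(chi, "iso8859_7")
    print(f"correctly decoded: U+{ord(chi):04X}, name start: {is_name_start_char(chi)}")
    print(f"mislabelled:       U+{ord(seen):04X}, name start: {is_name_start_char(seen)}")
```

With #xD7 and #xF7 excluded, a name beginning with that byte fails the
name check under the wrong label, so the mislabelling surfaces as a
well-formedness error rather than passing silently.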

I don't believe I have any other issues. Good on you all.

Cheers
Rick Jelliffe

Received on Thursday, 19 September 2002 10:56:35 UTC