Re: Latest CR-ready draft for publication

From: Rick Jelliffe <ricko@topologi.com>
Date: Thu, 19 Sep 2002 09:56:03 -0500
To: www-xml-blueberry-comments@w3.org

[Forwarding this comment against
to the comments list. paul]

> Nit: in the Introduction, next-to-last paragraph, there is:
>  "In order to improve the robustness of character detection,..."
> I believe this should be "...character *encoding* detection..."

Or, better, "in order to improve the detection of mislabelled
character encodings."  "Robustness" seems the wrong word.

Also, the Introduction does not mention normalization.

As a side issue, it would be consistent if the non-letter code points in the
Unicode Latin-1 block (which are occupied by the multiplication and division
signs in ISO 8859-1 and won't be allocated over) were treated the same
as all the other non-name characters in C1.  These code points
are often the only things that allow detection of incorrect encodings
between ISO 8859-n sets.  It only involves blocking D7 and F7.

[4]     NameStartChar := ":" | [A-Z] | "_" | [a-z] |
         [#xC0-#xD6] | [#xD8-#xF6] | [#xF8-#x2FF]
         | [#x370-#x37D] | [#x37F-#x1FFF] |
         [#x200C-#x200D] | [#x2070-#x218F] | [#x2C00-#x2FEF] |
         [#x3001-#xD7FF] | [#xF900-#xEFFFF]
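
To illustrate why blocking D7 and F7 helps, here is a minimal Python sketch
of the production above. It reads the range after #xF6 as [#xF8-#x2FF],
which is what blocking F7 implies. The names NAME_START_RANGES,
is_name_start_char and first_bad_name_start are hypothetical, chosen only
for this example, and for brevity every character of the name is tested
against NameStartChar rather than against a separate NameChar production.

```python
# Ranges transcribed from the proposed production [4].
# U+00D7 (multiplication sign) and U+00F7 (division sign) are left out.
NAME_START_RANGES = [
    (0x3A, 0x3A),      # ":"
    (0x41, 0x5A),      # A-Z
    (0x5F, 0x5F),      # "_"
    (0x61, 0x7A),      # a-z
    (0xC0, 0xD6),
    (0xD8, 0xF6),
    (0xF8, 0x2FF),     # assumed reading of the third Latin range
    (0x370, 0x37D),
    (0x37F, 0x1FFF),
    (0x200C, 0x200D),
    (0x2070, 0x218F),
    (0x2C00, 0x2FEF),
    (0x3001, 0xD7FF),
    (0xF900, 0xEFFFF),
]


def is_name_start_char(ch: str) -> bool:
    """Return True if the single character ch falls in one of the ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in NAME_START_RANGES)


def first_bad_name_start(raw: bytes, labelled_encoding: str):
    """Decode raw using the labelled encoding and return the first character
    that is not a NameStartChar, or None if every character passes."""
    for ch in raw.decode(labelled_encoding):
        if not is_name_start_char(ch):
            return ch
    return None


# A Greek name (chi, alpha) serialized as ISO 8859-7: bytes 0xF7 0xE1.
raw = "\u03c7\u03b1".encode("iso8859_7")

# Correctly labelled: both characters are Greek letters, so nothing is flagged.
print(first_bad_name_start(raw, "iso8859_7"))

# Mislabelled as ISO 8859-1: byte 0xF7 decodes to the division sign, which
# the production excludes, so the mislabelling surfaces as a name error.
print(repr(first_bad_name_start(raw, "latin-1")))
```

If F7 were allowed as a name character, the mislabelled document in this
sketch would parse without complaint, just with the wrong letters.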

I don't believe I have any other issues. Good on you all.

Rick Jelliffe
Received on Thursday, 19 September 2002 10:56:35 UTC