Char comment wrong?

From: Barclay, Daniel <daniel@fgm.com>
Date: Tue, 15 Sep 2009 13:54:27 -0400
Message-ID: <4AAFD4D3.1040606@fgm.com>
To: <xml-editor@w3.org>
In the XML specification, the Char production at
http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Char says:


[2] Char ::= #x9 | #xA | #xD
              | [#x20-#xD7FF] | [#xE000-#xFFFD]
              | [#x10000-#x10FFFF]

	/* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

The comment appears to be inconsistent with the production:
- Unicode appears to include all ASCII control characters.  (E.g.,
   code points U+0000 through U+001F are all assigned in the chart at
   http://unicode.org/charts/PDF/U0000.pdf.)
- The Char production excludes most of those control characters (all
   of them except #x9, #xA, and #xD).
- The comment works by listing exclusions (it starts with the set "any
   Unicode character" and narrows it down to the intended set).
- However, the comment does not mention the excluded control characters.
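
The gap can be checked mechanically.  The following is only a rough
sketch, not anything taken from the specification: it assumes that "any
Unicode character" means any code point from U+0000 through U+10FFFF,
and the names CHAR_RANGES, is_xml_char, and named_in_comment are made up
for illustration.  It lists the code points that the production rejects
but that the comment's exclusions do not account for.

```python
# Ranges copied from production [2] Char, as (low, high) inclusive pairs.
CHAR_RANGES = [
    (0x9, 0x9), (0xA, 0xA), (0xD, 0xD),
    (0x20, 0xD7FF), (0xE000, 0xFFFD), (0x10000, 0x10FFFF),
]


def is_xml_char(cp):
    """True if the code point matches the Char production."""
    return any(low <= cp <= high for low, high in CHAR_RANGES)


def named_in_comment(cp):
    """True if the comment names this exclusion: surrogates, FFFE, FFFF."""
    return 0xD800 <= cp <= 0xDFFF or cp in (0xFFFE, 0xFFFF)


# Assumption: "any Unicode character" = every code point U+0000..U+10FFFF.
unmentioned = [
    cp for cp in range(0x110000)
    if not is_xml_char(cp) and not named_in_comment(cp)
]

# 29 code points, all below U+0020: U+0000-U+0008, U+000B, U+000C,
# and U+000E-U+001F.
print(len(unmentioned), [f"U+{cp:04X}" for cp in unmentioned])
```

Under that assumption, every code point the comment fails to account
for is one of the control characters below U+0020.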



Daniel
-- 
(Plain text sometimes corrupted to HTML "courtesy" of Microsoft Exchange.)
Received on Tuesday, 15 September 2009 17:54:19 GMT