- From: Barclay, Daniel <daniel@fgm.com>
- Date: Tue, 15 Sep 2009 13:54:27 -0400
- To: <xml-editor@w3.org>
Received on Tuesday, 15 September 2009 17:54:19 UTC
In the XML specification, the Char production at
http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Char says:
  [2] Char ::= #x9 | #xA | #xD
             | [#x20-#xD7FF] | [#xE000-#xFFFD]
             | [#x10000-#x10FFFF]
      /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
The comment appears to be inconsistent with the production:
- Unicode appears to include all ASCII control characters. (E.g.,
code points U+0000 through U+001F all are assigned in the chart at
http://unicode.org/charts/PDF/U0000.pdf.)
- The Char production excludes most of those control characters (all
except #x9, #xA, and #xD).
- The comment lists exclusions (to start with the set "any Unicode
character" and narrow it down to the correct set).
- However, the comment does not mention the excluded control characters.
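The mismatch can be checked mechanically. The following is only a minimal sketch in Python, not anything taken from the specification: the function name is_xml_char and the variable excluded_c0 are hypothetical, and the ranges are transcribed from production [2] as quoted above.

```python
def is_xml_char(cp: int) -> bool:
    """Return True if code point cp matches production [2] Char as quoted above.

    Hypothetical helper for illustration; the ranges are copied from the
    production, not from any library.
    """
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


# Code points U+0000 through U+001F that the production does not match.
excluded_c0 = [cp for cp in range(0x20) if not is_xml_char(cp)]

for cp in excluded_c0:
    print(f"U+{cp:04X} is not matched by Char")
```

Every code point this prints is excluded by the production, yet none falls under the exclusions the comment lists (the surrogate blocks, FFFE, and FFFF).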
Daniel
--
(Plain text sometimes corrupted to HTML "courtesy" of Microsoft Exchange.)