- From: Barclay, Daniel <daniel@fgm.com>
- Date: Tue, 15 Sep 2009 13:54:27 -0400
- To: <xml-editor@w3.org>
Received on Tuesday, 15 September 2009 17:54:19 UTC
In the XML specification, the Char production at
http://www.w3.org/TR/2008/REC-xml-20081126/#NT-Char says:

  [2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
      /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */

The comment appears to be inconsistent:

- Unicode appears to include all ASCII control characters. (E.g., code
  points U+0000 through U+001F all are assigned in the chart at
  http://unicode.org/charts/PDF/U0000.pdf.)
- The Char production excludes some (most) of those control characters.
- The comment lists exclusions (to start with the set "any Unicode
  character" and narrow it down to the correct set).
- However, the comment does not mention the excluded control characters.

Daniel
--
(Plain text sometimes corrupted to HTML "courtesy" of Microsoft Exchange.)
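To make the mismatch concrete, here is a minimal Python sketch that encodes the ranges of production [2] and lists the code points below U+0020 that it rejects. The names CHAR_RANGES, is_xml_char, and excluded_c0_controls are made up for this illustration and do not come from the specification; only the ranges themselves are copied from the production quoted above.

```python
# Ranges copied from production [2] Char as quoted in this message.
# The names below are hypothetical, chosen only for this illustration.
CHAR_RANGES = [
    (0x9, 0x9),
    (0xA, 0xA),
    (0xD, 0xD),
    (0x20, 0xD7FF),
    (0xE000, 0xFFFD),
    (0x10000, 0x10FFFF),
]


def is_xml_char(cp):
    """Return True if code point cp falls in one of the Char ranges."""
    return any(lo <= cp <= hi for lo, hi in CHAR_RANGES)


def excluded_c0_controls():
    """Code points U+0000..U+001F that the Char production does not match."""
    return [cp for cp in range(0x20) if not is_xml_char(cp)]


if __name__ == "__main__":
    # Print the control characters rejected by the production.
    print(" ".join("U+%04X" % cp for cp in excluded_c0_controls()))
```

Under these ranges, only U+0009, U+000A, and U+000D survive from the U+0000 through U+001F block; the rest are excluded by the production but are not named in its comment.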