- From: Kang-Hao (Kenny) Lu <kanghaol@oupeng.com>
- Date: Fri, 06 Sep 2013 11:05:56 +0800
- To: Geoffrey Sneddon <foolistbar@googlemail.com>, WHAT Working Group <whatwg@whatwg.org>
(2013/09/06 6:08), Geoffrey Sneddon wrote:
> The phrasing content section states:
>
>> Text nodes and attribute values must consist of Unicode characters,
>> must not contain U+0000 characters, must not contain permanently
>> undefined Unicode characters (noncharacters), and must not contain
>> control characters other than space characters. This specification
>> includes extra constraints on the exact value of Text nodes and
>> attribute values depending on their precise context.
>
> And the pre-processing the input-stream section states:
>
>> Any occurrences of any characters in the ranges U+0001 to U+0008,
>> U+000E to U+001F, U+007F to U+009F, U+FDD0 to U+FDEF, and characters
>> U+000B, U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE,
>> U+3FFFF, U+4FFFE, U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF,
>> U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF, U+AFFFE,
>> U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF,
>> U+EFFFE, U+EFFFF, U+FFFFE, U+FFFFF, U+10FFFE, and U+10FFFF are parse
>> errors. These are all control characters or permanently undefined
>> Unicode characters (noncharacters).
>
> Note the first uses "Unicode characters", the second "characters"; the
> former excludes surrogates as a conformance requirement.
>
> Note that every disallowed non-surrogate character is a parse error.

Except U+0000, or am I missing something?

> Therefore, it would make sense to make surrogates parse errors.
>
> It should be noted that they can only occur in the input stream if they
> come from script (as they cannot be decoded from the input byte stream
> as the decoders will never emit a surrogate).

which means that this seems ... cumbersome ... to implement in a
conformance checker.

Which reminds me, does

# Conformance checkers must report at least one parse error
# condition to the user if one or more parse error conditions exist
# in the document and must not report parse error conditions if none
# exist in the document. Conformance checkers may report more than
# one parse error condition if more than one parse error condition
# exists in the document.

mean validator.nu and Firefox view source are non-conforming because
they do nothing about document.write()? I think we should exempt
conformance checkers from scripts instead.

Cheers,
Kenny
--
Web Specialist, Opera Sphinx Game Force, Oupeng Browser, Beijing
Try Oupeng: http://www.oupeng.com/
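
For reference, the code-point list quoted above can be checked mechanically. The following is only a rough Python sketch: the ranges are copied from the quoted pre-processing text, while the names `PARSE_ERROR_RANGES`, `is_listed_parse_error`, `is_surrogate` and `find_listed` are made up for illustration and do not come from the spec or from any checker. It shows the two gaps discussed in this thread: U+0000 and lone surrogates are both absent from the quoted list.

```python
# Ranges copied from the quoted "pre-processing the input stream" text.
# All names in this sketch are hypothetical, chosen for illustration only.
PARSE_ERROR_RANGES = [
    (0x0001, 0x0008),
    (0x000E, 0x001F),
    (0x007F, 0x009F),
    (0xFDD0, 0xFDEF),
]


def is_listed_parse_error(cp: int) -> bool:
    """Return True if code point cp appears in the quoted list."""
    if any(lo <= cp <= hi for lo, hi in PARSE_ERROR_RANGES):
        return True
    if cp == 0x000B:
        return True
    # U+FFFE, U+FFFF, U+1FFFE, ..., U+10FFFF: the last two code points
    # of each of the 17 planes.
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFF) in (0xFFFE, 0xFFFF)


def is_surrogate(cp: int) -> bool:
    """Return True for code points in the surrogate range U+D800..U+DFFF."""
    return 0xD800 <= cp <= 0xDFFF


def find_listed(text: str) -> list[tuple[int, int]]:
    """Return (index, code point) pairs for characters in the quoted list."""
    return [(i, ord(ch)) for i, ch in enumerate(text)
            if is_listed_parse_error(ord(ch))]


if __name__ == "__main__":
    # U+0000 and a lone surrogate are both skipped by the quoted list.
    print(is_listed_parse_error(0x0000))
    print(is_listed_parse_error(0xD800), is_surrogate(0xD800))
    print(find_listed("a\x00b\x01\ud800"))
```

A Python `str` can hold a lone surrogate, so the sample string stands in for text that reached the input stream from script, not from a byte-stream decoder.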
Received on Friday, 6 September 2013 03:06:27 UTC