- From: Ian Hickson <ian@hixie.ch>
- Date: Thu, 22 May 2008 11:48:39 +0000 (UTC)
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: HTMLWG Tracking WG <public-html@w3.org>
On Fri, 14 Mar 2008, Henri Sivonen wrote:
>
> Consuming an entity says:
> > Otherwise, if the number is zero, if the number is higher than 0x10FFFF, or
> > if it's one of the surrogate characters (characters in the range 0xD800 to
> > 0xDFFF), then this is a parse error; return a character token for the U+FFFD
> > REPLACEMENT CHARACTER character instead.
>
> Preprocessing the input stream says:
> > Any occurrences of any characters in the ranges U+0001 to U+0008, U+000E to
> > U+001F, U+007F to U+009F, U+D800 to U+DFFF, U+FDD0 to U+FDDF, and
> > characters U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE,
> > U+3FFFF, U+4FFFE, U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE,
> > U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF, U+AFFFE, U+AFFFF, U+BFFFE,
> > U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE,
> > U+FFFFF, U+10FFFE, and U+10FFFF are parse errors. (These are all control
> > characters or permanently undefined Unicode characters.)
>
> I suggest making characters that are parse errors in the input stream
> parse errors also when expanded from an NCR.

Done.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
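
For readers following along, here is a rough sketch of what the agreed change might look like in a tokenizer. It is only an illustration under stated assumptions: the function names are hypothetical, the ranges are copied from the text quoted above, and the choice to still emit the referenced character (while flagging a parse error) is an assumption, since the message says only that such references become parse errors.

```python
from typing import Tuple

# Ranges quoted from "Preprocessing the input stream" (inclusive bounds).
_DISALLOWED_RANGES = (
    (0x0001, 0x0008),
    (0x000E, 0x001F),
    (0x007F, 0x009F),
    (0xD800, 0xDFFF),
    (0xFDD0, 0xFDDF),
)


def is_input_stream_parse_error(cp: int) -> bool:
    """Hypothetical helper: True if cp is in the quoted parse-error set."""
    if any(lo <= cp <= hi for lo, hi in _DISALLOWED_RANGES):
        return True
    # U+FFFE, U+FFFF, U+1FFFE, ..., U+10FFFF are the last two code points
    # of each of the 17 planes, so one mask covers the whole list.
    return 0 <= cp <= 0x10FFFF and (cp & 0xFFFF) >= 0xFFFE


def check_numeric_reference(cp: int) -> Tuple[int, bool]:
    """Hypothetical helper: return (code point to emit, is_parse_error)."""
    # Existing rule from "Consuming an entity": replace with U+FFFD.
    if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        return 0xFFFD, True
    # Suggested rule: an NCR that expands to a character which would be a
    # parse error in the input stream is a parse error too.
    # Assumption: the character itself is still emitted unchanged.
    return cp, is_input_stream_parse_error(cp)
```

For example, `check_numeric_reference(0x1F)` flags a parse error for `&#x1F;`, while `check_numeric_reference(0x41)` passes `&#x41;` through without one.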
Received on Thursday, 22 May 2008 11:49:27 UTC