Re: Character encoding errors (detailed review of parsing algorithm)

On Wed, 18 Jul 2007, Henri Sivonen wrote:
> 
> (This is part of my detailed review of the parsing algorithm.)
> 
> The spec says:
> > Bytes or sequences of bytes in the original byte stream that could not be
> > converted to Unicode characters must be converted to U+FFFD REPLACEMENT
> > CHARACTER code points.
> 
> The spec should probably say explicitly that such byte sequences are 
> parse errors.

They're not parse errors; they're errors at the character encoding layer. 
IMHO that's out of scope for this spec. In particular, I don't think any of 
the text for parse errors need apply to encoding errors. The encoding 
specs should be the ones that make such errors non-conforming. No?
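
To illustrate the layering, here is a minimal Python sketch, not anything taken from the spec. It assumes UTF-8 input, and the helper name decode_bytes is hypothetical. The decoder replaces undecodable bytes with U+FFFD and reports the problem itself, so a parser sitting on top only ever sees Unicode characters:

```python
from typing import Tuple


def decode_bytes(raw: bytes, encoding: str = "utf-8") -> Tuple[str, bool]:
    """Hypothetical helper: return (text, had_encoding_error).

    The error is detected and reported at the decoding layer; the
    resulting text can be handed to a parser without further checks.
    """
    try:
        # Strict mode raises on any byte sequence that cannot be decoded.
        return raw.decode(encoding, errors="strict"), False
    except UnicodeDecodeError:
        # Replace mode substitutes U+FFFD REPLACEMENT CHARACTER instead.
        return raw.decode(encoding, errors="replace"), True


text, had_error = decode_bytes(b"caf\xff")
print(repr(text), had_error)
```

Here the byte 0xFF is never valid in UTF-8, so the decoder flags it and the caller gets "caf" followed by U+FFFD. Whether that flag makes the document non-conforming is then a question for the encoding layer, not the tokenizer.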

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'

Received on Wednesday, 1 August 2007 05:12:18 UTC