Character encoding errors (detailed review of parsing algorithm)

From: Henri Sivonen <hsivonen@iki.fi>
Date: Wed, 18 Jul 2007 12:02:41 +0300
Message-Id: <85BE6153-3AF3-49AF-8A5E-89E804A173CC@iki.fi>
To: "public-html@w3.org WG" <public-html@w3.org>

(This is part of my detailed review of the parsing algorithm.)

The spec says:
> Bytes or sequences of bytes in the original byte stream that could  
> not be converted to Unicode characters must be converted to U+FFFD  
> REPLACEMENT CHARACTER code points.

The spec should probably say explicitly that such byte sequences are  
parse errors.
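
To illustrate what that could mean for an implementation, here is a
minimal sketch, not spec text. It assumes Python's built-in codecs as the
decoder. The function name decode_with_parse_errors and the error-handler
name "hypothetical-report-and-replace" are made up for this example. Each
undecodable byte sequence is replaced with U+FFFD and also recorded as a
parse error. How many bytes the decoder groups into one error is left to
the decoder here, since the quoted text does not pin that down.

```python
import codecs


def decode_with_parse_errors(data: bytes, encoding: str = "utf-8"):
    """Decode `data`, replacing undecodable bytes with U+FFFD.

    Returns (text, parse_errors), where parse_errors is a list of
    (byte_offset, offending_bytes) tuples, one per replaced sequence.
    """
    parse_errors = []

    def handler(exc):
        # Only decoding failures are expected here; re-raise anything else.
        if not isinstance(exc, UnicodeDecodeError):
            raise exc
        # Record the offending span as a parse error ...
        parse_errors.append((exc.start, exc.object[exc.start:exc.end]))
        # ... then emit U+FFFD and resume decoding after the span.
        return ("\ufffd", exc.end)

    # Hypothetical handler name; registering again simply overrides the
    # previous registration, so each call gets its own error list.
    name = "hypothetical-report-and-replace"
    codecs.register_error(name, handler)
    text = data.decode(encoding, errors=name)
    return text, parse_errors


if __name__ == "__main__":
    text, errors = decode_with_parse_errors(b"a\xffb")
    print(repr(text), errors)
```

A conformance checker could report each entry in parse_errors to the
author, while a browser could ignore the list and just use the text.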

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
Received on Wednesday, 18 July 2007 09:02:50 UTC