- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Tue, 19 Jun 2007 08:50:52 +0300
- To: HTML WG <public-html@w3.org>
I could use test documents that are small, otherwise conforming HTML5 documents in an encoding where a character may take more than one byte (with the encoding declared using the BOM or <meta charset='...'>), except that they contain a byte sequence that is invalid for the declared encoding: non-shortest-form UTF-8, unpaired surrogates in UTF-16, or Shift_JIS broken in the ways Shift_JIS data typically gets broken. (I don't know exactly what I should be testing for with non-UTF encodings.)

If someone already has this kind of test data, please let me know. Thanks.

-- 
Henri Sivonen
hsivonen@iki.fi
http://hsivonen.iki.fi/
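For illustration only, the kinds of malformed input named in the message could be produced along these lines. This is a rough sketch, not existing test data: the constant and function names (OVERLONG_UTF8, make_test_document, and so on) are hypothetical, the specific byte values are just common examples of each failure class, and Python's strict decoders are used here merely as a sanity check that the bytes really are invalid.

```python
# Sketch: build small HTML documents that contain one invalid byte sequence.
# All names below are hypothetical and chosen only for this example.

# Non-shortest-form (overlong) UTF-8: "/" (U+002F) spelled with two bytes.
OVERLONG_UTF8 = b"\xC0\xAF"

# UTF-16LE: unpaired high surrogate U+D800 followed by an ordinary "a".
LONE_SURROGATE_UTF16LE = b"\x00\xD8" + "a".encode("utf-16-le")

# Shift_JIS: lead byte 0x81 followed by 0x20, which is not a valid trail byte.
BROKEN_SHIFT_JIS = b"\x81\x20"


def is_valid(data: bytes, codec: str) -> bool:
    """Return True if `data` decodes cleanly under a strict decoder."""
    try:
        data.decode(codec, errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def make_test_document(bad: bytes, codec: str, label: str = "", bom: bytes = b"") -> bytes:
    """Wrap `bad` in a minimal document, declared via <meta charset> or a BOM."""
    meta = f"<meta charset='{label}'>" if label else ""
    head = f"<!DOCTYPE html>{meta}<title>t</title><p>".encode(codec)
    tail = "</p>".encode(codec)
    return bom + head + bad + tail


utf8_doc = make_test_document(OVERLONG_UTF8, "utf-8", label="utf-8")
utf16_doc = make_test_document(LONE_SURROGATE_UTF16LE, "utf-16-le", bom=b"\xFF\xFE")
sjis_doc = make_test_document(BROKEN_SHIFT_JIS, "shift_jis", label="shift_jis")
```

Each resulting document is well-formed apart from the single injected sequence, so a decoder under test should report exactly one problem per file. Which Shift_JIS failures matter most in practice remains the open question from the message; the 0x81 0x20 pair is only one possibility.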
Received on Tuesday, 19 June 2007 05:47:56 UTC