- From: John Cowan <cowan@mercury.ccil.org>
- Date: Mon, 17 Sep 2012 20:16:59 -0400
- To: "Henry S. Thompson" <ht@inf.ed.ac.uk>
- Cc: liam@w3.org, public-xml-core-wg <public-xml-core-wg@w3.org>
Henry S. Thompson scripsit:

> Of the 392 files with PIs, 40 were not well-formed (that is, 10.2%),
> with the following problems as reported by rxp [1]:

Probably only about half that many are real well-formedness (wf) errors.

> Error: Document ends too soon
> Error: EOE in PI [3 of these]
> Error: Expected ; after entity name, but got = [4 of these]
> Error: Expected > at end of entity declaration, but got -
> Error: Expected name, but got & for entity
> Error: Expected whitespace or tag end in start tag

Can't argue with those.

> Error: Input error: Illegal UTF-8 byte 2 <0x20>
> Error: Input error: Illegal UTF-8 byte 2 <0x20>
> Error: Input error: Illegal UTF-8 byte 2 <0x2e>
> Error: Input error: Illegal UTF-8 byte 2 <0x65>
> Error: Input error: Illegal UTF-8 start byte <0xa0>

Probably the results of blind transcoding, believing a junk Content-type
header, or other screwups. Only technically not-wf.

> Error: Input error: Illegal character <0x0> [11 of these]
> Error: Mismatched end tag: expected </abbr>, got </a>

Can't argue with these either.

> Error: Unknown declared encoding GB2312
> Error: Unknown declared encoding ISO8859-1
> Error: Unknown declared encoding TIS-620
> Error: Unknown declared encoding gb2312
> Error: Unknown declared encoding uft-8 [2 of these]
> Error: Unknown declared encoding windows-1251 [2 of these]
> Error: Unknown declared encoding windows-1252 [2 of these]
> Error: Unknown declared encoding x-user-defined

Those aren't wf errors, just limitations on rxp's ability to cope with
random encodings, though I grant that uft-8 is probably not a legitimate
encoding.

-- 
John Cowan  cowan@ccil.org  http://www.ccil.org/~cowan
I don't know half of you half as well as I should like, and I like less
than half of you half as well as you deserve.  --Bilbo
Received on Tuesday, 18 September 2012 00:17:22 UTC