Validator result HTML may not be well-formed XML (validator version 0.7.1)?

Hi!

I load and parse the validator's result HTML with 4Suite's XML tools for
Python. From time to time I notice that the result page contains invalid
tokens, which make it impossible to load the XHTML into an XML parser.

An example:
http://validator.w3.org/check?&outline=1&verbose=1&uri=http://www.clii.com.cn/

If you load the result page for the link above into an XML parser, you will
get: "line 1266, column 166: not well-formed (invalid token)".
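
For anyone who wants to reproduce this without 4Suite, here is a minimal
sketch that uses the expat parser from the Python standard library instead.
It is not the code I actually run; the function name and the sample bytes
are made up for illustration. It only checks well-formedness and returns the
position of the first error together with the raw bytes of the offending
line, so they can be inspected.

```python
from xml.parsers import expat


def find_wellformedness_error(data):
    """Return None if `data` (bytes) is well-formed XML.

    Otherwise return (line, column, message, raw_line) for the first
    error expat reports. `line` is 1-based; `column` is the offset
    expat reports within that line.
    """
    parser = expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(data, True)
    except expat.ExpatError as err:
        lines = data.splitlines()
        # Guard against a line number past the end of the input.
        if 0 < err.lineno <= len(lines):
            raw_line = lines[err.lineno - 1]
        else:
            raw_line = b""
        return err.lineno, err.offset, expat.ErrorString(err.code), raw_line
    return None


if __name__ == "__main__":
    # Hypothetical sample: a control byte that XML 1.0 does not allow.
    sample = b"<a>\n<b>\x01</b></a>"
    print(find_wellformedness_error(sample))
```

To test the real case, one would pass in the bytes of the saved validator
result page. Printing raw_line with repr() should show which bytes the
parser rejects.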

Am I doing something wrong or does the validator have a hard time with
certain types of source document encoding?

Kind regards,

Peter Krantz

Received on Monday, 6 February 2006 13:30:00 UTC