Validator result html may not be well formed XML (validator version 0.7.1)? from Peter Krantz on 2006-02-06 (www-validator@w3.org from February 2006)

From: Peter Krantz <peter.krantz@gmail.com>
Date: Mon, 6 Feb 2006 09:29:03 +0100
To: www-validator@w3.org
Message-ID: <7b9ad66d0602060029j3dfd4e24mc7f0ee6c6b6c056f@mail.gmail.com>

Hi!

I load and parse the validator's result HTML with 4Suite's XML tools for
Python. From time to time I notice that the result page contains invalid
tokens, which makes it impossible to load the XHTML into an XML parser.

An example:
http://validator.w3.org/check?&outline=1&verbose=1&uri=http://www.clii.com.cn/

If you load the result page of the link above into an XML parser you will
get: "line 1266, column 166: not well-formed (invalid token)".
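
For anyone who wants to reproduce this kind of check without 4Suite, here
is a minimal sketch using the expat bindings from Python's standard
library. That is a substitution on my part, not the 4Suite code I actually
run, and the function name first_wellformedness_error is only illustrative.
It returns the position and message of the first well-formedness error, or
None if the document parses cleanly.

```python
import xml.parsers.expat
from typing import Optional, Tuple


def first_wellformedness_error(document: bytes) -> Optional[Tuple[int, int, str]]:
    """Return (line, column, message) for the first XML well-formedness
    error in `document`, or None if expat parses it without complaint.

    Hypothetical helper for illustration; it uses the standard-library
    expat parser, not 4Suite.
    """
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this as the final chunk of input.
        parser.Parse(document, True)
    except xml.parsers.expat.ExpatError as err:
        # lineno is 1-based; offset is the column as reported by expat.
        message = xml.parsers.expat.ErrorString(err.code)
        return err.lineno, err.offset, message
    return None


# A control character such as 0x01 is not allowed in XML 1.0 content.
broken = b"<p>\x01</p>"
clean = b"<p>ok</p>"

broken_result = first_wellformedness_error(broken)
clean_result = first_wellformedness_error(clean)
```

Passing the raw bytes of the result page to this function (fetched, for
example, with urllib.request.urlopen) should show where the parser gives
up, assuming expat rejects the same bytes that 4Suite does.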

Am I doing something wrong, or does the validator have trouble with
certain source document encodings?

Kind regards,

Peter Krantz

Received on Monday, 6 February 2006 13:30:00 UTC