Re: Validator misbehavior on HTML sent as XHTML

From: Benjamin Niemann <pink@odahoda.de>
Date: Thu, 09 Aug 2007 23:03:59 +0200
To: www-validator@w3.org
Message-ID: <f9fvfv$s7u$1@sea.gmane.org>

Hi,

Nikita The Spider The Spider wrote:

> I've been running a few edge cases through the validator and I've come
> across one that the validator doesn't like. The document in question
> is a short, valid HTML 4.01 Strict document that gives the validator
> fits when I send it with a media type of application/xhtml+xml.
> Specifically, the validator reports "Validation Output: 6 Errors" and
> then proceeds to report hundreds of errors on lines that don't exist
> in the document.
> 
> The document in question is here:
> http://NikitaTheSpider.com/boneyard/temp/070808/nonsense.xhtml
> 
> And here's the validation URL for it:
> http://validator.w3.org/check?uri=http%3A%2F%2Fnikitathespider.com%2Fboneyard%2Ftemp%2F070808%2Fnonsense.xhtml&charset=%28detect+automatically%29&doctype=Inline&ss=1&group=0
> 
> I realize that sending HTML as application/xhtml+xml is a nonsensical
> thing to do 

I've seen worse things on the web ;)

> and the validator is right to tell me that ("Contradictory 
> Parse Modes Detected!") but the actual output is clearly the result of
> parsing something other than my document.

These errors are actually real errors, and they come from the HTML DTD when it
is parsed in XML mode. SGML and XML DTDs are similar, but, just as with
document markup, XML allows only a subset of the SGML constructs.
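
To illustrate, here is a minimal Python sketch (not the validator's own code)
that feeds two element declarations to an XML parser via the standard
xml.parsers.expat module. The helper name try_xml_parse and the simplified
declarations are my own assumptions; the SGML-style one only imitates the
tag-omission indicators used in the HTML 4.01 DTD.

```python
from xml.parsers import expat


def try_xml_parse(dtd_declaration):
    """Parse a tiny document whose internal DTD subset holds one declaration.

    Hypothetical helper: returns None if the XML parser accepts the
    declaration, otherwise the parser's error message.
    """
    doc = "<!DOCTYPE p [\n" + dtd_declaration + "\n]>\n<p>text</p>"
    parser = expat.ParserCreate()
    try:
        parser.Parse(doc, True)
    except expat.ExpatError as err:
        return str(err)
    return None


# SGML style (simplified): "- O" are tag-omission indicators, which
# XML DTD syntax does not allow.
sgml_style = "<!ELEMENT P - O (#PCDATA)>"

# XML style: the same kind of declaration without the indicators.
xml_style = "<!ELEMENT p (#PCDATA)>"

print("XML-style declaration: ", try_xml_parse(xml_style))
print("SGML-style declaration:", try_xml_parse(sgml_style))
```

The XML-style declaration should be accepted, while the SGML-style one should
be rejected with an error located inside the DTD text rather than in the
document body.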

As it looks, the validator is not (yet) able to correctly report errors from
different entities. The line numbers look as if they point to the correct
lines of the DTD, but column 0 is obviously bogus. It also counts only the
errors in the nonsense.xhtml entity. And finally, it tries to link errors
from the DTD to the document source, which is not the source of these
errors.
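
One way a report could keep such errors apart is to group them by the entity
they came from. The following is only a rough Python sketch of that idea; the
ParseError record, the function names and the sample data are all made up for
illustration and have nothing to do with the validator's actual internals.

```python
from collections import defaultdict
from typing import NamedTuple


class ParseError(NamedTuple):
    """Hypothetical error record; the field names are assumptions."""
    entity: str   # identifier of the entity in which the error occurred
    line: int
    column: int
    message: str


def group_by_entity(errors):
    """Return a dict mapping each entity to the list of its errors."""
    grouped = defaultdict(list)
    for err in errors:
        grouped[err.entity].append(err)
    return dict(grouped)


def count_errors(errors, document_entity):
    """Return (errors in the document entity, errors in other entities)."""
    grouped = group_by_entity(errors)
    in_document = len(grouped.get(document_entity, []))
    return in_document, len(errors) - in_document


# Made-up sample data: one error in the document, two in the DTD.
sample = [
    ParseError("nonsense.xhtml", 3, 7, "example document error"),
    ParseError("strict.dtd", 81, 0, "example DTD error"),
    ParseError("strict.dtd", 85, 0, "example DTD error"),
]

print(count_errors(sample, "nonsense.xhtml"))
```

With the errors grouped like this, only those belonging to the document entity
would be linked to the document source, and the rest could be listed under the
entity they actually came from.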



-- 
Benjamin Niemann
Email: pink at odahoda dot de
WWW: http://pink.odahoda.de/
Received on Thursday, 9 August 2007 21:04:37 GMT
