Re: HTML5 validation message from Ville Skyttä on 2010-11-09 (www-validator@w3.org from November 2010)

From: Ville Skyttä <ville.skytta@iki.fi>
Date: Tue, 9 Nov 2010 20:02:21 +0200
To: www-validator@w3.org
Message-Id: <201011092002.22167.ville.skytta@iki.fi>

On Tuesday 09 November 2010, Michael(tm) Smith wrote:
> Thomas Gambet <tgambet@w3.org>, 2010-11-08 08:53 -0500:
> > I'm forwarding the following message that was posted on Unicorn's track:
> > 
> > [[
> > When a HTML5 page validates[1] without errors there's a message that says
> > "This means that the resource in question identified itself as "HTML5"
> > and that we successfully performed a formal validation using an SGML,
> > HTML5 and/or XML Parser(s) (depending on the markup language used)."
> > 
> > Surely this cannot be true, as HTML5 has no SGML serialization, and the
> > switch between HTML and XML parser depends on the MIME type, not "the
> > markup language used".
> > 
> > In other words, either this text is wrong, or the validator is actually
> > doing it wrong.
> > 
> > [1]
> > http://validator.w3.org/check?uri=http%3A%2F%2Ffoolip.org%2Fmicrodatajs%
> > 2Flive%2F ]]
> 
> The validator.nu backend code that's used for HTML5 validation is
> definitely not ever using an SGML parser.
> 
> But I don't know whether there are cases when the W3C Perl code is doing
> SGML parsing on HTML5 documents for some reason.
> 
> I also wondered why it says "XML Parser(s)" instead of just "XML parser"
> but then I remembered that there are some cases where the Perl frontend
> runs a document through libxml2 to do a well-formedness check even when you
> are serving a document as text/html.
> 
> So I think that should be changed to:
> 
>   ...we successfully performed a formal validation using an HTML5 and/or
>   XML parser(s) (depending on the MIME type with which the document is
>   being served)

The development version at http://qa-dev.w3.org/wmvs/HEAD/ is now more 
explicit about which parsers were actually used.

I removed the "depending on" part for now because that information may have
been inaccurate. The actual reasons why particular parsers were used are not
yet available in human-readable form, and I believe the semi-human-readable
ones may not be descriptive or up to date enough. I don't like hardwiring
information like this into the templates: it is subject to bitrot (which we
see in action right now with the current "production" validator message).
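
As a rough illustration of the idea only (this is not the validator's actual
Perl code; the function name, the parser labels, and the exact wording are
hypothetical), the message fragment could be built from the list of parsers
that really ran, rather than from text hardwired into a template:

```python
def describe_parsers(parsers_used):
    """Build a message fragment naming only the parsers that actually ran.

    `parsers_used` is a hypothetical list of short parser labels, such as
    "HTML5" or "XML", recorded during one validation run.
    """
    # Drop duplicates while keeping the order in which parsers were used.
    names = list(dict.fromkeys(parsers_used))
    if not names:
        raise ValueError("no parser was recorded for this validation run")

    if len(names) == 1:
        listing = f"the {names[0]} parser"
    elif len(names) == 2:
        listing = f"the {names[0]} and {names[1]} parsers"
    else:
        listing = "the " + ", ".join(names[:-1]) + f", and {names[-1]} parsers"

    return f"we successfully performed a formal validation using {listing}"


if __name__ == "__main__":
    print(describe_parsers(["HTML5"]))
    print(describe_parsers(["HTML5", "XML"]))
```

Because the text is derived from what was recorded for the run, it cannot
claim a parser that was never used, which is the kind of bitrot a fixed
template invites.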

Received on Tuesday, 9 November 2010 18:02:55 UTC