Re: TAG Decision on Rescinding the request to the HTML WG to develop a polyglot guide from Sam Ruby on 2013-02-18 (public-html@w3.org from February 2013)

From: Sam Ruby <rubys@intertwingly.net>
Date: Mon, 18 Feb 2013 12:53:59 -0500
To: public-html@w3.org
Message-ID: <51226AB7.8080504@intertwingly.net>

On 02/18/2013 12:34 PM, David Carlisle wrote:
> On 13/02/2013 22:36, Eric J. Bowman wrote:
>> Apparently I need to make this point, again:  If there was no
>> interest in polyglot, there would be no HTML parser in libxml2; its
>> presence, and widespread use if xsl-list is any indication,
>> indicates otherwise.
>
> I actually think the polyglot spec is worth having (and probably I made as
> many bugzilla comments on it as anyone). I still have some issues with
> the wording but as a general idea I think it's fine...
>
> But I don't understand your comment there at all. The HTML parsers in
> libxml2 (or tagsoup, or Henri's validator.nu parsers in Java) are exactly
> the reason that some people (not entirely unreasonably) say that it
> isn't needed. If you can put an HTML parser in front of an XML
> tool-chain then you can pull in unrestricted HTML syntax and you have no
> need to produce HTML documents following the polyglot guidelines which
> are designed to allow an HTML document to be fed to the tool-chain via
> an XML parser.

I disagree.

Boy, I wish all "HTML parsers" supported unrestricted HTML syntax.
Henri's parser is better than tagsoup, which is much better than libxml2.

Heck, even Google's parser is buggy:

http://lists.w3.org/Archives/Public/www-archive/2013Feb/0059.html

At least in that case, the HTML spec declares such a page as invalid, 
and thereby attempts to encourage interoperability even in the face of 
imperfect implementations.

But that isn't the case for other interop problems.  Until that is 
addressed:

http://intertwingly.net/blog/2012/11/09/In-defence-of-Polyglot

> David

- Sam Ruby

Received on Monday, 18 February 2013 17:54:29 UTC