
Re: Rescinding the request to the HTML WG to develop a polyglot guide

From: Noah Mendelsohn <nrm@arcanedomain.com>
Date: Tue, 04 Dec 2012 18:27:49 -0500
Message-ID: <50BE86F5.1020302@arcanedomain.com>
To: Henri Sivonen <hsivonen@iki.fi>
CC: "www-tag@w3.org List" <www-tag@w3.org>

Henri's note, which started this thread, quotes a part of the conclusion to 
the TAG's Task Force report. I think it's worth quoting the entire 
conclusion section, which is not long [1]:

========================================
3 Conclusions

The Task Force considered several areas of interoperability that arose in 
these use cases: consuming HTML with XML tools, consuming XML with HTML 
tools, and embedding islands of one within the other. As described above, 
there are well understood boundaries within which any solution to each use 
case can operate. And within those boundaries, there exists today a 
solution that, while perhaps not wholly satisfactory, sits within those 
boundaries. No wholly satisfying solution appears possible within the 
accepted constraints; it would appear that we have already achieved the 
practical solutions.

With respect to the question of making XML more forgiving to errors, it's 
clear that some work has been done in this area and that it is possible to 
articulate coherent proposals for such change. We recommend further study 
within the XML community before determining how best to explore these changes.

On the question of Polyglot markup, there seems to be little consensus. One 
line of argument suggests that, to the extent that it is practical to obey 
the Robustness principle, it makes sense to do so. That is, if you're 
generating HTML markup for the web, and you can generate Polyglot markup 
that is also directly consumable as XML, you should do so. Another line of 
argument suggests that even under the most optimistic of projections, so 
tiny a fraction of the web will ever be written in Polyglot that there's no 
practical benefit to pursuing it as a general strategy for consuming 
documents from the web. If you want to consume HTML content, use an HTML 
parser that produces an XML-compatible DOM or event stream.
========================================

Henri uses a quote from the above as (one of) the supporting arguments for 
asking the TAG to rescind its request that a polyglot specification be 
published "in TR space". We will consider that request on our TAG 
teleconference this Thursday.

Noah

[1] http://www.w3.org/TR/html-xml-tf-report/#conclusions

On 11/30/2012 11:16 AM, Henri Sivonen wrote:
> In March 2010, the following request from the TAG was conveyed to the HTML WG:
>
>>      The W3C TAG requests there should be in TR space a document
>>      which specifies how one can create a set of bits which can
>>      be served EITHER as text/html OR as application/xhtml+xml,
>>      which will work identically in a browser in both cases.
>>      (As Sam does on his web site.)
> http://lists.w3.org/Archives/Public/public-html/2010Mar/0703.html
>
> However, subsequently, the TAG requested the creation of an HTML–XML
> Task Force (of which I was a member) and the Task Force Report
> remarked “Another line of argument suggests that even under the most
> optimistic of projections, so tiny a fraction of the web will ever be
> written in Polyglot that there's no practical benefit to pursuing it
> as a general strategy for consuming documents from the web. If you
> want to consume HTML content, use an HTML parser that produces an
> XML-compatible DOM or event stream.”
> http://www.w3.org/TR/html-xml-tf-report/
>
> Considering that a Task Force created at the TAG’s request identified
> a non-polyglot-based approach of feeding HTML content into XML tooling
> and the alternative is more broadly applicable than polyglot, as it
> does not require the cooperation of the originator of the content,
> would the TAG, please, consider rescinding its earlier request to the
> HTML WG (quoted above) as having been obsoleted by later findings?
>
Received on Tuesday, 4 December 2012 23:28:18 GMT
