HTML/XML TF Report glosses over Polyglot Markup (Was: Statement why the Polyglot doc should be informative)

CC-ing the TAG,

Henri Sivonen, Fri, 30 Nov 2012 18:27:35 +0200:
> On Wed, Nov 28, 2012 at 12:41 PM, Sam Ruby … wrote:
>> We still have Formal Objections which have not been withdrawn.  Are we to
>> proceed within the working group or are we to forward this matter to the
>> Director?
> 
> As Maciej said, the purpose of
> http://lists.w3.org/Archives/Public/public-html/2012Nov/0006.html was
> to serve as a way to go to a poll within the WG. My expectation is to
> proceed to a poll within the WG. However, before doing so, it would be
> interesting to see the TAG’s take on
> http://lists.w3.org/Archives/Public/www-tag/2012Nov/0047.html .

I did not take part in the HTML/XML Task Force. But I am critical of 
what the report says (very little!) about polyglot markup.  Here are my 
comments on that report, from that angle:

Regarding "2.1 How can an XML toolchain be used to consume HTML?"
          http://www.w3.org/TR/html-xml-tf-report/#uc01
 TF says: In the problem refinement, the TF goes astray by replacing
          "HTML" with "Web" (how can an XML toolchain be used to
          consume the Web), quote: "HTML is not guaranteed (or even
          likely, […] to be well-formed". As soon as HTML is replaced
          with Web, Polyglot Markup in effect goes out the window.
          With that problem description, the only role left for
          Polyglot Markup is as an *output format* for the bespoke
          toolchain, but that use is never discussed.
 Verdict: W.r.t. Polyglot Markup, section 2.1 mixes up the arguments.

Regarding "2.2 How can an HTML toolchain be used to consume XML?"
          http://www.w3.org/TR/html-xml-tf-report/#uc02

 TF says: "the most successful approach may be to simply translate
          the XML to HTML5 before passing it to the HTML5 tool"
 Verdict: How come this section didn't evaluate Polyglot Markup?

Regarding "2.3 How can islands of HTML be embedded in XML?"
          http://www.w3.org/TR/html-xml-tf-report/#uc03

 TF says: EITHER, create HTML as "well-formed XML" = "requirements
          on the author" OR absolve the author by (having the tool)
          escaping markup.
 Verdict: How come you didn't mention having the tool output
          Polyglot Markup?
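
To make concrete what "having the tool output Polyglot Markup" would buy for such an island, here is a minimal sketch. Everything in it is an assumption for illustration: the fragment and the helper names are hypothetical, Python's html.parser is a lenient tokenizer rather than the HTML5 parsing algorithm, and comparing start-tag sequences is much weaker than the DOM equivalence that polyglot markup actually aims for.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical polyglot-style island: well-formed XML in the XHTML namespace.
POLYGLOT = '<div xmlns="http://www.w3.org/1999/xhtml"><p>Hello<br/>world</p></div>'
# Plain HTML-style island: fine for an HTML parser, not well-formed XML.
HTML_ONLY = '<div><p>Hello<br>world</p></div>'


class TagCollector(HTMLParser):
    """Records start tags in document order, as seen by an HTML tokenizer."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def html_tags(markup):
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_tags(markup):
    # Raises ET.ParseError if the markup is not well-formed XML.
    root = ET.fromstring(markup)
    # Strip the "{namespace}" prefix ElementTree adds to element names.
    return [el.tag.split('}')[-1] for el in root.iter()]


def reads_the_same(markup):
    """True if both toolchains accept the island and see the same elements."""
    try:
        return xml_tags(markup) == html_tags(markup)
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(reads_the_same(POLYGLOT))
    print(reads_the_same(HTML_ONLY))
```

The first island can be dropped into an XML document as-is, without escaping; the second cannot, which is the "requirements on the author" versus "escaping" choice described above.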

Regarding "2.4 How can islands of XML be embedded in HTML?" 
          http://www.w3.org/TR/html-xml-tf-report/#uc04

 TF says: Use <script> as XML container and use JavaScript to make it
          render in the DOM. 
 Verdict: It seems like Polyglot Markup does not discuss that approach.
          If the TF document had purported to be an evaluation of 
          Polyglot Markup, you would have discussed it.
    Also: I don't understand the last sentence: "Note also that 
          polyglot markup is not an aid here as it forbids arbitrary 
          XML content from the document." Does it? It doesn't any 
          more than HTML5 proper does: if you add something that
          HTML5 doesn't permit, then it isn't HTML5 any more but
          "extended HTML5". But clearly, it is possible to create
          "extended polyglot markup" - just apply its principles.

Regarding "2.5 How can XML be made more forgiving of errors?"
          http://www.w3.org/TR/html-xml-tf-report/#uc05

 TF says: XML5, error handling in XML etc.
 Verdict: Provided that the goal of the task force (improved
          "interoperability between HTML and XML") could be
          helped by making XML fail in the exact way that
          HTML fails, then why did you not discuss Polyglot
          Markup as an option here?
              
Regarding "3 Conclusions" 
          http://www.w3.org/TR/html-xml-tf-report/#conclusions

 TF says: Despite how little the body of the report deals with it,
          the last paragraph of the conclusions, on Polyglot Markup,
          takes up one third of them. It reflects what was said
          under section 2.1: "One line" holds up polyglot markup and
          the robustness principle as one approach. But then "Another
          line" views it from the angle of "the Web" and dismisses
          it: "If you want to consume HTML content, use an HTML
          parser that produces an XML-compatible DOM or event
          stream."
 Verdict: The idea that this HTML parser could produce polyglot
          markup (and no: not in order to pee in the tag soup ocean,
          but in order to be a more useful parser in that tool
          chain!) is never discussed.

Overall, the report is trapped in some well-known dichotomies, and 
Polyglot Markup is not considered in a serious way. The Task Force's 
report is a very thin basis for rescinding the request for robust, 
polyglot markup.
-- 
Leif Halvard Silli

Received on Friday, 30 November 2012 19:11:24 UTC