- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Mon, 3 Dec 2012 17:16:00 +0100
- To: Robin Berjon <robin@w3.org>
- Cc: Henri Sivonen <hsivonen@iki.fi>, public-html WG <public-html@w3.org>, www-tag@w3.org
Robin Berjon, Mon, 03 Dec 2012 11:35:38 +0100:

>> Regarding "2.1 How can an XML toolchain be used to consume HTML?"
>> http://www.w3.org/TR/html-xml-tf-report/#uc01,
>
> Saying "polyglot" here just doesn't help: very little real-world
> content uses it. Note that the section clearly looks at polyglot and
> gives a clear reason for not using it in this case.

I agree. So why did the report say "polyglot" here? Why did you,
negatively, bring it in? I see no relevance. To quote what you say at
the bottom of this letter: this was not one of "the uses it was
designed for".

>> Regarding "2.2 How can an HTML toolchain be used to consume XML?"
>> http://www.w3.org/TR/html-xml-tf-report/#uc02
>> TF says: "the most successful approach may be to simply translate
>> the XML to HTML5 before passing it to the HTML5 tool"
>> Verdict: How come this section didn't evaluate Polyglot Markup?
>
> "Processing a real XML document with an HTML5 parser is probably
> never going to be possible with complete fidelity." In general it's
> not a problem you can solve. And polyglot (rightfully IMHO) doesn't
> even try.

What is "a real XML document"? The XML/HTML TF report, under the next
point, point 2.3, points out that XHTML5 *is* real XML. If by "real
XML" the TF meant something that is not HTML5, then it is not HTML5 but
also not XHTML5. Thus it is not Polyglot Markup, in the strict sense.
And also not HTML5 or XHTML5 in the strict sense. But it is of course
possible to apply the principles of Polyglot Markup to extended
XHTML5/HTML5.

>> Regarding "2.3 How can islands of HTML be embedded in XML?"
>> http://www.w3.org/TR/html-xml-tf-report/#uc03
>> TF says: EITHER, create HTML as "well-formed XML" = "requirements
>> on the author" OR absolve the author by (having the tool)
>> escaping markup.
>> Verdict: How come you didn't mention having the tool output
>> Polyglot Markup?
>
> It pretty much says either use XHTML (in which case you don't need
> polyglot)

Those are your words.
The report doesn't discuss it. And if the purpose of the document -
this XML/HTML TF report - was *not* to discuss the use of polyglot
markup, then what is that report doing in the discussion of Polyglot
Markup?

> or embed the HTML as text (in which case you don't need
> polyglot). Recommending polyglot here would depend too much on the
> specifics of the usage, and in general wouldn't help.

First you say "use XHTML … then you don't need polyglot". Then in the
next sentence you say "as text … in which case you don't need XHTML".
To which I say: Fallacy. http://en.wikipedia.org/wiki/Fallacy

Or how does it follow from this that "you don't need polyglot"? The
only thing that follows from this is that the author has TWO options if
he has a polyglot in his hands (namely a choice between "as text" and
"use XHTML"), and ONE option if he has non-XHTML-compatible HTML in his
hands (namely "as text"). Is it a goal for you that the author must be
capable of making the right choice? To me it is a goal that the author
can't err regardless of what choice he makes. Which is why polyglot
would have been relevant to mention for this use case.

>> Regarding "2.4 How can islands of XML be embedded in HTML?"
>> http://www.w3.org/TR/html-xml-tf-report/#uc04
>> TF says: Use <script> as XML container and use JavaScript to make it
>> render in the DOM.
>> Verdict: It seems like Polyglot Markup does not discuss that approach.
>> If the TF document had purported to be an evaluation of
>> Polyglot Markup, you would have discussed it.
>> Also: I don't understand the last sentence: "Note also that
>> polyglot markup is not an aid here as it forbids arbitrary
>> XML content from the document." Does it? It doesn't any
>> more than HTML5 proper does: If you add something that
>> HTML5 doesn't permit, then it isn't HTML5 any more but
>> "extended HTML5". But clearly, it is possible to create
>> "extended polyglot markup" - just apply its principles.
>
> That section's advice is mostly missing a mention of the pitfalls of
> </script> IMHO. Including XML in <script> is definitely *not*
> something that polyglot should recommend since you'd get very
> different DOMs on either side. It's a useful technique when you know
> you'll be parsed as HTML - and therefore clearly outside polyglot.

OK. I agree on this one.

>> Regarding "2.5 How can XML be made more forgiving of errors?"
>> http://www.w3.org/TR/html-xml-tf-report/#uc05
>> TF says: XML5, error handling in XML etc.
>> Verdict: Provided that the goal of the task force (improved
>> "interoperability between HTML and XML") could be
>> helped by making XML fail in the exact way that
>> HTML fails, then why did you not discuss Polyglot
>> Markup as an option here?
>
> Because looking at potential future changes to XML is completely
> outside the scope of polyglot. It's also completely different from
> polyglot's goals.

Sure. But look at the goal of that task force, "interoperability
between HTML and XML": if someone produces an imperfect polyglot, it
would fail like HTML (if served as text/html, that is).

Also, the TF doesn't tell us why - or how - introducing HTML-like error
handling in XML improves the "interoperability between HTML and XML".
For instance - just to bring in a question that you seemingly find it
OK to ask about Polyglot: Could we just skip Henri's parser if we
introduce error handling in XML? Care to tell?

My motivation for bringing polyglot markup into this subject is very
much related to the final paragraph of the preceding point 2.4, "How
can islands of XML be embedded in HTML?". Because, in that paragraph,
the TF deviates from the subject, by pointing out that instead of
embedding XML in HTML (text/html), one might instead embed XML in XHTML
(application/xhtml+xml). (Voila!)
Against that background, I find it quite relevant to point out that,
hey, with regard to the problem of letting XML err as HTML, why not
instead serve the XML as polyglot text/html?

>> Verdict: The idea that this HTML parser could produce polyglot markup
>> (and no: not in order to pee in the tag soup ocean, but in order to be
>> a more useful parser in that tool chain!), is never discussed.
>
> I'm not even sure what it would mean for an HTML parser to produce
> polyglot markup.

That question might have been answered in my other message
(http://lists.w3.org/Archives/Public/public-html/2012Dec/0013). But to
clarify: I am/was not certain about the role of Henri's parser. I
thought that his parser would preprocess the tag soup HTML so that the
XML tools can then work with XML rather than with HTML. At any rate,
his parser - or the tool-chain as a whole - could produce a polyglot,
for the benefit of whoever/whichever is going to subsequently work on
that document.

But true: given that Polyglot Markup has very strict rules, it would be
demanding to, e.g., convert a document containing an embedded script to
something that conforms 100% to the rules of Polyglot Markup, since
Polyglot Markup doesn't permit embedding a script directly in the page
unless the script follows strict rules:
http://www.w3.org/TR/html-polyglot/#script-and-style

>> Over all, the report is trapped in some well known dichotomies. And
>> Polyglot Markup is not considered in a serious way. The Task Force's
>> report is a very thin basis for rescinding the request for robust,
>> polyglot markup.
>
> Actually, we considered polyglot seriously. We found polyglot to be
> useful for the uses it was designed for, but not applicable to all
> cases in which XML/HTML interoperability is desirable.

I think "the uses it was designed for" is a crucial statement.
I continue not to understand why the Report wastes ink on telling us
that Polyglot is not useful for the things it was not designed for -
and does so even in the concluding statement.

Also, to the extent that the goal of the Report was to say something
about Polyglot, I think that the report could have looked different if
the TF had had members from, shall we say, the polyglot community.
--
leif halvard silli
Received on Monday, 3 December 2012 16:16:36 UTC