- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Wed, 28 Jul 2010 20:28:27 +0300
- To: Henri Sivonen <hsivonen@iki.fi>
- Cc: HTMLwg <public-html@w3.org>, Eliot Graff <eliotgra@microsoft.com>, public-i18n-core@w3.org
Henri Sivonen, Mon, 26 Jul 2010 11:33:38 +0300:
> On Jul 23, 2010, at 17:26, Leif Halvard Silli wrote:
>
>> Proposal: Polyglot Markup should allow the document encoding to be set
>> via the encoding attribute of the XML declaration.
>
> I strongly object to proposals that either make syntax looking like
> an XML declaration conforming in HTML

It must be a literal XML declaration - no lookalike. So, what you are really after is to _change_ the current situation, where text/html permits the XML declaration, via XHTML 1.0, Appendix C. For example, the W3C-sponsored editor Amaya by default both inserts the XML declaration *and* uses the .html file suffix. The current state of affairs when it comes to Appendix C polyglots is that it is permitted. The cat has been out of the bag for 11 years.

> or that extend the HTML charset
> sniffing in any way that uses polyglotness as the rationale.

I agree that UAs should not have to sniff. And if both meta@charset and the XML declaration are present, then there will be no extension of the encoding sniffing. It would be fully in the tradition of polyglot documents to require both an HTML-compatible method and an XML-compatible method for setting the encoding. Just consider xml:lang and lang.

Thus a simple rule: If you use the XML encoding declaration, then an equivalent meta@charset element is a MUST. If this rules out <?xml version="1.0" encoding="UTF-16" ?> since <meta charset="UTF-16"/> is forbidden, then that's OK.

> This stuff is complex enough as it is.

Complexity evaluation is outside the pure spec inference task. One of the complexities is how to tell an HTML tool or an XML tool to produce polyglot syntax instead of native-only syntax. Here is my idea: Currently, meta@charset is meaningless in XHTML. But what if XHTML tools interpreted it as a signal to produce polyglot syntax? The presence of the XML declaration could play the same role for HTML tools.
Though, really, it is the presence of both artifacts that should be the polyglot indicator: the meta@charset together with the XML declaration should be a quite certain signal to both XHTML and HTML tools and authors. E.g. a typical polyglot - UTF-8 encoded, that is - could start like this:

]]
<?xml version="1.0" ?>
<!DOCTYPE html>
<head>
<meta charset="UTF-8"/>
[[

This is why we should discuss the XML declaration and the XML encoding declaration separately.

And, btw, if you would like to suggest another way to discern polyglot documents from HTML and XML documents, then I am all ears!
-- 
leif halvard silli
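
A minimal sketch, in Python, of how a tool might apply the two ideas above: the rule that an XML encoding declaration requires an equivalent meta@charset, and the presence of both the XML declaration and meta@charset as the polyglot indicator. The function name `check_polyglot_encoding`, the regular expressions, and the problem messages are all hypothetical; the patterns are deliberately loose (double-quoted attributes only) and are not a substitute for a real HTML or XML parser.

```python
import re

# Hypothetical, simplified patterns: double-quoted attributes only,
# XML declaration at the very start of the text (optional BOM allowed).
XML_DECL = re.compile(
    r'^\ufeff?<\?xml\s+version="1\.[0-9]+"(?:\s+encoding="([^"]+)")?[^?]*\?>'
)
META_CHARSET = re.compile(r'<meta\s+charset="([^"]+)"\s*/>', re.IGNORECASE)


def check_polyglot_encoding(text):
    """Return (looks_polyglot, problems) for the start of a document."""
    problems = []
    decl = XML_DECL.match(text)
    meta = META_CHARSET.search(text)
    xml_encoding = decl.group(1) if decl else None
    meta_charset = meta.group(1) if meta else None

    # Proposed rule: an XML encoding declaration requires an
    # equivalent meta@charset element.
    if xml_encoding is not None:
        if meta_charset is None:
            problems.append("encoding declaration without meta charset")
        elif meta_charset.lower() != xml_encoding.lower():
            problems.append("encoding declaration and meta charset disagree")

    # Proposed indicator: both artifacts present signals a polyglot.
    looks_polyglot = decl is not None and meta is not None
    return looks_polyglot, problems


example = (
    '<?xml version="1.0" ?>\n'
    '<!DOCTYPE html>\n'
    '<head>\n'
    '<meta charset="UTF-8"/>\n'
)
print(check_polyglot_encoding(example))
```

Under these assumptions, the example start shown in the message is reported as a polyglot candidate with no problems, because the XML declaration carries no encoding attribute and so the rule is not triggered.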
Received on Thursday, 29 July 2010 12:51:52 UTC