- From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
- Date: Thu, 29 Jul 2010 15:30:02 +0200
- To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- CC: HTMLwg <public-html@w3.org>, Eliot Graff <eliotgra@microsoft.com>, public-i18n-core@w3.org
On 2010-07-28 19:17, Leif Halvard Silli wrote:
> Lachlan Hunt, Tue, 27 Jul 2010 13:21:03 +0200:
>> On 2010-07-23 16:26, Leif Halvard Silli wrote:
>>> Proposal: Polyglot Markup should allow the document encoding to be set
>>> via the encoding attribute of the XML declaration. The XML declaration,
>>> including the encoding attribute, thus becomes a HTML5 extension,
>>> whenever polyglot markup is being consumed as HTML. (See my previous
>>> letter to Sam, about the XML declaration as polyglot markup indicator.)
>>
>> I object to this because permitting the XML declaration would only
>> serve to pollute the document with unnecessary markup,
>
> A polyglot may be served as XHTML. XML 1.0 does not consider the XML
> declaration unnecessary pollution.

A polyglot may be served as HTML too. HTML5 does consider the XML
declaration to be non-conformant, and including it is unnecessary
pollution.

> There are several things in a polyglot that is unnecessary from a
> purist HTML point of view!

The XML-inspired talismans in the HTML syntax are only permitted to the
extent that they are required for XHTML compatibility. An XML
declaration is not required in XML when the encoding is UTF-8 or UTF-16,
nor when the encoding is declared externally, and so there is no
requirement to permit it in HTML for the purpose of polyglot documents.

>> and to mislead authors about how the encoding of a file is actually
>> determined.
>
> I agree that it was bad of me to hint that a HTML consumed file should
> be able to rely on the XML encoding declaration only. To remove any
> doubt, I emphasize - stronger - that if the XML encoding declaration is
> used, then the HTML encoding declaration - meta@charset - must also be
> used.

Authors learn by copying and pasting. If they see lots of markup in the
wild using the XML declaration in HTML, and it appears to declare the
encoding, they will copy it into their own documents and not understand
that it doesn't do what they think.

We've seen this scenario before, when XHTML 1.0 started becoming popular
and lots of documents were unnecessarily copying the XML declaration
from each other, with many people falsely thinking that it either meant
XML parsing would be used by browsers that supported it or that it
declared the encoding. The practice only died out after people started
realising it triggered quirks mode in IE6. We have no reason to start
introducing it again.

>> There have been many observed instances of otherwise useless markup
>> being used by misled authors in ways that don't actually do anything.
>> Many of these cases have now been made optional or obsolete in HTML5
>> because of the wasted effort they were causing, and so introducing
>> new markup with no real purpose would not be wise.
>
> It is permitted in HTML already, under XHTML 1.0, Appendix C.

Appendix C explicitly states:

   "For compatibility with these types of legacy browsers, you may want
    to avoid using processing instructions and XML declarations"

But Appendix C contains no normative requirements either, and so it
can't permit or deny anything. It only provides recommendations. It is
also irrelevant, because HTML5 does not permit the XML declaration: it
would be parsed as a bogus comment. Permitting it would thus require
unnecessarily complicated changes to the parsing requirements for no
benefit whatsoever.

> The XML declaration would not be generally permitted in HTML - it would
> only be permitted in polyglot markup.

There is no way to make some syntax conforming for polyglot documents
only. Such a requirement is unenforceable because the conforming
polyglot document syntax is, and should remain, only the intersection of
HTML and XHTML syntax.

-- 
Lachlan Hunt - Opera Software
http://lachy.id.au/
http://www.opera.com/

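
As a footnote to the point above about when XML actually needs an XML declaration, the rule can be written down as a small check. This is only a sketch of the rule as stated in this message (UTF-8, UTF-16, or an externally declared encoding need no declaration); the function name and parameters are hypothetical and are not part of any specification or library.

```python
def xml_declaration_required(encoding: str, declared_externally: bool = False) -> bool:
    """Return True if an XML document would need an encoding declaration.

    Hypothetical helper illustrating the rule stated in the message:
    no declaration is needed for UTF-8 or UTF-16, nor when the encoding
    is supplied externally (for example by a transport-level charset).
    """
    if declared_externally:
        return False
    # Encoding names are compared case-insensitively.
    return encoding.strip().lower() not in {"utf-8", "utf-16"}


# A polyglot document saved as UTF-8 needs no XML declaration.
print(xml_declaration_required("UTF-8"))
# A legacy encoding with no external declaration would need one in XML.
print(xml_declaration_required("ISO-8859-1"))
# The same legacy encoding, declared externally, would not.
print(xml_declaration_required("ISO-8859-1", declared_externally=True))
```

Under these assumptions, a polyglot author who sticks to UTF-8 never reaches the case where the declaration would be needed on the XML side.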
Received on Thursday, 29 July 2010 13:30:36 UTC