- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Tue, 6 Nov 2012 19:01:44 +0100
- To: Lachlan Hunt <lachlan.hunt@lachy.id.au>
- Cc: public-html@w3.org
Lachlan Hunt, Tue, 06 Nov 2012 15:52:24 +0100:
> On 2012-11-06 15:17, Leif Halvard Silli wrote:
> The exceptions I listed are cases where the inclusion of certain
> markup results in necessary, but semantically insignificant
> differences from parsing, and where the markup is still conforming in
> both serialisations.

It is not *necessary* to allow CDATA in polyglot markup, nor to relax the whitespace restriction. But I can understand that you, as an author, find those restrictions (perhaps _very_) *impractical*. Hence, I could live with your relaxation of <style>, <script> and whitespace.

> Non-UTF-8 encodings are conforming in both
> serialisations and there is no need for such a restriction.

Polyglot Markup has a focus on authoring, on being a practical choice. Authors would be free to create such non-UTF-8 polyglot documents; no one would be able to prevent it. But those who do that kind of thing *and* want to retain the 'polyglot' badge as well do not, in fact, need that badge …

Also, I note that on the one side you advocate, for authoring reasons, relaxing the rules for whitespace and CDATA. On the other side you want to open up for encodings that would be very difficult for authors to deal with, given the restrictions on how they may be declared.

> I will maintain an objection to any normative definition of polyglot
> markup that imposes additional restrictions on conforming markup that
> are not derived directly from the conforming intersection of the HTML
> and XHTML serialisations.

If the authoring-driven desire for CDATA inside polyglot markup can be labelled as "derived directly from the conforming intersection of the HTML and XHTML serialisations", then how can one credibly claim that the UTF-8 restriction does not come straight from the HTML and XHTML serialisations as well? Because, after all, HTML5 forbids <meta charset="UTF-16"/> and <?xml version="1.0" encoding="FOO" ?> in 'text/html'. And it forbids <meta http-equiv="Content-Type" content="text/html; charset=FOO" /> and <meta charset="non-UTF-8" /> in XHTML5, except that it permits the latter element if its @charset value is "UTF-8" (<meta charset="UTF-8"/>).

> That is, if something is conforming in both serialisations and does
> not result in a significant semantic difference in interpretation
> between HTML and XML parsers, then it should be considered conforming
> polyglot markup.
>
> I have no objection, however, to strongly recommending the use of
> UTF-8, as long as it is non-normative.

For various reasons, including the vetting process it has been through, I am not going to back down from the UTF-8 requirement. But I am willing to make one concession: I would not be opposed to an informative note attached to the principles which said something like this:

"As long as one either uses the UTF-16 encoding (with a BOM) or controls the encoding externally (e.g. via HTTP), the rules of this specification would turn any XHTML document into an HTML-compatible one. However, as far as this specification is concerned, only UTF-8 encoded documents are considered to be conforming."

-- Leif Halvard Silli
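For readers following the thread, a minimal sketch of the kind of document under discussion: a UTF-8 polyglot page whose encoding declaration (<meta charset="UTF-8"/>) and script escaping are intended to parse identically in both serialisations. The element content is illustrative only, not taken from the specification:

```html
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <meta charset="UTF-8"/>
    <title>Polyglot sketch</title>
    <script>
    //<![CDATA[
    /* In XML parsing, the CDATA section protects the "<" below;
       in HTML parsing, the "//" comments out the CDATA marker. */
    if (1 < 2) { document.title = "same result in HTML and XML"; }
    //]]>
    </script>
  </head>
  <body>
    <p>Servable as text/html or application/xhtml+xml.</p>
  </body>
</html>
```

Note that the sketch carries no XML declaration and no non-UTF-8 encoding declaration, since both are among the constructs the thread notes as forbidden in one serialisation or the other.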
Received on Tuesday, 6 November 2012 18:02:20 UTC