
Re: Polyglot Markup/XML encoding declaration

From: Lachlan Hunt <lachlan.hunt@lachy.id.au>
Date: Tue, 27 Jul 2010 13:21:03 +0200
Message-ID: <4C4EC11F.9030906@lachy.id.au>
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
CC: HTMLwg <public-html@w3.org>, Eliot Graff <eliotgra@microsoft.com>, public-i18n-core@w3.org

On 2010-07-23 16:26, Leif Halvard Silli wrote:
> Proposal: Polyglot Markup should allow the document encoding to be set
> via the encoding attribute of the XML declaration. The XML declaration,
> including the encoding attribute, thus becomes a HTML5 extension,
> whenever polyglot markup is being consumed as HTML. (See my previous
> letter to Sam, about the XML declaration as polyglot markup indicator.)

I object to this because permitting the XML declaration would only serve 
to pollute the document with unnecessary markup, and to mislead authors 
about how the encoding of a file is actually determined.

There have been many observed instances of otherwise useless markup 
being used by misled authors in ways that don't actually do anything. 
Many of these cases have now been made optional or obsolete in HTML5 
because of the wasted effort they were causing, and so introducing new 
markup with no real purpose would not be wise.

The only visible, in-file encoding declaration that should be permitted 
in polyglot documents is <meta charset="UTF-8"/> (or the http-equiv 
alternative).  That is the only one which actually serves to usefully 
declare the encoding for HTML that matches the default for XHTML. 
UTF-16 polyglot documents don't need an explicit declaration like that. 
They can rely on the BOM, or external encoding metadata.
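
To make the precedence described above concrete, here is a minimal Python sketch of how a consumer treating the file as HTML might pick an encoding from in-file signals: the BOM first, then <meta charset>. It is an illustration only, not the HTML5 encoding sniffing algorithm. The function name sniff_encoding, the 1024-byte prescan window, and the simplified regular expression are assumptions made for this example, and the http-equiv form is not handled. The XML declaration is deliberately not consulted, which mirrors the point that it does not determine the encoding when the document is consumed as HTML.

```python
import re
from typing import Optional

# Byte order marks, checked before anything else in this sketch.
_BOMS = (
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xff\xfe", "UTF-16LE"),
    (b"\xfe\xff", "UTF-16BE"),
)

# Simplified pattern for <meta charset="..."> (assumption: the real
# prescan in an HTML parser is more involved than a single regex).
_META_CHARSET = re.compile(
    rb"<meta\s+charset\s*=\s*[\"']?([A-Za-z0-9_\-]+)", re.IGNORECASE
)


def sniff_encoding(data: bytes) -> Optional[str]:
    """Return an encoding label from in-file signals, or None.

    None means the caller would have to fall back on external
    metadata (for example an HTTP Content-Type header) or a default.
    """
    # 1. A BOM identifies UTF-8 and UTF-16 without any markup.
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    # 2. Otherwise look for <meta charset> near the start of the file.
    match = _META_CHARSET.search(data[:1024])
    if match:
        return match.group(1).decode("ascii").upper()
    # 3. An XML declaration's encoding attribute is ignored here.
    return None
```

With this sketch, a document whose only encoding signal is <?xml version="1.0" encoding="ISO-8859-1"?> yields None, while one containing <meta charset="UTF-8"/> yields "UTF-8".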

Lachlan Hunt - Opera Software
Received on Tuesday, 27 July 2010 11:21:38 UTC
