
Re: Is the P-word? (Was: TAG Decision on Rescinding the request to the HTML WG to develop a polyglot guide)

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Wed, 23 Jan 2013 10:11:49 +0100
To: David Sheets <kosmo.zb@gmail.com>
Cc: Henri Sivonen <hsivonen@iki.fi>, Daniel Glazman <daniel@glazman.org>, Sam Ruby <rubys@intertwingly.net>, Noah Mendelsohn <nrm@arcanedomain.com>, "www-tag@w3.org List" <www-tag@w3.org>, "public-html@w3.org" <public-html@w3.org>
Message-ID: <20130123101149251672.0cd0fbe3@xn--mlform-iua.no>
David Sheets, Tue, 22 Jan 2013 21:18:00 -0800:

> What is the reason that
> <http://dev.w3.org/html5/html-xhtml-author-guide/#content-type> says
> 
> <blockquote>
> The HTTP Content-Type: header has no extra rules or restrictions,
> whereas polyglot markup does not use the http-equiv="Content-Type"
> declaration on the meta element.
> </blockquote>

The Polyglot Markup spec limits itself to defining a subset of the HTML5 
spec. HTML5 permits meta@charset=UTF-8 in both XHTML code and HTML 
code, whereas it permits meta@http-equiv only in HTML code.
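
To illustrate the point about meta@charset working in both syntaxes, here 
is a minimal sketch using only the Python standard library. The sample 
document and the helper names (POLYGLOT, MetaCollector, html_meta_attrs, 
xml_meta_attrs) are made up for this illustration, and html.parser is a 
lenient tokenizer rather than a conforming HTML5 parser, so this shows 
only that the same bytes yield the same meta attributes on both paths:

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical sample document, written to be acceptable as both HTML and XML.
POLYGLOT = (
    '<!DOCTYPE html>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">\n'
    '<head><meta charset="UTF-8"/><title>Polyglot</title></head>\n'
    '<body><p>Hello</p></body>\n'
    '</html>\n'
)

XHTML_NS = "{http://www.w3.org/1999/xhtml}"


class MetaCollector(HTMLParser):
    """Collects the attributes of every meta element seen by the tokenizer."""

    def __init__(self):
        super().__init__()
        self.metas = []

    def handle_starttag(self, tag, attrs):
        # Also reached for <meta .../>: the default handle_startendtag
        # calls handle_starttag followed by handle_endtag.
        if tag == "meta":
            self.metas.append(dict(attrs))


def html_meta_attrs(source):
    """Read the document leniently, the way a text/html consumer might."""
    collector = MetaCollector()
    collector.feed(source)
    collector.close()
    return collector.metas


def xml_meta_attrs(source):
    """Read the document with an XML parser; raises ParseError if ill-formed."""
    root = ET.fromstring(source)
    return [dict(m.attrib) for m in root.iter(XHTML_NS + "meta")]
```

Calling html_meta_attrs(POLYGLOT) and xml_meta_attrs(POLYGLOT) should give 
the same single charset declaration on both paths.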

> This suggests to me that putting something like
> 
> <meta http-equiv="Content-Type" content="application/xhtml+xml" />

A case could be made for allowing 'text/html;charset=UTF-8' in XHTML5, 
since meta@charset has somewhat limited support outside the GUI browser 
world. For instance, Microsoft Word and Open Office don't support 
<meta charset="UTF-8"/>, which, I have to admit, feels like a pain in 
polyglot’s robustness principle ass. ;-) But then again: if you 
export/download a Google Docs document (from Google Drive) as HTML, you 
will find that it contains no encoding declaration (and no DOCTYPE, for 
that matter); all the non-ASCII is converted to numeric character 
references.

> is a potential way to indicate to text/html consumers that this
> representation is also parseable by an XML parser and interpretable by
> an XHTML renderer.
> 
> Is this ill-advised for some reason? Is there a pitfall here of which
> I am ignorant?
> 
> It would be nice to embed useful metadata indicating that the present
> representation is intended to have identical semantics under different
> media types' interpretations. This would give multi-modal consumers a
> means to leverage both HTML and XML processing on the document if so
> instructed.

If you meant that one could include two meta-based encoding declaration 
elements in the same document, then HTML5 forbids that as well.
http://www.w3.org/html/wg/drafts/html/master/document-metadata.html#charset
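
As a rough illustration of that restriction, the sketch below counts the 
meta-based encoding declarations in a document. It assumes that any meta 
element with a charset attribute, or with http-equiv="Content-Type" 
(compared case-insensitively), counts as an encoding declaration. The 
names EncodingDeclarationCounter and count_encoding_declarations are 
hypothetical, and this is not a conformance checker:

```python
from html.parser import HTMLParser


class EncodingDeclarationCounter(HTMLParser):
    """Counts meta elements that act as encoding declarations."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        http_equiv = (attributes.get("http-equiv") or "").strip().lower()
        # Assumption: charset, or http-equiv=Content-Type, marks a declaration.
        if "charset" in attributes or http_equiv == "content-type":
            self.count += 1


def count_encoding_declarations(source):
    """Return how many meta-based encoding declarations the source contains."""
    counter = EncodingDeclarationCounter()
    counter.feed(source)
    counter.close()
    return counter.count
```

A document that combines the http-equiv element quoted above with 
<meta charset="UTF-8"/> would give a count of 2, which is the situation 
the linked section rules out.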

-- 
leif halvard silli
Received on Wednesday, 23 January 2013 09:12:20 GMT
