Re: Is the P-word? (Was: TAG Decision on Rescinding the request to the HTML WG to develop a polyglot guide)

On Tue, Jan 22, 2013 at 4:15 PM, Leif Halvard Silli
<xn--mlform-iua@xn--mlform-iua.no> wrote:
> Henri Sivonen, Tue, 22 Jan 2013 10:55:53 +0200:
>
>> Please don't support outputting encodings other than UTF-8.
>
> Polyglot.
>
>> Either way, XML processors are required to support UTF-8 and UTF-16.
>> Support for other encodings is optional. In other words, other
>> encodings are not guaranteed to work.
>
> Only Polyglot - and neither HTML5 nor XML - limits the encoding to UTF-8.
>
>>>   - xhtml 1 or 1.1
>>>   - html5, xml serialization, not polyglot
>>>   - html5, xml serialization, polyglot
>>>
>>> I can make the difference between the first and the two last ones based
>>> on the doctype and friends. I am unable to make any difference between
>>> the two last ones.
>>
>> Don't support polyglot. Problem solved.
>
> Don't support non-polyglot. Problem solved. (Don't change your message
> because of the label.)

What is the reason that
<http://dev.w3.org/html5/html-xhtml-author-guide/#content-type> says

<blockquote>
The HTTP Content-Type: header has no extra rules or restrictions,
whereas polyglot markup does not use the http-equiv="Content-Type"
declaration on the meta element.
</blockquote>

?

As I read HTML5 and prior specs, @http-equiv='Content-Type' doesn't
have much meaning other than to (maybe) declare the character encoding
of the document. The TAG says
<http://www.w3.org/2001/tag/doc/mime-respect#intro>:

<blockquote>
Metadata received in an encapsulating container, such as the metadata
within the header fields of a message that describe the data enclosed
within that message, is authoritative in defining the nature of the
data received.
</blockquote>

See also <http://www.w3.org/2001/tag/doc/mime-respect#embedded>.

This suggests to me that putting something like

<meta http-equiv="Content-Type" content="application/xhtml+xml" />

is a potential way to indicate to text/html consumers that this
representation is also parseable by an XML parser and interpretable by
an XHTML renderer.

Is this ill-advised for some reason? Is there a pitfall here of which
I am ignorant?

It would be nice to embed useful metadata indicating that the present
representation is intended to have identical semantics under different
media types' interpretations. This would give multi-modal consumers a
means to leverage both HTML and XML processing on the document if so
instructed.
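
To make the idea concrete, here is a minimal sketch of what such a
consumer might do, using only the Python standard library. Treating the
meta declaration as an "also XML-parseable" hint is my assumption, not
behavior defined by HTML5 or the polyglot guide, and the names
MetaContentTypeFinder, declared_media_types and parse_as_xml_if_declared
are hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class MetaContentTypeFinder(HTMLParser):
    """Collect the content value of each <meta http-equiv="Content-Type">."""

    def __init__(self):
        super().__init__()
        self.declared = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names; self-closing
        # tags such as <meta ... /> are routed here by default as well.
        attr = dict(attrs)
        http_equiv = (attr.get("http-equiv") or "").lower()
        if tag == "meta" and http_equiv == "content-type":
            self.declared.append((attr.get("content") or "").strip())


def declared_media_types(markup):
    """Return the media types named in meta declarations, without parameters."""
    finder = MetaContentTypeFinder()
    finder.feed(markup)
    finder.close()
    return [value.split(";")[0].strip().lower() for value in finder.declared]


def parse_as_xml_if_declared(markup):
    """Assumed convention: try XML processing only when the document
    itself declares application/xhtml+xml. Returns the root element,
    or None if there is no such hint or the markup is not well-formed."""
    if "application/xhtml+xml" not in declared_media_types(markup):
        return None
    try:
        return ET.fromstring(markup)
    except ET.ParseError:
        return None
```

A consumer that got None back would simply carry on with its ordinary
text/html processing, so the hint would only ever add an option rather
than override the Content-Type sent over HTTP.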

Thoughts?

David

Received on Wednesday, 23 January 2013 05:18:54 UTC