For robustness, allow <meta http-equiv="Content-Type" … UTF-8"/> in XHTML5. #polyglot

Bug 21174[1] against Polyglot Markup depends on bug 21818[2] against 
HTML5 proper. Would it be possible for the HTML5 spec’s editors to take 
a look at the latter bug in the near future? In my view, it should be 
simple to solve.

The issue in bug 21818 is that while <meta charset="FOO"/> is 
permitted in XHTML (on the condition that the value of @charset is 
'UTF-8'), the equivalent pragma directive in the Encoding declaration 
state (on the condition that the value of @content is 
'text/html; charset=UTF-8') is not permitted.
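
To make the comparison concrete, here are the two declarations side by 
side. The second is spelled out with the @content value given above; 
the comments reflect my reading of the current spec text:

```html
<!-- Permitted in XHTML5 today, provided the value is UTF-8 -->
<meta charset="UTF-8"/>

<!-- The equivalent pragma directive; currently not permitted in XHTML5 -->
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
```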

Given that HTML5 says that the pragma directive in the Encoding 
declaration state is, quote: "just an alternative form of setting the 
charset attribute",[3] there appears to be no reason not to allow it in 
XHTML, as long as it sets the UTF-8 encoding. Given what HTML5 says, 
there should be no need to pay any attention to the string 
'text/html;'. And XML parsers do not look at this element anyway, 
neither when they determine the encoding nor when they determine the 
MIME type. Both elements, <meta charset="UTF-8"/> and the equivalent 
meta@http-equiv variant, are pure text/html features.

Our wish to add this variant of meta@http-equiv to Polyglot Markup is 
part of Polyglot’s increased attention to the robustness principle. 
The use case for allowing it is that there are several HTML consumers 
out there (HTML import services found in everything from HTML 
generators to Office applications) which understand <meta 
http-equiv="Content-Type" … UTF-8" /> without understanding any of 
<meta charset="UTF-8"/>, the BOM or XML (and which are outside the 
reach of HTTP). Occasionally there are also buggy HTML browsers (text 
browsers at least). One should not have to break out of the polyglot 
profile in order to serve this class of consumers.
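
As a rough sketch of what the requested change would allow, a polyglot 
document serving these consumers could look like the following. The 
title, language and body text are placeholders of mine, and the 
document is not conforming XHTML5 unless bug 21818 is resolved as 
proposed:

```html
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
  <head>
    <!-- The only encoding declaration in the document -->
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
    <title>Placeholder title</title>
  </head>
  <body>
    <p>Placeholder content.</p>
  </body>
</html>
```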

We could add this to Polyglot without a change to HTML5 first. However, 
we would like to keep the principle that we have had since the start: 
that Polyglot is a profile that is conforming both as XHTML5 and as 
HTML5. Hence we need HTML5 to add it first.




leif halvard silli

Received on Thursday, 25 April 2013 02:38:02 UTC