[Bug 21818] XHTML5: Permit <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> from bugzilla@jessica.w3.org on 2013-05-14 (public-html-bugzilla@w3.org from May 2013)

From: <bugzilla@jessica.w3.org>
Date: Tue, 14 May 2013 14:03:13 +0000
To: public-html-bugzilla@w3.org
Message-ID: <bug-21818-2486-7KZls7cyWf@http.www.w3.org/Bugs/Public/>

https://www.w3.org/Bugs/Public/show_bug.cgi?id=21818

Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xn--mlform-iua@xn--mlform-i
                   |                            |ua.no

--- Comment #3 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> ---
(In reply to comment #2)
> (In reply to comment #0)
> 
> > JUSTIFICATION: This encoding declaration is more robust than the
> > meta@charset declaration. 
> 
> [citation needed]

First, the primary justification is of course the same justification that HTML5
gives for allowing <meta charset="UTF-8"/> in XHTML5:
   ]] to facilitate migration to and from XHTML.[[
  
http://www.w3.org/html/wg/drafts/html/master/document-metadata.html#attr-meta-charset

That justification is, in turn, related to what HTML5 says about
http-equiv=Content-Type and meta@charset being equivalent features:
   ]] The Encoding declaration state is just an alternative form of setting
      the charset attribute: it is a character encoding declaration. [[
  
http://www.w3.org/html/wg/drafts/html/master/document-metadata.html#attr-meta-http-equiv-content-type

Since one is just an alternative form of the other, it makes little sense to
have different usage rules for the two of them.
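
To illustrate the equivalence, here is a minimal Python sketch of a consumer
that accepts both forms and arrives at the same encoding. The names
MetaEncodingFinder and declared_encoding are hypothetical, and the matching is
far simpler than the encoding prescan that HTML5 actually specifies; it is only
meant to show that the two declarations carry the same information.

```python
import re
from html.parser import HTMLParser


class MetaEncodingFinder(HTMLParser):
    """Hypothetical helper: records the first encoding declared via <meta>."""

    def __init__(self):
        super().__init__()
        self.encoding = None

    def handle_starttag(self, tag, attrs):
        # <meta .../> arrives here too: the default handle_startendtag()
        # forwards to handle_starttag().
        if tag != "meta" or self.encoding is not None:
            return
        attr = {name: (value or "") for name, value in attrs}
        if "charset" in attr:
            # Short form: <meta charset="UTF-8"/>
            self.encoding = attr["charset"].strip().lower()
        elif attr.get("http-equiv", "").strip().lower() == "content-type":
            # Long form: content="text/html; charset=UTF-8"
            match = re.search(r"charset\s*=\s*[\"']?([^\s;\"']+)",
                              attr.get("content", ""), re.IGNORECASE)
            if match:
                self.encoding = match.group(1).lower()


def declared_encoding(markup):
    """Return the lowercased encoding label found in markup, or None."""
    finder = MetaEncodingFinder()
    finder.feed(markup)
    finder.close()
    return finder.encoding


short_form = '<meta charset="UTF-8"/>'
long_form = '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>'
print(declared_encoding(short_form) == declared_encoding(long_form))
```

A consumer that only implements the second branch is, roughly, the kind of
legacy implementation discussed below.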


As for the additional justification, that this declaration is "more robust
than the meta@charset declaration", let me cite the initial comment in bug 21174:

> 4. However, fact is that in some implementation segments, the @charset
> variant is not supported. For instance OpenOffice, on last check, did not
> support <meta charset="UTF-8"/>. Thus, if the authors wants to support such
> implementations, he/she has to not conformin to the polyglot spec

  (The OpenOffice importer also failed to understand the BOM,
   and does not understand HTTP.)

   Btw: Google Docs, when saving via 'download as HTML', skips the encoding
declaration and instead settles for character entities for non-ASCII content,
*maybe* because they wish to increase compatibility with consumers such as
OpenOffice (since OpenOffice does support character entities but doesn't support
<meta charset="FOO"/>).
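
The fallback of using character references instead of an encoding declaration
can be sketched in Python as follows. This is only an illustration of the
technique (numeric character references via the standard "xmlcharrefreplace"
error handler), not a claim about how Google Docs implements it; the function
name to_ascii_markup and the sample word are made up for the example.

```python
import html


def to_ascii_markup(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


sample = "blåbærsyltetøy"
ascii_only = to_ascii_markup(sample)
print(ascii_only)

# The output is pure ASCII, so a consumer gets the right characters back
# whichever encoding it guesses, as long as the guess is ASCII-compatible.
assert html.unescape(ascii_only) == sample
```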

Other examples:

1) The XHTML5/HTML5 compatible WYSIWYG editor Freeway Pro 6.0.8 (as of 2013)
has (despite my bug report, btw) an import engine that doesn't support the BOM
or <meta charset/>.

2) The HTML parser of libxml2 before version 2.8 does not understand <meta
charset="UTF-8"/>.
   (In 2.8, the charset declaration seems to work, in that characters are
rendered as character entities, but the http-equiv variant seems to work
*better*, as characters are then rendered as is.)

Other justifications:

3) There are UTF-8 capable (legacy) authoring tools that can insert (obsolete,
but conforming) DOCTYPEs *and* <meta http-equiv>, but which cannot insert <meta
charset>. One such tool is Amaya (which e.g. my wife *insists* on using) -
http://www.w3.org/Amaya/

4) Sam said,
http://intertwingly.net/blog/2012/11/09/In-defence-of-Polyglot#c1352476295 
   "For example, I not only always use utf-8, but I also always 
    declare such BOTH in a meta tag AND in the content type."
   My *guess* is that *one* reason for Sam’s habit is the lack
of 100% support for <meta charset="UTF-8"/>.
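
For what it is worth, the "declare it in both places" habit can be sketched
like this in Python. The function build_response is hypothetical, and the
choice of the http-equiv form for the in-document declaration is my assumption;
Sam's comment does not say which meta form he uses.

```python
import html


def build_response(title):
    """Return (headers, body) declaring UTF-8 both in HTTP and in the markup."""
    markup = (
        '<!DOCTYPE html>\n'
        '<html xmlns="http://www.w3.org/1999/xhtml">\n'
        '<head>\n'
        # In-document declaration, for consumers that never see HTTP headers.
        '<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>\n'
        f'<title>{html.escape(title)}</title>\n'
        '</head>\n'
        '<body></body>\n'
        '</html>\n'
    )
    body = markup.encode("utf-8")
    headers = {
        # Transport-level declaration, for consumers that do see HTTP headers.
        "Content-Type": "text/html; charset=UTF-8",
        "Content-Length": str(len(body)),
    }
    return headers, body


headers, body = build_response("Blåbær")
print(headers["Content-Type"])
```

A file saved to disk and opened in an importer keeps only the first of the two
declarations, which is why the in-document form matters for such tools.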


Received on Tuesday, 14 May 2013 14:03:28 UTC