Re: Other possible issues with the Polyglot draft

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Wed, 11 Jan 2012 17:51:48 +0000
To: public-xml-core-wg@w3.org
Message-ID: <f5bboqarutn.fsf@calexico.inf.ed.ac.uk>
ht writes:

> 1) It recommends [1] the use of the UTF-8 BOM -- that seems . . . odd
>    to me.

OK, I've done some further checking, and we can't have either

 <meta http-equiv="Content-type" content="text/html; charset=utf-8"/>
or
 <meta http-equiv="Content-type" content="application/xhtml+xml; charset=utf-8"/>

in Polyglot, because HTML5 does not allow http-equiv="Content-type"
in XML (XHTML) documents [1].

So net-net I think we should ask for the following as the beginning of
Section 3 of Polyglot [2]:

   Polyglot markup uses the UTF-8 character encoding, the only character
   encoding for which both HTML and XML require support. HTML requires
   UTF-8 to be explicitly declared to avoid fallback to a legacy encoding
   [HTML5]. For XML, UTF-8 is an encoding default. As such, character
   encoding may be left undeclared in XML with the result that UTF-8 is
   still supported [XML10].

   Polyglot markup declares the UTF-8 character encoding in the following
   ways, which may be used separately or in combination:

   * Within the document
     . By using <meta charset="UTF-8"/> (the HTML encoding
        declaration) -- preferred
     . By using the Byte Order Mark (BOM) character.

   * Outside the document
     . . .
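
To make the two in-document options concrete, here is a rough Python
sketch (not part of the proposed text). It builds a tiny document,
optionally prefixes the UTF-8 BOM, and reports which of the two
declarations are present. The sample markup and the names
encode_polyglot and declares_utf8 are made up for illustration, and
the check only looks at well-formed XML input.

```python
import codecs
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical minimal document; the markup is illustrative only.
MARKUP = (
    '<!DOCTYPE html>\n'
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head><meta charset="UTF-8"/><title>t</title></head>'
    '<body><p>caf\u00e9</p></body></html>'
)


def encode_polyglot(markup: str, with_bom: bool = True) -> bytes:
    """Encode as UTF-8, optionally prefixed with the UTF-8 BOM."""
    data = markup.encode("utf-8")
    return codecs.BOM_UTF8 + data if with_bom else data


def declares_utf8(data: bytes) -> dict:
    """Report which in-document UTF-8 declarations are present."""
    has_bom = data.startswith(codecs.BOM_UTF8)
    # "utf-8-sig" decodes UTF-8 and drops a leading BOM if there is one.
    text = data.decode("utf-8-sig")
    root = ET.fromstring(text)  # raises ParseError if not well-formed XML
    has_meta = any(
        (m.get("charset") or "").lower() == "utf-8"
        for m in root.iter("{%s}meta" % XHTML_NS)
    )
    return {"bom": has_bom, "meta_charset": has_meta}


if __name__ == "__main__":
    print(declares_utf8(encode_polyglot(MARKUP)))
    print(declares_utf8(encode_polyglot(MARKUP, with_bom=False)))
```

Either declaration alone, or both together, matches the "separately or
in combination" wording above.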

ht

[1] http://www.w3.org/TR/2011/WD-html5-20110525/semantics.html#pragma-directives
[2] http://www.w3.org/TR/2011/WD-html-polyglot-20110525/#character-encoding
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]
Received on Wednesday, 11 January 2012 17:52:13 UTC
