- From: Henry S. Thompson <ht@inf.ed.ac.uk>
- Date: Wed, 11 Jan 2012 17:51:48 +0000
- To: public-xml-core-wg@w3.org
ht writes:

> 1) It recommends [1] the use of the UTF-8 BOM -- that seems . . . odd
> to me.

OK, I've done some further checking, and we can't have either

  <meta http-equiv="Content-type" content="text/html; charset=utf-8"/>

or

  <meta http-equiv="Content-type" content="application/xhtml+; charset=utf-8"/>

in Polyglot, because the XHTML parser disallows the use of
http-equiv="Content-type" [1].

So net-net I think we should ask for the following as the beginning of
Section 3 of Polyglot [2]:

  Polyglot markup uses the UTF-8 character encoding, the only
  character encoding for which both HTML and XML require support.
  HTML requires UTF-8 to be explicitly declared to avoid fallback to a
  legacy encoding [HTML5]. For XML, UTF-8 is an encoding default. As
  such, character encoding may be left undeclared in XML with the
  result that UTF-8 is still supported [XML10].

  Polyglot markup declares the UTF-8 character encoding in the
  following ways, which may be used separately or in combination:

   * Within the document
     . By using <meta charset="UTF-8"/> (the HTML encoding
       declaration) -- preferred
     . By using the Byte Order Mark (BOM) character.
   * Outside the document
     . . .

ht

[1] http://www.w3.org/TR/2011/WD-html5-20110525/semantics.html#pragma-directives
[2] http://www.w3.org/TR/2011/WD-html-polyglot-20110525/#character-encoding
-- 
       Henry S. Thompson, School of Informatics, University of Edinburgh
      10 Crichton Street, Edinburgh EH8 9AB, SCOTLAND -- (44) 131 650-4440
                Fax: (44) 131 650-4587, e-mail: ht@inf.ed.ac.uk
                       URL: http://www.ltg.ed.ac.uk/~ht/
 [mail from me _always_ has a .sig like this -- mail without it is forged spam]
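
For illustration only, the two in-document declarations proposed above could be checked roughly as follows. This is a minimal sketch, not part of the proposed wording: the helper name utf8_declarations and the regular expression are hypothetical, and the 1024-byte window is an assumption taken from the HTML rule that the encoding declaration must fall within the first 1024 bytes of the document.

```python
import codecs
import re

# Hypothetical pattern for the polyglot form <meta charset="UTF-8"/>:
# quoted attribute value and self-closing tag, matched case-insensitively.
META_CHARSET_RE = re.compile(
    rb'<meta\s+charset\s*=\s*(["\'])utf-8\1\s*/>',
    re.IGNORECASE,
)


def utf8_declarations(data: bytes) -> dict:
    """Report which in-document UTF-8 declarations a byte stream carries.

    Illustrative helper (not from any specification): it looks for a
    leading UTF-8 BOM and for <meta charset="UTF-8"/> near the start.
    """
    has_bom = data.startswith(codecs.BOM_UTF8)
    # Assumption: only the first 1024 bytes are inspected, mirroring the
    # HTML requirement on where the encoding declaration may appear.
    has_meta = META_CHARSET_RE.search(data[:1024]) is not None
    return {"bom": has_bom, "meta": has_meta}


if __name__ == "__main__":
    doc = (
        b'<!DOCTYPE html>'
        b'<html xmlns="http://www.w3.org/1999/xhtml">'
        b'<head><meta charset="UTF-8"/><title>t</title></head>'
        b'<body></body></html>'
    )
    print(utf8_declarations(doc))
    print(utf8_declarations(codecs.BOM_UTF8 + doc))
```

Since the proposal allows the two mechanisms separately or in combination, the sketch reports each one independently rather than returning a single yes/no answer. Declarations made outside the document are not covered.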
Received on Wednesday, 11 January 2012 17:52:13 UTC