
Re: UTF-16, UTF-16BE and UTF-16LE in HTML5

From: Simon Pieters <simonp@opera.com>
Date: Thu, 02 Sep 2010 17:40:31 +0200
To: public-html@w3.org, www-international@w3.org, "Richard Ishida" <ishida@w3.org>
Message-ID: <op.viewhtlzidj3kv@simon-pieterss-macbook.local>
On Mon, 26 Jul 2010 20:52:09 +0200, Richard Ishida <ishida@w3.org> wrote:

> HTML5 says:
> "If an HTML document does not start with a BOM, and if its encoding is not
> explicitly given by Content-Type metadata, and the document is not an iframe
> srcdoc document, then the character encoding used must be an
> ASCII-compatible character encoding..."
> http://dev.w3.org/html5/spec/semantics.html#charset
> This rules out the use of UTF-16BE and UTF-16LE character encodings, since
> they should not start with a BOM.
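
Read literally, the quoted rule is a conditional: the ASCII-compatibility requirement only applies when there is no BOM, no encoding given by Content-Type, and the document is not an iframe srcdoc document. Below is a minimal Python sketch of that condition. The function names and the set of non-ASCII-compatible labels are hypothetical and deliberately incomplete, and the sketch covers only this one rule, not any separate discouragement or ban of particular encodings.

```python
import codecs

# Illustrative, non-exhaustive set of labels that are not ASCII-compatible.
# (Assumption for this sketch; this is not a list taken from the spec.)
NOT_ASCII_COMPATIBLE = {
    "utf-16", "utf-16be", "utf-16le",
    "utf-32", "utf-32be", "utf-32le",
}

# Byte order marks checked by this sketch.
BOMS = (codecs.BOM_UTF8, codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)


def ascii_compatible_required(data, content_type_charset, is_srcdoc=False):
    """True when all three preconditions of the quoted rule hold."""
    starts_with_bom = data.startswith(BOMS)
    return (
        not starts_with_bom
        and content_type_charset is None
        and not is_srcdoc
    )


def encoding_permitted(encoding, data, content_type_charset, is_srcdoc=False):
    """Check a document's actual encoding against the quoted rule only."""
    if not ascii_compatible_required(data, content_type_charset, is_srcdoc):
        # A precondition fails, so this rule imposes no constraint.
        return True
    return encoding.lower() not in NOT_ASCII_COMPATIBLE


# BOM-less UTF-16LE bytes, as the UTF-16LE label implies.
body = "<p>hi</p>".encode("utf-16-le")

# No BOM and no Content-Type charset: the rule excludes UTF-16LE.
print(encoding_permitted("UTF-16LE", body, None))

# Same bytes, but the encoding is given by Content-Type: the rule does not apply.
print(encoding_permitted("UTF-16LE", body, "utf-16le"))
```

The second call is the case where the encoding is supplied out of band: once Content-Type names the encoding, the rule's precondition fails and it places no restriction on UTF-16LE.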

My reading is that UTF-16BE and UTF-16LE aren't ruled out, just that they
need to be specified with Content-Type. They could be discouraged (like
UTF-32) or banned (like UTF-7), but I don't have much of an opinion on the

Simon Pieters
Opera Software
Received on Thursday, 2 September 2010 15:41:09 UTC
