- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Thu, 22 Nov 2012 23:34:55 +0100
- To: Richard Ishida <ishida@w3.org>
- Cc: Anne van Kesteren <annevk@annevk.nl>, www-international@w3.org
Richard Ishida, Thu, 22 Nov 2012 18:00:02 +0000:
> On 21/11/2012 21:04, Anne van Kesteren wrote:
>> * Are there even non-recent versions of major browsers that do not
>> handle the byte order mark? How far back do we have to go these days?
>>
>> * Per my reading of the HTML specification you can use utf-16le and
>> utf-16be without a BOM.
>
> Actually, RFC2781 says that you MUST NOT use a BOM with content
> labelled as utf-16le/utf-16be. (Of course, as mentioned in the side
> note in the article, this is about labeling rather than the sequence
> of bytes at the start of the file.)
>
> It does not even require it for utf-16,

Right:[1]

]] MUST label the text as "UTF-16", and SHOULD make sure the text
starts with 0xFEFF.[[

The perceived requirement to use the BOM might stem from XML: [1]

]] An exception to the "SHOULD" rule of using "UTF-16BE" or "UTF-16LE"
would occur with document formats that mandate a BOM in UTF-16 text,
thereby requiring the use of the "UTF-16" tag only.[[

> This was news to me. I believe HTML5 did the last time I looked. I've
> made several changes to reflect this.

And then the Encoding Standard confusingly says:[2]

"In violation of the Unicode standard, "utf-16" is a label for
utf-16le rather than its own standalone encoding."

Which gives the impression that the Encoding Standard *only* switches
the labels. However, in the Encoding Standard, the terms
'UTF-16LE'/'UTF-16BE' also cover the use of the BOM - which the
'UTF-16LE'/'UTF-16BE' labels per "the other standards" do not cover.

I would suggest to Anne that the Encoding Standard should make clear
that its terms 'UTF-16LE'/'UTF-16BE' are in violation of the existing
definitions of those labels.

[1] http://tools.ietf.org/html/rfc2781#section-3.3
[2] http://encoding.spec.whatwg.org/#utf-16le
--
leif halvard silli
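
To make the distinction between the labels concrete, here is a minimal sketch in Python, whose built-in codecs happen to treat the "utf-16" and "utf-16-le" labels differently with respect to a leading BOM. It only illustrates the difference discussed above and is not taken from any of the cited specifications. The helper `decode_bom_first` and its `fallback` parameter are hypothetical names, and the BOM-first behaviour it shows is only an approximation of what the message attributes to the Encoding Standard.

```python
import codecs

# "A" in little-endian UTF-16, preceded by a BOM (bytes FF FE).
data = codecs.BOM_UTF16_LE + "A".encode("utf-16-le")

# Label "utf-16": the codec consumes the BOM and uses it to pick the byte order.
as_utf16 = data.decode("utf-16")

# Label "utf-16-le": the codec does not treat FF FE as a signature,
# so it is decoded as the character U+FEFF and stays in the output.
as_utf16le = data.decode("utf-16-le")


def decode_bom_first(raw: bytes, fallback: str = "utf-16-le") -> str:
    """Hypothetical helper: a BOM, if present, overrides the label and is dropped."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        return raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # No BOM: fall back to whatever the label says (assumed here, not sniffed).
    return raw.decode(fallback)
```

With the "utf-16-le" label the BOM survives as a character in the decoded text, whereas the BOM-first helper strips it regardless of the label. That gap is the reason reusing the same label names for both behaviours is confusing.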
Received on Thursday, 22 November 2012 22:35:28 UTC