Re: byte order mark article

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Thu, 22 Nov 2012 03:38:14 +0100
To: John Cowan <cowan@mercury.ccil.org>
Cc: Anne van Kesteren <annevk@annevk.nl>, www-international@w3.org
Message-ID: <20121122033814884911.db7439ab@xn--mlform-iua.no>
John Cowan, Wed, 21 Nov 2012 21:32:35 -0500:
> Leif Halvard Silli scripsit:
> 
>>   (Except that it, per Unicode, defaults to big endian, sorry.)
> 
> Yes.
> 
>> Well, yes. And no. Isn't the BOM part of the UTF-16 encoding? If yes, 
>> then in a way it is more correct to say that it defaults to UTF-16BE. 
> 
> That is self-contradictory.  If a BOM is present, by definition the
> encoding is not UTF-16BE or LE.

Yes. But my "use case" was a 16-bit, big-endian document *without* a 
BOM and whose HTTP Content-Type header said "charset=UTF-16". To say 
that the parser, for such a document, would default to UTF-16BE, seems 
meaningful to me.
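
A minimal sketch of that case in Python, for illustration only. The function name decode_labelled_utf16 is hypothetical, and the big-endian fallback is the Unicode default mentioned above; an actual parser may choose differently.

```python
import codecs
from typing import Tuple


def decode_labelled_utf16(data: bytes) -> Tuple[str, str]:
    """Decode bytes whose label says only "charset=UTF-16".

    Returns the decoded text and a note on how the byte order was chosen.
    Assumption: with no BOM, fall back to big endian (the Unicode default).
    """
    if data.startswith(codecs.BOM_UTF16_BE):
        # BOM FE FF: big endian, BOM is not part of the text
        return data[2:].decode("utf-16-be"), "BOM, big endian"
    if data.startswith(codecs.BOM_UTF16_LE):
        # BOM FF FE: little endian, BOM is not part of the text
        return data[2:].decode("utf-16-le"), "BOM, little endian"
    # No BOM: the label alone does not settle the byte order,
    # so apply the assumed default.
    return data.decode("utf-16-be"), "no BOM, defaulted to big endian"


# The "use case": 16-bit, big-endian, no BOM, labelled UTF-16.
text, how = decode_labelled_utf16("hei".encode("utf-16-be"))
print(text, "/", how)
```

With a BOM-less big-endian input, the sketch takes the last branch, which is the sense in which the parser "defaults to UTF-16BE".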
-- 
leif halvard silli
Received on Thursday, 22 November 2012 02:38:42 GMT