Re: byte order mark article from Leif Halvard Silli on 2012-11-22 (www-international@w3.org from October to December 2012)

From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Date: Thu, 22 Nov 2012 03:38:14 +0100
To: John Cowan <cowan@mercury.ccil.org>
Cc: Anne van Kesteren <annevk@annevk.nl>, www-international@w3.org
Message-ID: <20121122033814884911.db7439ab@xn--mlform-iua.no>

John Cowan, Wed, 21 Nov 2012 21:32:35 -0500:
> Leif Halvard Silli scripsit:
> 
>>   (Except that it, per Unicode, defaults to big endian, sorry.)
> 
> Yes.
> 
>> Well, yes. And no. Isn't the BOM part of the UTF-16 encoding? If yes, 
>> then in a way it is more correct to say that it defaults to UTF-16BE. 
> 
> That is self-contradictory.  If a BOM is present, by definition the
> encoding is not UTF-16BE or LE.

Yes. But my "use case" was a 16-bit, big-endian document *without* a 
BOM, served with an HTTP Content-Type header that said 
"charset=UTF-16". To say that the parser, for such a document, would 
default to UTF-16BE seems meaningful to me.
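
A minimal sketch of that use case, in Python. The function name 
decode_labelled_utf16 is hypothetical, and the fallback to big endian 
is the Unicode default discussed above, not a claim about what any 
particular browser or parser actually does:

```python
import codecs
from typing import Tuple


def decode_labelled_utf16(data: bytes) -> Tuple[str, str]:
    """Decode bytes that were labelled charset=UTF-16.

    Returns the decoded text and a note on how the byte order was chosen.
    Hypothetical helper for illustration only.
    """
    # A leading BOM settles the byte order and is not part of the text.
    if data.startswith(codecs.BOM_UTF16_BE):
        return data[2:].decode("utf-16-be"), "BOM: big endian"
    if data.startswith(codecs.BOM_UTF16_LE):
        return data[2:].decode("utf-16-le"), "BOM: little endian"
    # No BOM: assume big endian, i.e. treat the bytes as if they were UTF-16BE.
    return data.decode("utf-16-be"), "default: big endian"


# The use case: big-endian bytes, no BOM, label says UTF-16.
text, how = decode_labelled_utf16("hi".encode("utf-16-be"))
print(text, "|", how)
```

The byte order is chosen explicitly here because Python's own "utf-16" 
codec falls back to the platform's native byte order when no BOM is 
present, which is not the behaviour being discussed.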
-- 
leif halvard silli

Received on Thursday, 22 November 2012 02:38:42 UTC