[whatwg] UTF-16 encoding default

Have you checked for a byte order mark (BOM) in the source document?
(See http://unicode.org/faq/utf_bom.html#BOM)
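
For reference, a BOM check amounts to looking at the first two bytes of the response body. The following is only a minimal sketch in Python; the function name sniff_utf16_bom is made up for illustration and is not taken from any browser or spec text.

```python
import codecs

def sniff_utf16_bom(data: bytes):
    """Return 'utf-16-le' or 'utf-16-be' if data starts with a UTF-16 BOM.

    Returns None when no UTF-16 BOM is present. Hypothetical helper for
    illustration only. Note that the UTF-32LE BOM (FF FE 00 00) also starts
    with FF FE; this sketch does not try to tell the two apart.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # bytes FF FE
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):  # bytes FE FF
        return "utf-16-be"
    return None
```

If this returns None for a resource labelled charset=utf-16, the byte order has to come from somewhere else.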

--Oliver

On Jun 23, 2009, at 6:42 PM, Kartikaya Gupta wrote:

> There's a page (specifically
> http://www.microsoft.com/windowsmobile/mobile/en-us/totalaccess/software/software/eula-sw-netflix.mspx)
> that has a Content-Type header of "text/html; charset=utf-16" and has
> no BOM. The references I've seen (RFC 2781, as well as
> http://unicode.org/faq/utf_bom.html#gen7) say that this means the
> content should be assumed to be UTF-16BE. The page, however, is
> actually in UTF-16LE.
>
> All browsers seem to do some sort of unspecified magic and figure  
> out that the page is in LE. I was wondering if that magic could be  
> described and added to the HTML5 spec so that it covers rendering  
> the above page as expected. According to the draft spec as it  
> stands, I believe that page should be rendered as garbage.
>
> Cheers,
> kats
>
> PS - the page also has a meta tag that says the charset is  
> iso-8859-1. *sigh*
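
The "magic" mentioned above is unspecified, so the following is only a guess at one possible heuristic, not a description of what any browser actually does. It assumes the page is mostly ASCII-range markup, in which case the zero bytes of UTF-16LE fall at odd offsets and those of UTF-16BE at even offsets. The function name, the 1024-byte sample size, and the fallback to big-endian (the RFC 2781 default described in the quoted message) are all assumptions made for illustration.

```python
def guess_utf16_endianness(data: bytes, default: str = "utf-16-be") -> str:
    """Guess the byte order of BOM-less UTF-16 by counting zero bytes.

    Hypothetical heuristic for illustration: it only works when most
    characters are in the ASCII range, so one byte of each code unit is 0.
    """
    sample = data[:1024]                  # assumed sample size
    even_zeros = sample[0::2].count(0)    # zero high bytes if big-endian
    odd_zeros = sample[1::2].count(0)     # zero high bytes if little-endian
    if odd_zeros > even_zeros:
        return "utf-16-le"
    if even_zeros > odd_zeros:
        return "utf-16-be"
    return default                        # no evidence either way
```

A sniffer along these lines would classify the page described above as little-endian despite the missing BOM, but it can misfire on text that is mostly outside the ASCII range.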

Received on Tuesday, 23 June 2009 19:03:55 UTC