byte order mark article

I saw http://www.w3.org/International/questions/new/qa-byte-order-mark-new
in the minutes.

* This article mentions utf-16 a lot. utf-16 causes a lot of pain, as
it is the only non-ASCII-compatible encoding user agent implementors
have to care about, and there is even talk about maybe trying to get
rid of it completely. Featuring it so prominently therefore seems
unwise. You might want to get Henri's view on this too.

* Are there even non-recent versions of major browsers that do not
handle the byte order mark? How far back do we have to go these days?

* Per my reading of the HTML specification you can use utf-16le and
utf-16be without a BOM. It does not even require one for utf-16,
although I suppose Unicode might (though Unicode does not accurately
describe what implementations do here). So the section "If you
use UTF-16" seems wrong.

* "According to the HTML specification, the HTTP header overrides any
in-document encoding." is no longer true. As I read the current HTML
specification, a byte order mark takes precedence over the HTTP
header.

* "A UTF-8 signature at the beginning of a CSS file can sometimes
cause the initial rules in the file to fail on certain user agents."
citation needed? :-)

* If you really have to mention utf-32, you might also want to point
out that it has been actively removed from implementations, so using
it is unlikely to be productive.
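
To make the points above about BOM handling concrete, here is a
minimal sketch in Python, not a reference implementation. It assumes
that a BOM, when present, wins over both the HTTP header and any
in-document declaration, and that only utf-8 and utf-16 BOMs are
recognised (utf-32 is deliberately left out). The names sniff_bom and
pick_encoding, and the windows-1252 fallback, are hypothetical and
chosen only for illustration.

```python
import codecs

# BOM byte sequences recognised by this sketch. utf-32 is intentionally
# absent, so a utf-32le BOM (FF FE 00 00) would be treated as utf-16le.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8"),         # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16be"),  # FE FF
    (codecs.BOM_UTF16_LE, "utf-16le"),  # FF FE
)


def sniff_bom(data: bytes):
    """Return (encoding, bom_length), or (None, 0) when no BOM is found."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None, 0


def pick_encoding(data: bytes, http_charset=None, meta_charset=None,
                  fallback="windows-1252"):
    """Hypothetical precedence: BOM, then HTTP header, then in-document
    label, then an assumed fallback."""
    bom_encoding, _ = sniff_bom(data)
    return bom_encoding or http_charset or meta_charset or fallback
```

For example, pick_encoding(b"\xef\xbb\xbf<p>", http_charset="iso-8859-1")
returns "utf-8" under these assumptions, because the BOM is checked
before the header value is consulted.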


-- 
http://annevankesteren.nl/

Received on Wednesday, 21 November 2012 21:04:50 UTC