- From: Anne van Kesteren <annevk@opera.com>
- Date: Fri, 30 Dec 2011 11:54:34 +0100
On Fri, 30 Dec 2011 05:51:16 +0100, Leif Halvard Silli <xn--mlform-iua at målform.no> wrote:
> The Trident cache behaviour is a symptom of its overall UTF-16
> behaviour: Apart from reading the BOM, it doesn't do any UTF-16
> sniffing. I suspect that you want Opera/Firefox to become "as bad" at
> 'getting' the UTF-16 encoding as WebKit/IE are? (Note that WebKit is
> worse than IE - just to, once again, emphasize how difficult it is to
> replicate IE.)

How is WebKit worse than IE? And why should there be UTF-16 sniffing?

> But is the little endian defaulting really important?
> Overall, proper UTF-16 treatment (read: sniffing) on IE/WebKit's part
> would probably improve the situation more.

You mean there are sites that only work in Gecko/Presto?

> I know ... And it is precisely therefore that it would have been an
> advantage to, for the Web, focus on *requiring* the BOM for UTF-16.

It seems simpler to focus on promoting only UTF-8.

>> Yeah, I'm going to file a new bug so we can reconsider although the
>> octet sequence the various BOMs represent can have legitimate meanings
>> in certain encodings,
>
> You mean: In addition to the BOM meaning, I suppose.

No. In e.g. windows-1258 there is no BOM and FF FE simply means U+00FF U+20AB.

>> it seems in practice people use them for Unicode.
>> (Helped by the fact that Trident/WebKit behave this way of course.)
>
> Don't forget the fact that Presto/Gecko do not move the BOM into the
> <body> when you use UTF-16LE/BE, like they - per the spec of those
> encodings - should do. See:
> <http://bugzilla.validator.nu/show_bug.cgi?id=890>

Well yes, that's why I'm planning to define utf-16 more in line with implementations (and render the current text obsolete I suppose).

-- 
Anne van Kesteren
http://annevankesteren.nl/
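The windows-1258 point above can be illustrated with a rough sketch. It only shows that the octets FF FE are ordinary characters in windows-1258, and what "the BOM wins" looks like when a BOM check runs before the declared encoding is applied. The helper names `sniff_bom` and `decode_bom_first` are made up for illustration and do not come from any browser or specification. UTF-32 BOMs are deliberately ignored, and Python's `cp1258` codec is assumed to match windows-1258 for these two octets.

```python
import codecs

# BOM octet sequences for the Unicode encodings under discussion.
# (UTF-32 is left out on purpose; its LE BOM also starts with FF FE.)
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),          # EF BB BF
    (codecs.BOM_UTF16_LE, "utf-16-le"),  # FF FE
    (codecs.BOM_UTF16_BE, "utf-16-be"),  # FE FF
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a known BOM, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None


def decode_bom_first(data: bytes, declared: str) -> str:
    """Hypothetical 'BOM takes precedence' decoding: a leading BOM
    overrides the declared encoding and is dropped from the output."""
    hit = sniff_bom(data)
    if hit is not None:
        name, skip = hit
        return data[skip:].decode(name)
    return data.decode(declared)


payload = b"\xff\xfeh\x00i\x00"

# Taken literally as windows-1258, FF FE are two characters, not a BOM.
literal = payload[:2].decode("cp1258")
print([f"U+{ord(c):04X}" for c in literal])

# With the BOM check first, the same octets select UTF-16LE instead.
print(decode_bom_first(payload, "cp1258"))
```

The trade-off is the one raised in the thread: a windows-1258 document that really begins with U+00FF U+20AB would be misread under the BOM-first rule, but in practice such documents appear to be rare compared with Unicode documents carrying a BOM.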
Received on Friday, 30 December 2011 02:54:34 UTC