- From: Henri Sivonen <hsivonen@iki.fi>
- Date: Sun, 1 Jun 2008 16:45:26 +0300
The HTML5 draft says that authors should not use EBCDIC-based encodings. This is more lax than saying that authors must not use and user agents must not support CESU-8, UTF-7, BOCU-1 and SCSU.

In general, now that UTF-8 exists and is ubiquitously supported, proliferation of encodings is costly and doesn't expand the expressiveness of HTML, which is parsed into a Unicode DOM anyway. Moreover, encodings that are not ASCII supersets are potential security risks, since the string "<script>" may be represented by different bytes than in ASCII, leading to potential privilege escalation if a server-side gatekeeper and a user agent give different meanings to the bytes.

For these reasons, if EBCDIC-based encodings don't need to be supported in order to Support Existing Content, it would be beneficial never to add support for them and, thus, to ban them like CESU-8, UTF-7, BOCU-1 and SCSU.

I asked Hixie for examples of sites or browsers that require/support EBCDIC-based encodings. He had none.

I examined the encoding menus of Firefox 3b5, Safari 3.1 and Opera 9.5 beta (on Leopard) and IE8 beta 1 (on English XP SP3). None of them expose EBCDIC-based encodings in the UI. (All the IBM encodings Firefox exposes turn out to be ASCII-based.)

This makes me wonder: Do the top browsers support any EBCDIC-based encodings, just without exposing them in the UI? If not, can there be any notable EBCDIC-based Web content?

I suspect that EBCDIC isn't actually Web-relevant.

-- 
Henri Sivonen
hsivonen at iki.fi
http://hsivonen.iki.fi/
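The security concern in the message above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: `cp037` (one EBCDIC code page shipped with Python's standard codecs) stands in for "an EBCDIC-based encoding", and `naive_gatekeeper_allows` is a hypothetical server-side filter that scans raw bytes as if they were ASCII. Neither is taken from the HTML5 draft or from any browser.

```python
# Sketch: the same text has different byte representations in ASCII and in an
# EBCDIC code page, so a byte-level filter and a decoder can disagree.

TEXT = "<script>"

ascii_bytes = TEXT.encode("ascii")
ebcdic_bytes = TEXT.encode("cp037")  # assumption: cp037 as a sample EBCDIC encoding


def naive_gatekeeper_allows(payload: bytes) -> bool:
    """Hypothetical filter that assumes the payload is an ASCII superset."""
    return b"<script" not in payload


# The filter blocks the ASCII form but lets the EBCDIC form through.
print("ASCII payload allowed: ", naive_gatekeeper_allows(ascii_bytes))
print("EBCDIC payload allowed:", naive_gatekeeper_allows(ebcdic_bytes))

# A consumer that decodes the payload as EBCDIC still sees the script tag.
print("Decoded as cp037:", ebcdic_bytes.decode("cp037"))
```

The mismatch arises only when the gatekeeper and the consumer disagree about the encoding, which is the situation the message describes.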
Received on Sunday, 1 June 2008 06:45:26 UTC