- From: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- Date: Fri, 25 Jan 2008 15:35:23 +0100
- To: <public-html-comments@w3.org>
Hi,

the chapter about "acceptable" charsets (8.2.2.2) is messy. Clearly UTF-8 and windows-1252 are popular, and you have that. What you need as a "minimum" for new browsers is UTF-8, US-ASCII (as popular proper subset of UTF-8), ISO-8859-1 (as HTML legacy), and windows-1252 for the reasons stated in the draft. Supporting Latin-1 but not windows-1252 would be stupid.

BTW, I'm not aware that windows-1252 is a violation of CHARMOD, I asked a question about it and C049 in a Last Call of CHARMOD.

Please s/but may support more/but should support more/ - the minimum is only that, the minimum.

| User agents must not support the CESU-8, UTF-7, BOCU-1 and SCSU
| encodings

I can see a MUST NOT for UTF-7 and CESU-8. And IMO the only good excuse for legacy charsets is backwards compatibility. But that is at worst a "SHOULD NOT" for BOCU-1, as you have it for UTF-32. I refuse to discuss SCSU, but MUST NOT is rather harsh, isn't it?

In 3.7.5.4 you say:

| Authors should not use JIS_X0212-1990, x-JIS0208, and encodings
| based on EBCDIC. Authors should not use UTF-32.

What's the logic behind these recommendations? Of course EBCDIC is rare (as far as HTML is concerned I've never seen it), but it's AFAIK not worse than codepage 437, 850, 858, or similar charsets.

And UTF-32 is relatively harmless, not much worse than UTF-16, it belongs to the charsets recommended in CHARMOD. Depending on what happens in future Unicode versions banning UTF-32 could backfire.

There are lots of other charsets starting with UTF-1 that could be listed as SHOULD NOT or even MUST NOT. Whatever you pick, state what your reasons are, not only the (apparently) arbitrary result.

Please make sure that all *unregistered* charsets are SHOULD NOT. Yes, I know the consequences for some proprietary charsets, they are free to register them or to be ignored (CHARMOD C022).

Frank
Received on Friday, 25 January 2008 14:35:20 UTC