- From: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- Date: Sun, 27 Jan 2008 15:10:40 +0100
- To: "Henri Sivonen" <hsivonen@iki.fi>
- Cc: <public-html-comments@w3.org>
Henri Sivonen wrote:

>> UTF-8 is significantly less compact than SCSU/BOCU for most
>> peoples' native languages.

> For such arguments, gzip should always be considered and the
> compatibility benefits of UTF-8 + gzip be appreciated.

+1. For details see http://unicode.org/notes/tn14/

Similarly, UTF-8 is less compact than "UTF-4" for all languages
roughly covered by Latin-1 (excl. C1 controls), and arguably "UTF-4"
could be considered "better than windows-1252".

But that's beside the point for XHTML, where I can simply use Latin-1
or windows-1252 and get any other code point as an NCR; many browsers
support this. Some very old browsers insist on decimal NCRs, and how
far their fonts support other code points is a different question,
but to some degree it works. "UTF-4" would work nowhere today, and if
it's ever published formally it would come with a MUST NOT for XML.

Which brings us back to the MUST NOT about BOCU-1 and SCSU: the HTML5
spec needs compelling reasons for a MUST NOT, and also for a SHOULD
NOT. I understand SHOULD NOT as "you need a very good excuse to ignore
it"; one standard good excuse is "implemented before the SHOULD NOT".
If there are other good excuses, the spec has to say what they could
be, otherwise folks could claim that "I want it" is a good enough
excuse.

Frank
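To illustrate the Latin-1 / windows-1252 plus NCR approach mentioned above, here is a minimal sketch in Python. The helper name `to_legacy_with_ncrs` and the sample string are made up for illustration; the only library feature relied on is the standard `xmlcharrefreplace` error handler, which happens to emit decimal NCRs. It does not escape markup characters such as `&` or `<`, so it is not a complete XHTML serializer.

```python
def to_legacy_with_ncrs(text: str, encoding: str = "latin-1") -> bytes:
    """Encode text in a legacy charset (hypothetical helper).

    Characters the charset cannot represent are written as decimal
    numeric character references, e.g. U+20AC becomes "&#8364;".
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


sample = "caf\u00e9 costs 3 \u20ac"  # "café costs 3 €"

# Latin-1 has no euro sign, so it falls back to an NCR.
print(to_legacy_with_ncrs(sample, "latin-1"))  # b'caf\xe9 costs 3 &#8364;'

# windows-1252 encodes the euro sign directly as byte 0x80.
print(to_legacy_with_ncrs(sample, "cp1252"))   # b'caf\xe9 costs 3 \x80'
```

Whether a given browser renders the referenced code point still depends on its fonts, as noted above.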
Received on Sunday, 27 January 2008 14:10:32 UTC