- From: Ian Hickson <ian@hixie.ch>
- Date: Wed, 18 Feb 2004 02:51:20 +0000 (UTC)
- To: Ernest Cline <ernestcline@mindspring.com>
- Cc: Bert Bos <bert@w3.org>, www-style@w3.org
On Tue, 17 Feb 2004, Ernest Cline wrote:
>
> What about CESU-8 (from UTF#26)? It shares the same BOM as UTF-8,
> so only the HTTP header or the @charset rule can distinguish them.
> (UTF#26 explicitly bars attempting to determine that the encoding is
> CESU-8 by auto detection.)

Oops, forgot about CESU-8. However, given CESU-8's extremely low status, and the strong wording in its specification against it being used for information exchange, I feel it is of little more than academic concern.

If the stylesheet said:

   [UTF-8 BOM]@charset "CESU-8";

...then the case is unambiguous (it's CESU-8).

If there is no way to distinguish between CESU-8 and UTF-8 in a particular document (likely for many UTF-8 cases, I guess) then the algorithm falls to step 6, "UA-dependent mechanisms", and compliant UAs would then default to UTF-8 (since they aren't allowed to auto-detect CESU-8).

-- 
Ian Hickson                                      )\._.,--....,'``.    fL
U+1047E                                         /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
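
For illustration only, the fallback described in the reply above might be sketched roughly as follows. This is not the draft's actual algorithm: the function name `pick_encoding`, the `CHARSET_RULE` pattern, and the ordering (HTTP header first, then an explicit @charset rule after an optional UTF-8 BOM, then a UTF-8 default) are assumptions made for the sketch. It only covers sheets that start with a UTF-8 BOM or plain ASCII-compatible bytes.

```python
import codecs
import re
from typing import Optional

UTF8_BOM = codecs.BOM_UTF8  # the byte sequence EF BB BF, shared by UTF-8 and CESU-8

# Hypothetical pattern: an @charset rule at the very start of the sheet.
CHARSET_RULE = re.compile(rb'@charset "([^"]*)";')


def pick_encoding(sheet: bytes, http_charset: Optional[str] = None) -> str:
    """Return an encoding label for a stylesheet (illustrative sketch only)."""
    # Assumption: an explicit HTTP charset parameter is consulted first.
    if http_charset:
        return http_charset.upper()

    # Skip a UTF-8 BOM if present; on its own it cannot tell UTF-8 from CESU-8.
    body = sheet[len(UTF8_BOM):] if sheet.startswith(UTF8_BOM) else sheet

    # An explicit @charset rule settles the question, including "CESU-8".
    match = CHARSET_RULE.match(body)
    if match:
        return match.group(1).decode("ascii", "replace").upper()

    # Nothing explicit: never guess CESU-8, fall back to UTF-8.
    return "UTF-8"
```

With this sketch, a sheet beginning `[UTF-8 BOM]@charset "CESU-8";` is labelled CESU-8, while a sheet with only the BOM is labelled UTF-8, which matches the two cases discussed in the reply.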
Received on Tuesday, 17 February 2004 21:51:26 UTC