- From: Jungshik Shin <jshin@i18nl10n.com>
- Date: Mon, 23 Aug 2004 14:55:46 +0900
- To: www-international@w3.org
Bjoern Hoehrmann wrote:

> * Tex Texin wrote:
>
>>The reason I stated the standard implies the switch occurs after the charset is
>>parsed is text like:
>>
>>http://www.w3.org/TR/html401/charset.html#h-5.2.2
>>
>>"The META declaration must only be used when the character encoding is
>>organized such that ASCII-valued bytes stand for ASCII characters (at least
>>until the META element is parsed). META declarations should appear as early as
>>possible in the HEAD element."
>>
>>If the document was going to be reparsed there would be less need for
>>ASCII-values to precede it.

> The need exists because the user agent must assume some base character
> encoding in order to find the <meta>.

I agree with Bjoern.

> The text
> essentially means that documents that are encoded in UTF-16, EBCDIC,
> etc. and have a <meta ... Content-Type ...> and lack higher-level
> protocol encoding information are incorrect. Or that they are incorrect
> regardless of higher-level protocol information. Who knows, it's the
> HTML 4 Recommendation, it could mean anything...

The 'at least until ...' phrase can also be interpreted as follows
(which is similar to what Tex wrote in his first message in this thread):

A document can be in two encodings. The first is ASCII and is used
until the 'meta charset' declaration appears. Once the charset has been
declared to be UTF-16LE (or EBCDIC, or whatever), the rest of the
document can be in that encoding. This is rather 'crazy', but it seems
to be a possible scenario. Well, I'm being lazy here... I should go and
find out in what context the above paragraph was written.

FYI, Mozilla does what XML 1.x Appendix F recommends (although it's
non-normative), on which the CSS 2.1 recommendation is more or less
based (with some changes).

http://www.w3.org/TR/2004/REC-xml-20040204/#sec-guessing

Jungshik
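
P.S. For readers who have not looked at Appendix F, here is a minimal Python sketch of the kind of byte-signature guessing it describes. It is only an illustration under simplifying assumptions: the function name guess_encoding_family and the labels it returns are made up for this example, the unusual UCS-4 byte orders are left out, and it is not a description of Mozilla's actual code. The idea is to look at the first few bytes for a byte order mark or for the byte pattern of '<?xm' in a handful of encoding families, and only then read the declaration in the guessed family.

```python
from typing import Optional

# (prefix, label) pairs, checked in order. Four-byte signatures come first so
# that the UCS-4 BOM FF FE 00 00 is not mistaken for the UTF-16LE BOM FF FE.
# The labels are illustrative names for this sketch, not registered charsets.
_SIGNATURES = [
    (b"\x00\x00\xfe\xff", "UCS-4BE"),          # UCS-4 BOM, big-endian
    (b"\xff\xfe\x00\x00", "UCS-4LE"),          # UCS-4 BOM, little-endian
    (b"\x00\x00\x00<", "UCS-4BE"),             # '<' in UCS-4, no BOM
    (b"<\x00\x00\x00", "UCS-4LE"),
    (b"\x00<\x00?", "UTF-16BE"),               # '<?' in UTF-16, no BOM
    (b"<\x00?\x00", "UTF-16LE"),
    (b"<?xm", "ASCII-compatible"),             # must still read the declaration
    (b"\x4c\x6f\xa7\x94", "EBCDIC"),           # '<?xm' in an EBCDIC code page
    (b"\xef\xbb\xbf", "UTF-8"),                # UTF-8 BOM
    (b"\xfe\xff", "UTF-16BE"),                 # UTF-16 BOM, big-endian
    (b"\xff\xfe", "UTF-16LE"),                 # UTF-16 BOM, little-endian
]


def guess_encoding_family(data: bytes) -> Optional[str]:
    """Return a guessed encoding family from the leading bytes, or None."""
    for prefix, label in _SIGNATURES:
        if data.startswith(prefix):
            return label
    # No signature matched; a caller would fall back to some default here.
    return None
```

Note that this guesses only a family, which is just enough to read the in-document declaration. It says nothing about a document that switches encoding partway through, as in the 'crazy' scenario above.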
Received on Monday, 23 August 2004 05:56:24 UTC