Re: faq suggestions from Jungshik Shin on 2004-08-23 (www-international@w3.org from July to September 2004)

From: Jungshik Shin <jshin@i18nl10n.com>
Date: Mon, 23 Aug 2004 14:55:46 +0900
To: www-international@w3.org
Message-ID: <412986E2.2020104@i18nl10n.com>

Bjoern Hoehrmann wrote:
> * Tex Texin wrote:
> 
>>The reason I stated the standard implies the switch occurs after the charset is
>>parsed is text like:
>>
>>http://www.w3.org/TR/html401/charset.html#h-5.2.2
>>
>>"The META declaration must only be used when the character encoding is
>>organized such that ASCII-valued bytes stand for ASCII characters (at least
>>until the META element is parsed). META declarations should appear as early as
>>possible in the HEAD element."
>>
>>If the document was going to be reparsed there would be less need for
>>ASCII-values to precede it.

> The need exists because the user agent must assume some base character
> encoding in order to find the <meta>. 

   I agree with Bjoern.

> The text
> essentially means that documents that are encoded in UTF-16, EBCDIC,
> etc. and have a <meta ... Content-Type ...> and lack higher-level
> protocol encoding information are incorrect. Or that they are incorrect
> regardless of higher-level protocol information. Who knows, it's the
> HTML 4 Recommendation, it could mean anything...

The phrase 'at least until ...' can also be read as follows (which is 
similar to what Tex wrote in his first message in this thread): a 
document can be in two encodings, the first of which is ASCII and is 
used until the 'meta charset' declaration appears. After declaring the 
charset to be UTF-16LE (or EBCDIC, or whatever), the rest of the 
document can be in UTF-16LE. This is rather 'crazy', but it seems to be 
a possible reading. Well, I'm being lazy here... I should go and find 
out in what context the above paragraph was written.
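
To make Bjoern's point concrete, here is a minimal Python sketch of the 
usual reading: the user agent scans the first bytes under an 
ASCII-compatible assumption to find the declaration, then decodes the 
whole document with the declared charset. The names sniff_meta_charset 
and decode_html, the 1024-byte limit, and the ISO-8859-1 fallback are 
assumptions made for illustration, not what any particular browser does.

```python
import re

# Illustrative pattern only: real user agents tokenize the markup properly.
META_CHARSET = re.compile(
    rb'<meta[^>]*charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)

def sniff_meta_charset(data: bytes, limit: int = 1024):
    """Return the charset label found in a <meta> in the first `limit` bytes.

    The search runs on raw bytes, so it only succeeds when ASCII-valued
    bytes stand for ASCII characters up to and including the <meta>.
    """
    match = META_CHARSET.search(data[:limit])
    return match.group(1).decode("ascii").lower() if match else None

def decode_html(data: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode the whole document with the declared charset, else the fallback."""
    label = sniff_meta_charset(data) or fallback
    try:
        return data.decode(label, errors="replace")
    except LookupError:
        # The label is not a codec Python knows; use the assumed fallback.
        return data.decode(fallback, errors="replace")
```

With an EUC-KR page the declaration is found because the bytes before it 
are plain ASCII. The same markup encoded as UTF-16LE has a zero byte 
after every character, so the scan finds nothing, which is why such a 
document cannot rely on the META declaration alone.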

FYI, Mozilla does what Appendix F of XML 1.x recommends (although the 
appendix is non-normative), on which the CSS 2.1 recommendation is more 
or less based (with some changes).

http://www.w3.org/TR/2004/REC-xml-20040204/#sec-guessing
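
For reference, the first step of that appendix (looking at the first 
four bytes when no external encoding information is available) can be 
sketched in Python roughly as below. The function name 
guess_xml_encoding_family and the returned labels are mine, chosen for 
illustration; the sketch covers only the common byte patterns from the 
appendix and is not Mozilla's code.

```python
# (prefix, label) pairs, checked in order. The 4-byte UCS-4 marks must come
# before the 2-byte UTF-16 marks because FF FE 00 00 also starts with FF FE.
_PATTERNS = [
    (b"\x00\x00\xfe\xff", "UCS-4BE"),      # byte order mark
    (b"\xff\xfe\x00\x00", "UCS-4LE"),      # byte order mark
    (b"\xfe\xff", "UTF-16BE"),             # byte order mark
    (b"\xff\xfe", "UTF-16LE"),             # byte order mark
    (b"\xef\xbb\xbf", "UTF-8"),            # byte order mark
    (b"\x00\x00\x00\x3c", "UCS-4BE"),      # "<" without a mark
    (b"\x3c\x00\x00\x00", "UCS-4LE"),      # "<" without a mark
    (b"\x00\x3c\x00\x3f", "UTF-16BE"),     # "<?" without a mark
    (b"\x3c\x00\x3f\x00", "UTF-16LE"),     # "<?" without a mark
    (b"\x3c\x3f\x78\x6d", "ASCII-compatible"),  # "<?xm": read the declaration
    (b"\x4c\x6f\xa7\x94", "EBCDIC"),       # "<?xm" in some EBCDIC flavour
]

def guess_xml_encoding_family(data: bytes) -> str:
    """Guess the encoding family from the leading bytes of an XML entity.

    "ASCII-compatible" and "EBCDIC" mean the encoding declaration still has
    to be read to learn the exact encoding.
    """
    for prefix, label in _PATTERNS:
        if data.startswith(prefix):
            return label
    # No mark and no recognisable "<?xml": UTF-8 is the default.
    return "UTF-8"
```

The guess only needs to be good enough to read the encoding declaration, 
after which the declared encoding takes over.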

Jungshik

Received on Monday, 23 August 2004 05:56:24 UTC