- From: Tex Texin <tex@i18nguy.com>
- Date: Sun, 22 Aug 2004 20:34:07 -0700
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- CC: www-international@w3.org
Thanks for the info, Bjoern. It is interesting that IE only reparses from the current chunk.

The reason I stated that the standard implies the switch occurs after the charset is parsed is text like:

http://www.w3.org/TR/html401/charset.html#h-5.2.2
"The META declaration must only be used when the character encoding is organized such that ASCII-valued bytes stand for ASCII characters (at least until the META element is parsed). META declarations should appear as early as possible in the HEAD element."

If the document were going to be reparsed, there would be less need for ASCII-valued bytes to precede the META element.

Also, if the document is reparsed from the beginning, what happens if the page is encoded in an EBCDIC encoding? If the page is EBCDIC from the first byte, then the meta charset statement won't be parsable...

However, CSS 2.1 is a bit better, and in line with your and Jungshik's ideas.

http://www.w3.org/TR/CSS21/syndata.html#q23
"Note that reliance on the @charset construct theoretically poses a problem since there is no a priori information on how it is encoded. In practice, however, the encodings in wide use on the Internet are either based on ASCII, UTF-16, UCS-4, or (rarely) on EBCDIC. This means that in general, the initial byte values of a style sheet enable a user agent to detect the encoding family reliably, which provides enough information to decode the @charset rule, which in turn determines the exact character encoding."

The @charset statement must be the first statement in the CSS file, and the spec clearly expects the UA to determine enough about the file's encoding from its initial bytes to read the @charset value, which then identifies the encoding exactly.

tex

Bjoern Hoehrmann wrote:
>
> * Tex Texin wrote:
> >With respect to user agents reparsing documents from the beginning, can you say
> >which ones do this?
>
> Internet Explorer for Windows re-parses the chunk in which the <meta>
> element was found (a chunk is usually a block of 8 KB), Mozilla
> re-parses all the chunks, that's at least what I remember from tests.
> You can test such things using a <title> element prior to the <meta>
> element, for example.
>
> >They are not obligated to and the wording of the standards implies that the
> >encoding "switch" from the initial value to the value specified in the charset
> >statement, occurs at the point the statement is parsed.
>
> That's not clear to me at all...

-- 
-------------------------------------------------------------
Tex Texin   cell: +1 781 789 1898   mailto:Tex@XenCraft.com
Xen Master                          http://www.i18nGuy.com

XenCraft                            http://www.XenCraft.com
Making e-Business Work Around the World
-------------------------------------------------------------
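
The detection step that the CSS 2.1 note describes (guess the encoding family from the first bytes, then read the @charset rule with it) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name sniff_css_charset, the prefix table, the 1024-byte window, and the choice of code page 037 to stand in for the EBCDIC family are all mine, not taken from the specification or from any user agent.

```python
import re

# Illustrative prefix table (an assumption, not the table from CSS 2.1).
# Order matters: longer and more specific prefixes come first.
_FAMILIES = [
    (b"\x00\x00\xfe\xff", "utf-32-be"),  # UCS-4 byte order mark, big-endian
    (b"\xff\xfe\x00\x00", "utf-32-le"),  # UCS-4 byte order mark, little-endian
    (b"\xfe\xff", "utf-16-be"),          # UTF-16 byte order mark, big-endian
    (b"\xff\xfe", "utf-16-le"),          # UTF-16 byte order mark, little-endian
    (b"\xef\xbb\xbf", "utf-8"),          # UTF-8 signature
    (b"\x00\x00\x00@", "utf-32-be"),     # '@' as UCS-4, no byte order mark
    (b"@\x00\x00\x00", "utf-32-le"),
    (b"\x00@", "utf-16-be"),             # '@' as UTF-16, no byte order mark
    (b"@\x00", "utf-16-le"),
    (b"\x7c\x83\x88", "cp037"),          # '@ch' in EBCDIC code page 037
    (b"@", "ascii"),                     # ASCII-based family
]

_CHARSET_RULE = re.compile(r'@charset "([^"]*)";')


def sniff_css_charset(data: bytes):
    """Return (family_codec, declared_charset) for a style sheet.

    family_codec is the codec guessed from the initial bytes;
    declared_charset is the name inside the @charset rule, or None
    if the rule is not the first thing in the decoded text.
    """
    for prefix, codec in _FAMILIES:
        if data.startswith(prefix):
            # Decode only the head of the file; drop a leading BOM character.
            head = data[:1024].decode(codec, errors="replace").lstrip("\ufeff")
            match = _CHARSET_RULE.match(head)
            return codec, (match.group(1) if match else None)
    # No recognizable start: the sketch gives up rather than guessing.
    return None, None


if __name__ == "__main__":
    print(sniff_css_charset(b'@charset "ISO-8859-1"; body { color: red }'))
    print(sniff_css_charset('@charset "cp037";'.encode("cp037")))
```

Note that this only works because @charset has to come first in the file, so its own bytes identify the family. An HTML META element has no such guarantee, which is why the EBCDIC case is a problem there.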
Received on Monday, 23 August 2004 03:35:15 UTC