(Solution) HTML 4.01: Char encoding defaults for external scripts?

From: Claudio Pellegrino <cloaked01@claudio-pellegrino.de>
Date: Mon, 6 Nov 2006 09:50:45 +0100
Message-Id: <853F2C72-BEAB-4A5A-9B3D-E9E27C1109C4@claudio-pellegrino.de>
To: www-html@w3.org

Hello,


[original scenario]
> 1. Let doc.html be an HTML 4.01 document served as "text/html".
> 2. The content of doc.html is encoded with UTF-8, and
>    the meta element declaration correctly says so.
> 3. The document also contains a "script" element having
>    type="text/javascript" and src="ext.js", but no
>    "charset" attribute.
> 4. The HTTP server does not serve a "charset" header,
>    neither for "doc.html" nor for "ext.js".


Thanks to James Justin's answer, I see a possible explanation now.

I'll try to sum it up myself:

- Authoring a script reference to a non-US-ASCII external resource REQUIRES
   the author to specify a charset (as RFC 2046 demands). This can be
   done in either of two ways: as a parameter embedded in the "type"
   attribute, or as a separate "charset" attribute on the script element.

- In particular, declaring a character encoding *for the
   entire document* is not sufficient to fulfill the above requirement.
   User agents are not REQUIRED to propagate, for example, a charset
   declaration given in the META element all the way down to the content
   of the external script resource.

- However, the user agent MAY take the document encoding (e.g. from the
   META element) as a fallback/hint. It seems to me that most
   (but not all) popular browsers do exactly that.
   (This is what actually raised my question in the first place.)

- Lesson learned: In the given scenario, the user agent behaviour
   is not specified by HTML 4.01.
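
To illustrate the two ways of declaring the script encoding mentioned
above, here is a minimal sketch. It assumes the external script from the
scenario ("ext.js") is encoded as UTF-8; that encoding is an assumption
for the example, and the sketch does not establish whether any particular
user agent honours the second variant.

```html
<!-- Sketch only: "ext.js" is the external script from the scenario;
     "utf-8" is an assumed encoding for illustration. -->

<!-- Variant 1: separate "charset" attribute on the script element -->
<script type="text/javascript" src="ext.js" charset="utf-8"></script>

<!-- Variant 2: charset given as a parameter of the media type in "type" -->
<script type="text/javascript; charset=utf-8" src="ext.js"></script>
```

Either form states the encoding on the script element itself, so the
result no longer depends on the user agent falling back to the document
encoding.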

I hope this explanation is roughly correct.


Regards
Claudio Pellegrino
Received on Monday, 6 November 2006 08:50:56 GMT