HTML 4.01: Char encoding defaults for external scripts?

From: Claudio Pellegrino <cloaked01@claudio-pellegrino.de>
Date: Wed, 1 Nov 2006 18:27:10 +0100
Message-Id: <F12CBC6D-58EF-4750-B954-D049F0D11482@claudio-pellegrino.de>
To: www-html@w3.org

Hello,


I have a small problem interpreting part of the HTML 4.01 specification.
It concerns which character encoding a user agent is expected to
assume for an external script.


Assume the following scenario:

1. Let doc.html be an HTML 4.01 document served as "text/html".
2. The content of doc.html is encoded with UTF-8, and
    the meta element declaration correctly says so.
3. The document also contains a "script" element having
    type="text/javascript" and src="ext.js", but no
    "charset" attribute.
4. The HTTP server does not send a "charset" parameter
    in the Content-Type header, neither for "doc.html"
    nor for "ext.js".
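
To make point 4 concrete, here is a minimal Python sketch of how a
client could look for the "charset" parameter of a Content-Type
header, using the standard library's email.message parser. The
header values are assumptions for illustration only; in particular,
"text/javascript" as the media type served for ext.js is my guess,
not something the scenario fixes.

```python
from email.message import Message


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header carries no such parameter.
    return msg.get_content_charset()


# Assumed header values for the scenario: no charset parameter on either.
doc_charset = charset_from_content_type("text/html")
js_charset = charset_from_content_type("text/javascript")

# For contrast, a response that does declare an encoding.
declared = charset_from_content_type("text/html; charset=UTF-8")
```

In the scenario both lookups come back empty, so the HTTP layer
gives the user agent nothing to go on for either resource.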


Given the above scenario, here are my two questions:

According to the HTML 4.01 specs, is the user agent
a) *required* to assume a specific character encoding
    for the content of "ext.js"? If so, which one and
    why?
b) *allowed* to ignore the list of three encoding
    priorities at [2] and use heuristics instead
    for determining the content encoding of "ext.js"?
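
To show why the list at [2] does not settle the matter, here is a
rough Python sketch of its three priorities (HTTP "charset"
parameter, META declaration, "charset" attribute on the element that
designates the external resource). The function name and return
shape are hypothetical, and the sketch simply assumes that the list
applies to "ext.js" at all, which is exactly what I am unsure about.

```python
def encoding_by_priority(http_charset, meta_charset, attr_charset):
    """Return (encoding, source) from the first source that declares one."""
    candidates = [
        ("HTTP Content-Type charset parameter", http_charset),
        ("META http-equiv declaration", meta_charset),
        ("charset attribute on the referring element", attr_charset),
    ]
    for source, value in candidates:
        if value:
            return value, source
    # None of the three sources declared an encoding.
    return None, "undetermined"


# doc.html: no HTTP charset, but the META declaration says UTF-8.
doc_result = encoding_by_priority(None, "utf-8", None)

# ext.js: no HTTP charset, no META (it is a script, not HTML),
# and no charset attribute on the "script" element.
js_result = encoding_by_priority(None, None, None)
```

For "doc.html" the second priority answers the question; for
"ext.js" all three come up empty, and the specification does not
seem to say what happens next.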


Some reasons why I'm clueless about it:
- The "script" element specification [1] doesn't
   specify a default behaviour for the user agent.
- The page about encodings [2] gives a prioritized list
   of three sources of encoding information which may be
   applicable for "doc.html" - but not necessarily for
   "ext.js", since the page only mentions the
   "document's character encoding".
- The "charset" attribute description [3] doesn't
   specify any implied default value or behaviour.
- I see similar problems in XHTML 1.0 [4].
- I feel I have checked with the appropriate sources,
   including all the links within [5] and, of course,
   the list archive, but found little more than
   authoring advice ...

Is there a part of the specs I have missed?
Any pointer would be appreciated. Thanks a lot!


Regards,
Claudio Pellegrino


[1] http://www.w3.org/TR/html4/interact/scripts.html#h-18.2.1
[2] http://www.w3.org/TR/html4/charset.html#h-5.2.2
[3] http://www.w3.org/TR/html4/struct/links.html#adef-charset
[4] http://www.w3.org/TR/xhtml1/#C_9
[5] http://www.w3.org/International/articles/#chars
Received on Thursday, 2 November 2006 02:36:59 GMT