Re: HTML 4.01: Char encoding defaults for external scripts?

From: Claudio Pellegrino <cloaked01@claudio-pellegrino.de>
Date: Mon, 6 Nov 2006 09:47:18 +0100
Message-Id: <3F7E5F91-3EFD-4432-85EC-0E1BCB24E623@claudio-pellegrino.de>
Cc: James Justin Harrell <herorev@yahoo.com>
To: www-html@w3.org

Hello, James Justin!

>> 3. The document also contains a "script" element having
>>     type="text/javascript" and src="ext.js", but no
>>     "charset" attribute.
> The HTML 4.01 specification [...] references RFC2046
> (http://tools.ietf.org/html/rfc2046) which specifies
> that text media types (text/***) should be treated as
> US-ASCII by default.

It seems you are correct. I didn't know about this RFC, thanks for
the pointer.
Also, on the next page the RFC seems to address this very problem:

> [Pages 7-8] This also implies that [...] 8bit or multiple
> octet character encodings MUST use an appropriate character set
> specification to be consistent with MIME.

In other words: my scenario "breaks" MIME conformance because it  
links to a UTF-8 resource but fails to specify the desired charset  
*within the script element*.
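
To make the default concrete, here is a minimal Python sketch of how I read the RFC 2046 rule: a text/* media type without a "charset" parameter is taken as US-ASCII. The function name declared_charset is my own invention for illustration, and the parsing leans on the standard library's email.message.Message; it is not how any particular browser does it.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset a Content-Type value declares (hypothetical helper).

    Assumption: text/* without a charset parameter defaults to US-ASCII,
    following RFC 2046. Non-text types without a charset yield None.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset()  # lower-cased, or None if absent
    if charset is None and msg.get_content_maintype() == "text":
        return "us-ascii"
    return charset


# A UTF-8 encoded script that contains a non-ASCII character.
script_bytes = "alert('é');".encode("utf-8")

# With no charset parameter the default applies, and decoding fails.
try:
    script_bytes.decode(declared_charset("text/javascript"))
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False
```

With an explicit parameter, e.g. "text/javascript; charset=utf-8", the same bytes decode without trouble.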

> However, [...] it's strange that
> the script element has a charset attribute since MIME
> types can also specify a character encoding.
> example:
>   text/javascript; charset=iso-8859-1

Hmm yes, this seems redundant in a way.
However, I am strongly convinced that the reason to introduce the  
HTML attribute was to give authors a "syntactically cleaner" way to  
specify their character encoding.

In other words, specifying charset="iso-8859-1" is probably
semantically equivalent to your example. If both are specified at the
same time, though, section 5.2.2 of HTML 4.01 appears to give the HTTP
"charset" parameter priority over the attribute.
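
For reference, section 5.2.2 of HTML 4.01 lists the HTTP "charset" parameter ahead of the charset attribute on an element that designates an external resource. A rough Python sketch of that order, as I understand it, follows. The function name effective_charset and the final US-ASCII fallback are my own assumptions, not something the spec spells out for scripts.

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type: Optional[str],
                      charset_attr: Optional[str]) -> str:
    """Pick a charset for an external script (hypothetical helper).

    Assumed order, after HTML 4.01 section 5.2.2:
      1. the charset parameter of the HTTP Content-Type header
      2. the charset attribute on the linking element
      3. US-ASCII, the RFC 2046 default for text/* (my assumption here)
    """
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        from_header = msg.get_content_charset()  # lower-cased, or None
        if from_header:
            return from_header
    if charset_attr:
        return charset_attr.strip().lower()
    return "us-ascii"
```

So effective_charset("text/javascript", "UTF-8") falls back to the attribute, while a header that carries its own charset parameter wins over it.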

Thanks again for your time and answer.
It's nice to see my newbie question getting a helpful reply, and I
have never posted to this list before :)

I'll try to sum up the solution in a separate reply.

Claudio Pellegrino
Received on Monday, 6 November 2006 08:47:44 UTC
