Re: HTML 4.01: Char encoding defaults for external scripts?

Hello, James Justin!


>> 3. The document also contains a "script" element
>> having
>>     type="text/javascript" and src="ext.js", but no
>>     "charset" attribute.
> The HTML 4.01 specification [...] references RFC2046
> (http://tools.ietf.org/html/rfc2046) which specifies
> that text media types (text/***) should be treated as
> US-ASCII by default.

It seems you are correct. I didn’t know about this RFC - thanks for
the pointer.
The next page also seems to address this very problem:

> [Pages 7-8] This also implies that [...] 8bit or multiple
> octet character encodings MUST use an appropriate character set
> specification to be consistent with MIME.

In other words: my scenario "breaks" MIME conformance because it  
links to a UTF-8 resource but fails to specify the desired charset  
*within the script element*.
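
To make the failure concrete, here is a minimal Python sketch of a consumer that follows the RFC 2046 default literally. It is not how any particular browser behaves; the names declared_charset and try_decode are hypothetical, and the sample script body is invented for illustration.

```python
from email.message import Message


def declared_charset(content_type: str, default: str = "us-ascii") -> str:
    """Return the charset parameter of a Content-Type value.

    Falls back to US-ASCII, the RFC 2046 default for text/* types,
    when no charset parameter is present.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns the
    # fallback object when the parameter is missing.
    return msg.get_content_charset(default)


def try_decode(data: bytes, content_type: str):
    """Decode data with the declared charset; return None on failure."""
    try:
        return data.decode(declared_charset(content_type))
    except UnicodeDecodeError:
        return None


# Hypothetical external script saved as UTF-8 with a non-ASCII character.
utf8_bytes = "alert('é');".encode("utf-8")

# No charset declared: the US-ASCII default cannot decode the bytes.
print(try_decode(utf8_bytes, "text/javascript"))
# Charset declared: decoding succeeds.
print(try_decode(utf8_bytes, "text/javascript; charset=utf-8"))
```

A strict consumer rejects the undeclared UTF-8 bytes outright; a lenient one would more likely guess an encoding and display garbled characters instead.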


> However, [...] it's strange that
> the script element has a charset attribute since MIME
> types can also specify a character encoding.
>
> example:
>   text/javascript; charset=iso-8859-1

Hmm yes, this seems redundant in a way.
However, I strongly suspect that the HTML attribute was introduced
to give authors a "syntactically cleaner" way to specify their
character encoding.

In other words, specifying charset="iso-8859-1" is probably  
semantically equivalent to your example, possibly even overriding it  
if both are specified at the same time.
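
The two declarations could be reconciled roughly as in the Python sketch below. Which one wins when both are present is exactly the open question here, so the sketch exposes it as a flag instead of deciding it; effective_charset and prefer_attribute are hypothetical names, not part of any specification or library.

```python
from email.message import Message
from typing import Optional


def effective_charset(
    content_type: str,
    attr_charset: Optional[str] = None,
    prefer_attribute: bool = False,
    default: str = "us-ascii",
) -> str:
    """Pick a charset from the MIME type and the script "charset" attribute.

    prefer_attribute is an assumption knob: set it to True to let the
    HTML attribute override the MIME parameter, False for the reverse.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    header_charset = msg.get_content_charset()  # None when absent
    attribute = attr_charset.strip().lower() if attr_charset else None

    if prefer_attribute:
        candidates = [attribute, header_charset]
    else:
        candidates = [header_charset, attribute]

    # The first declaration that is actually present wins.
    for candidate in candidates:
        if candidate:
            return candidate
    return default


# Only the attribute is given: it is used either way.
print(effective_charset("text/javascript", "iso-8859-1"))
# Both are given and disagree: the flag decides.
print(effective_charset("text/javascript; charset=utf-8", "iso-8859-1"))
print(effective_charset("text/javascript; charset=utf-8", "iso-8859-1",
                        prefer_attribute=True))
```

When only one of the two is present, both settings of the flag agree, which matches the "semantically equivalent" reading above.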


Thanks again for your time and answer.
It’s nice to see my newbie question get a helpful reply, especially
since I have never posted to this list before :)

I’ll try to sum up the solution in a separate reply.


Regards
Claudio Pellegrino

Received on Monday, 6 November 2006 08:47:44 UTC