charset & language priorities

In the HTML spec/recommendation, user agents are given a different order
of precedence for language codes than for character encodings. I'd like
to understand why. In particular, I'd like to understand the reasoning
behind the priority for charset, since it is (I think) counter-intuitive,
especially given the priority for language codes.

Here are the two excerpts that I am referring to:

Spec:  Section 8.1.2 Inheritance of language codes
An element inherits language code information according to the following
order of precedence (highest to lowest):
- the lang attribute set for the element itself
- the closest parent element that has the lang attribute set
- the HTTP "Content-Language" header
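
To make the ordering concrete, here is a rough sketch of how I read that
rule. The function name and the way the element's ancestry is passed in
are my own assumptions for illustration, not anything defined by the spec:

```python
from typing import Optional, Sequence


def effective_lang(lang_chain: Sequence[Optional[str]],
                   content_language: Optional[str] = None) -> Optional[str]:
    """Resolve an element's language code (hypothetical helper).

    lang_chain[0] is the element's own lang attribute, followed by the lang
    attributes of its ancestors from nearest to farthest. None means the
    attribute is not set on that element.
    """
    # The element itself, then the closest parent that sets lang, wins.
    for lang in lang_chain:
        if lang:
            return lang
    # Only when nothing in the document sets lang does the HTTP
    # "Content-Language" header apply.
    return content_language


# The document's own markup overrides the server's header.
print(effective_lang([None, "fr", "en"], content_language="de"))
```

Here the nearest ancestor with lang set ("fr") is chosen, and the header
value "de" is used only if no element in the chain sets lang at all.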

Rec:  Section 5.2.2 Specifying Character Encoding
To sum up, conforming user agents must observe the following priorities
when determining a document's character encoding (from
highest priority to lowest):
1. An HTTP "charset" parameter in a "Content-Type" field
2. A META declaration with "http-equiv" set to "Content-Type" and a value
set for "charset".
3. The charset attribute set on an element that designates an external
resource.
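
By contrast, the charset rule as I read it looks roughly like the sketch
below. Again, the function and parameter names are assumptions of mine, and
the three values are assumed to have been extracted already (or to be None
when absent):

```python
from typing import Optional


def pick_charset(http_charset: Optional[str] = None,
                 meta_charset: Optional[str] = None,
                 link_charset: Optional[str] = None) -> Optional[str]:
    """Choose a document's character encoding (hypothetical helper).

    http_charset: "charset" parameter of the HTTP "Content-Type" field
    meta_charset: charset from a META http-equiv="Content-Type" declaration
    link_charset: charset attribute on the element that referenced the
                  document as an external resource
    """
    # Highest priority first: the server's header beats the document's own
    # META declaration, which beats the referring element's hint.
    for candidate in (http_charset, meta_charset, link_charset):
        if candidate:
            return candidate
    return None


# The case I am asking about: server and author disagree.
print(pick_charset(http_charset="ISO-8859-1", meta_charset="UTF-8"))
```

In this example the server's "ISO-8859-1" is chosen even though the author
declared "UTF-8" in the document, which is the opposite of what happens
with lang and "Content-Language".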

It seems to me that the document writer knows more about the actual
content than the server does, so the writer's declaration is more likely
to be accurate and hence should be given higher priority.




Kristi Schultz - Internet:  kristis@us.ibm.com
System Chief Engineering Manager - IBM iSeries Software
(507)253-2177  t/l 553-2177 Fax: (507)253-0335
------------------------------------
"Now faith is the assurance of things hoped for, the conviction of things
not seen"
  Hebrews 11:1

Received on Thursday, 25 July 2002 13:57:07 UTC