
Re: charset parameter (fwd)

From: Nick Kew <nick@webthing.com>
Date: Thu, 26 Jul 2001 13:11:07 +0100 (BST)
To: www-validator@w3.org
Message-ID: <Pine.BSF.4.21.0107261245130.2100-100000@fenris.webthing.com>
> >Surely that at least is clear: [HTTP] takes precedence over [META]?
> 
> Nope. HTTP 1.1 doesn't mention META,

Of course not.

>	 and HTML just sez it's supposed to be
> read by _servers_ to initialize the HTTP header... :-(

"supposed to be"?  Oh dear.

> >But *ML rules don't apply to HTTP, so whence the conclusion that
> >*anything* is implicit (as opposed to absent) in the headers?
> 
> The lack of a "charset" parameter on the HTTP 1.1 "Content-Type" header
> field means that you should assume it is there with a value of "ISO-8859-1"
> according to the HTTP 1.1 RFC.

Oh dear.  Doesn't that logic give us:

Content-Type: image/png; charset=iso-8859-1

or even
Content-Type: application/x-hyperlens-object; charset=iso-8859-1

> That is, if the META sez EUC-JP and HTTP implicitly defines ISO-8859-1 (by
> being absent), does that really mean that we should use ISO-8859-1 (which
> the user obviously does _not_ want) over EUC-JP (which s/he _does_ want)?

Only if we accept "HTTP implicitly defines ..."  Now when HTTP explicitly
defines something, we accept it.
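
To make the explicit-versus-absent distinction concrete, here is a minimal Python sketch. It is not code from any of the tools discussed here, and the function names are made up for illustration. It treats a missing charset parameter as simply missing, and lets HTTP win only when it actually says something:

```python
from email.message import Message


def explicit_charset(content_type_header):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_param() returns None when the parameter is not present at all.
    return msg.get_param("charset")


def choose_charset(content_type_header, meta_charset):
    """Hypothetical policy: explicit HTTP charset, then META, then nothing.

    No default is invented here; the caller decides what "nothing" means.
    """
    return explicit_charset(content_type_header) or meta_charset
```

With this reading, "image/png" yields no charset at all, and a bare "text/html" leaves room for a META that says EUC-JP.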

We can use the underlying fallacy of the <META> to get an instant
contradiction:

<meta http-equiv="content-type" content="text/plain">
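
A rough Python sketch of what anything reading that declaration has to do, assuming the attribute is spelled "content" as in HTML 4. The class name is hypothetical. Note that the bytes must already be parsed as HTML in order to find a declaration saying they are text/plain:

```python
from html.parser import HTMLParser


class MetaContentType(HTMLParser):
    """Record the first <meta http-equiv="content-type"> value seen."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta" or self.value is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("http-equiv") or "").lower() == "content-type":
            self.value = attributes.get("content")


parser = MetaContentType()
parser.feed('<meta http-equiv="content-type" content="text/plain">')
declared_type = parser.value
```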

> >that might be considered as a default (certainly iso-8859-1 and utf-8)?
> >so that a document that validates to it should always be fine?
> 
> This is again very much Western thinking. US-ASCII is a subset only of
> common Western encodings. This means the answer to your question depends on
> whether you accept the validity of these "defaulted" charset parameters.

Well yes - US-ASCII isn't fully up to English, let alone other languages.
But we are only talking about a default.

> What does Site Valet do?

Site Valet is lots of tools, more than one of which processes markup ;-)

Code Valet does exactly what you tell it in the form.

Page Valet has very limited i18n support, and will use a document's
declared charset or default to iso-8859-1 if none is declared.

cg-eye's validation is similar to Page Valet, but more out-of-date
(reminder to self ...)

Tidy Online follows Dave Raggett's original in defaulting to ASCII,
but allows you to change it with the tidyconf utility.
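
For what it's worth, the "declared charset or iso-8859-1" behaviour described above for Page Valet amounts to something like the following Python sketch. This is an illustration only, not Page Valet's code; the function name is invented, and falling back to the default for an unrecognised charset name is my assumption:

```python
import codecs


def decode_with_default(body, declared=None, default="iso-8859-1"):
    """Decode bytes using the declared charset, else the default."""
    charset = declared or default
    try:
        codecs.lookup(charset)
    except LookupError:
        # Assumption: an unknown charset name is treated as undeclared.
        charset = default
    return body.decode(charset, errors="replace")
```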

-- 
Nick Kew
Received on Thursday, 26 July 2001 08:11:22 GMT
