Re: charset parameter (fwd)

From: Nick Kew <nick@webthing.com>
Date: Thu, 26 Jul 2001 13:11:07 +0100 (BST)
To: www-validator@w3.org
Message-ID: <Pine.BSF.4.21.0107261245130.2100-100000@fenris.webthing.com>
> >Surely that at least is clear: [HTTP] takes precedence over [META]?
> Nope. HTTP 1.1 doesn't mention META,

Of course not.

>	 and HTML just sez it's supposed to be
> read by _servers_ to initialize the HTTP header... :-(

"supposed to be"?  Oh dear.

> >But *ML rules don't apply to HTTP, so whence the conclusion that
> >*anything* is implicit (as opposed to absent) in the headers?
> The lack of a "charset" parameter on the HTTP 1.1 "Content-Type" header
> field means that you should assume it is there with a value of "ISO-8859-1"
> according to the HTTP 1.1 RFC.

Oh dear.  Doesn't that logic give us:

Content-Type: image/png; charset=iso-8859-1

or even
Content-Type: application/x-hyperlens-object; charset=iso-8859-1
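
For what it's worth, RFC 2616 (section 3.7.1) words its ISO-8859-1 default in terms of subtypes of "text" received without an explicit charset. A minimal sketch of that reading, in which the function name `effective_charset` and the constant are purely illustrative and not taken from any validator:

```python
from email.message import Message

# Assumption: the implicit default is applied to text/* only,
# which is how RFC 2616 section 3.7.1 scopes it.
HTTP_TEXT_DEFAULT = "iso-8859-1"

def effective_charset(content_type):
    """Return the explicit charset, the text/* default, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    explicit = msg.get_param("charset")
    if isinstance(explicit, str) and explicit:
        return explicit.lower()          # explicitly declared: use it
    if msg.get_content_maintype() == "text":
        return HTTP_TEXT_DEFAULT         # absent on text/*: implicit default
    return None                          # absent on anything else: no charset

for header in ("image/png",
               "application/x-hyperlens-object",
               "text/html",
               "text/html; charset=EUC-JP"):
    print(header, "->", effective_charset(header))
```

Under that assumption the image/png and application/* cases come out with no charset at all, while a bare text/html picks up the default.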

> That is, if the META sez EUC-JP and HTTP implicitly defines ISO-8859-1 (by
> being absent), does that really mean that we should use ISO-8859-1 (which
> the user obviously does _not_ want) over EUC-JP (which s/he _does_ want)?

Only if we accept "HTTP implicitly defines ..."  Now when HTTP explicitly
defines something, we accept it.
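
One way to write down the rule "explicit HTTP wins, otherwise look at the document" is sketched below. This is only an illustration: the helper names, the 4096-byte scan window and the regular expressions are my assumptions, not the behaviour of any particular validator.

```python
import re
from email.message import Message

# Hypothetical patterns for finding a META content-type declaration.
META_RE = re.compile(
    rb"<meta[^>]+http-equiv\s*=\s*[\"']?content-type[\"']?[^>]*>",
    re.IGNORECASE)
CHARSET_RE = re.compile(rb"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)",
                        re.IGNORECASE)

def header_charset(content_type):
    """Charset explicitly present in the HTTP Content-Type, else None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    value = msg.get_param("charset")
    return value.lower() if isinstance(value, str) and value else None

def meta_charset(body):
    """Charset declared in a META http-equiv element, else None."""
    tag = META_RE.search(body[:4096])    # assumed scan window
    if tag is None:
        return None
    found = CHARSET_RE.search(tag.group(0))
    return found.group(1).decode("ascii").lower() if found else None

def resolve_charset(content_type, body, fallback="iso-8859-1"):
    """Explicit HTTP charset first, then META, then the fallback."""
    return header_charset(content_type) or meta_charset(body) or fallback

page = b'<meta http-equiv="Content-Type" content="text/html; charset=EUC-JP">'
print(resolve_charset("text/html", page))
print(resolve_charset("text/html; charset=utf-8", page))
```

With this ordering an absent HTTP charset never overrides what the document itself declares; the fallback only applies when neither source says anything.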

We can use the underlying fallacy of the <META> to get an instant

<meta http-equiv="content-type" content="text/plain">

> >that might be considered as a default (certainly iso-8859-1 and utf-8)?
> >so that a document that validates to it should always be fine?
> This is again very much Western thinking. US-ASCII is a subset only of
> common Western encodings. This means the answer to your question depends on
> whether you accept the validity of these "defaulted" charset parameters.

Well yes - US-ASCII isn't fully up to English, let alone other languages.
But we are only talking a default.

> What does Site Valet do?

Site Valet is lots of tools, more than one of which processes markup ;-)

Code Valet does exactly what you tell it in the form.

Page Valet has very limited i18n support, and will use a document's
declared charset, or default to iso-8859-1 if none is declared.

cg-eye's validation is similar to Page Valet's, but more out-of-date
(reminder to self ...)

Tidy Online follows Dave Raggett's original in defaulting to ASCII,
but allows you to change it with the tidyconf utility.

Nick Kew
Received on Thursday, 26 July 2001 08:11:22 UTC
