
Re: Beta: Charset attribute

From: Terje Bless <link@pobox.com>
Date: Fri, 8 Nov 2002 23:50:13 +0100
To: W3C Validator <www-validator@w3.org>
Message-ID: <a01060005-1021-7137B8DAF36C11D6A69400039300CF5C@[193.157.66.10]>

Lloyd Wood <l.wood@eim.surrey.ac.uk> wrote:

>I was thinking of this as a generic get-out for the Content-Type: META
>tag dilemma. A server should be able to explicitly tell a client to obey
>the stuff in the page.

In essence, this is what the various application/*+xml media types do; they
say "this is some kind of XML, dispatch on DOCTYPE and/or Namespace and use
XML heuristics to determine encoding". IOW, they've moved the type and
encoding determination in-band instead of relying on the established
out-of-band methods provided by HTTP (and several decades' worth of
experience from other systems (email and MIME chief among them)).

[ While my opinion on /that/ issue should be perfectly obvious, let's   ]
[ not get sidetracked into a long and fruitless debate on its relative  ]
[ merits. Having "Been There, Done That", repeatedly, and all.. :-)     ]

As for text/html, the essential problem with delegating the responsibility
for conveying this information to various in-band hacks, is the old
"Catch-22" of trying to determine the type and encoding of an object by
looking /inside/ the object before you know either the type or the
encoding.
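
A minimal sketch of that Catch-22, in Python. The helper name
`sniff_meta_charset`, the regex and the 1024-byte window are all made up
for illustration; this is not what the Validator or any browser does. It
scans the raw bytes for a charset declaration, which only works if the
bytes are already ASCII-compatible, i.e. if you already know (part of)
the answer:

```python
import re

# Hypothetical pattern: matches charset=... in raw, still-undecoded bytes.
META_CHARSET = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', re.IGNORECASE)

def sniff_meta_charset(raw: bytes, window: int = 1024):
    """Return the charset named in the first `window` bytes, or None.

    The scan silently assumes an ASCII-compatible encoding, which is
    exactly the thing it is trying to find out.
    """
    match = META_CHARSET.search(raw[:window])
    return match.group(1).decode("ascii").lower() if match else None

html = '<meta http-equiv="Content-Type" content="text/html; charset=utf-16">'

# Bytes happen to be ASCII-compatible, so the declaration is readable.
print(sniff_meta_charset(html.encode("ascii")))

# Same markup actually encoded as UTF-16: the declaration cannot be
# read until the bytes are decoded, so the sniff finds nothing.
print(sniff_meta_charset(html.encode("utf-16")))
```

The first call finds the label, the second does not, even though the
second document is the one whose declaration is true.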

And frankly, if you can force the use of text/whatever-it-says-dude, you
could equally well set the _correct_ information in the first place.


Or am I completely missing your point...? :-)




Hmmm. It suddenly occurs to me that the concept of "Tentative" Validity in
an uploaded file may be completely bogus. For something on the web its
intent is to make sure people don't set doctype/charset override and then
get told the doc is valid even though we've checked a completely different
set of conditions (new doctype, say).

But for an uploaded file, at least the character set will _never_ bear any
kind of relation to what it /would/ be when served from a web server (this
much has been mentioned before on the list, BTW). Perhaps the most
constructive way to deal with it is to _always_ require the charset
"override" to be set for file uploads?
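
Roughly what I have in mind, as a Python sketch. Everything here
(`resolve_charset`, the "upload"/"uri" labels, the returned tuple) is a
hypothetical illustration of the policy, not existing Validator code:

```python
def resolve_charset(source, http_charset=None, override=None):
    """Return (charset, tentative) for a document to be validated.

    source       -- "uri" for something fetched over HTTP, "upload" for a file
    http_charset -- charset from the HTTP Content-Type header, if any
    override     -- charset the user picked explicitly, if any
    """
    if source == "upload":
        # An uploaded file has no HTTP headers, so there is nothing to
        # override: the user-supplied charset is the only information.
        if not override:
            raise ValueError("file uploads need an explicit charset")
        return override, False
    if override:
        # We checked something other than what the server declared, so
        # the result is only tentatively valid.
        return override, True
    # No override: use whatever the server said (possibly nothing).
    return http_charset, False
```

With this, `resolve_charset("uri", "iso-8859-1", "utf-8")` comes back
flagged as tentative, while an upload without a charset is refused
outright rather than guessed at.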

The Content-Type may warrant a similar treatment, but probably not the
doctype. I think... Maybe...


Will have to think a bit about this at some point...

-- 
"Allright... Calm down! Relax! Start breathin´..."         -- Dr. D.R.E.
Received on Friday, 8 November 2002 17:50:17 GMT
