Re: possible bug in the validator?

On Wed, 4 Sep 2002, Lloyd Wood wrote:

> On Wed, 4 Sep 2002, Liam Quinn wrote:
>
> > On Wed, 4 Sep 2002, Lloyd Wood wrote:
> >
> > > On Wed, 4 Sep 2002, Olivier Thereaux wrote:
> > >
> > > > On Wed, Sep 04, 2002, Lloyd Wood wrote:
> > > > > > Your server is sending the header
> > > > > > Content-Type: text/html; charset=us-ascii
> > > > > > which overrides the charset specified within the HTML document.
> > > > >
> > > > > Surely the charset in the document should take precedence?
> > > >
> > > > No, see http://www.w3.org/TR/REC-html40/charset.html#h-5.2.2
> > >
> > > Thanks.
> > >
> > > I can't decide if that's a very subtle way to get Content-Type used
> > > properly, or just very very broken.
> >
> > Well, consider a server (such as the original poster's) that transcodes
> > HTML pages on-the-fly according to the capabilities of the client.  It's
> > trivial for the server to set the charset in the HTTP header, but changing
> > a <meta> tag within the HTML document is much more difficult, especially
> > when almost all HTML documents are invalid.
>
> In your example, how does the server know what charset to transcode
> the page _from_?

The server could use something like Apache's AddCharset directive, or it
could use language-specific heuristics to detect the original character
encoding.  I don't know what the original poster's server does, but if you
understand Czech (I don't), the answer may be here:

http://www.csacek.cz/
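
For anyone following the thread, here is a rough sketch of the precedence rule from section 5.2.2 that Olivier cited: a charset in the HTTP Content-Type header wins, and the <meta> declaration is consulted only when the header carries none. The function names, the regular expression, and the 1024-byte scan window are my own assumptions for illustration. This is not the validator's actual code, and a real implementation would parse the HTML rather than pattern-match it.

```python
import re
from email.message import Message

# Hypothetical pattern: finds charset=... inside a <meta ... content="..."> tag.
META_CHARSET_RE = re.compile(
    rb'<meta[^>]+content\s*=\s*["\']?[^"\'>]*charset\s*=\s*([\w.:-]+)',
    re.IGNORECASE,
)


def charset_from_header(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def effective_charset(content_type, body, default=None):
    """Pick the charset a user agent should apply to `body` (bytes).

    Order of precedence, highest first:
      1. charset parameter of the HTTP Content-Type header
      2. <meta http-equiv="Content-Type"> declaration in the document
      3. the caller-supplied default
    """
    if content_type:
        charset = charset_from_header(content_type)
        if charset:
            return charset
    # Assumption: only the first 1024 bytes are scanned for a <meta> tag.
    match = META_CHARSET_RE.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower()
    return default


page = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=iso-8859-2"></head><body></body></html>'
)

# The header's charset overrides the one declared in the document.
print(effective_charset("text/html; charset=us-ascii", page))
# Without a charset in the header, the <meta> declaration is used.
print(effective_charset("text/html", page))
```

With the header from the original report, the first call yields us-ascii even though the page itself declares iso-8859-2, which is the behaviour the validator showed.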

-- 
Liam Quinn

Received on Wednesday, 4 September 2002 12:33:56 UTC