Re: Language label from Drazen Kacar on 1997-02-28 (www-international@w3.org from January to March 1997)

From: Drazen Kacar <Drazen.Kacar@public.srce.hr>
Date: Fri, 28 Feb 1997 10:55:08 +0100 (MET)
To: carrasco@innet.lu
Cc: Drazen.Kacar@public.srce.hr, masinter@parc.xerox.com, misha.wolf@reuters.com, www-international@w3.org, unicode@unicode.org
Message-Id: <199702280955.KAA14208@jagor.srce.hr>

M.T. Carrasco Benitez wrote:
> > As far as I understand the situation, it would be nice to have that
> > information extracted and put in the header for the HEAD request.
> > I don't know which type of client could benefit from this, but perhaps
> > somebody else does.
> 
> I was thinking about robots: a robot that only looks for German docs, for
> example.  There are probably other applications.

Indexing engines download the whole page, because they have to build a
summary and extract links to other documents. Spiders issue HEAD requests
only to find out whether documents were recently changed or deleted. Neither
needs language information in the HEAD response.
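
To illustrate the spider case, here is a minimal Python sketch of a HEAD-based
freshness check. It is only an illustration under assumptions: the names
head() and classify() are hypothetical, and treating 404/410 as "deleted" and
a missing or unparsable Last-Modified as "refetch" are choices made for this
sketch, not something any particular spider is known to do.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def head(url):
    """Issue a HEAD request and return (status, headers); no body is read."""
    try:
        with urlopen(Request(url, method="HEAD")) as resp:
            return resp.status, dict(resp.headers.items())
    except HTTPError as err:
        # urlopen raises on 4xx/5xx; the status code is still useful here.
        return err.code, dict(err.headers.items())


def classify(status, headers, last_visit):
    """Decide what to do with an already-known document (hypothetical policy)."""
    if status in (404, 410):
        return "deleted"
    # Header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    modified = lowered.get("last-modified")
    if modified is None:
        return "refetch"  # nothing to compare against, so be conservative
    try:
        changed = parsedate_to_datetime(modified) > last_visit
    except (TypeError, ValueError):
        return "refetch"  # unparsable or zone-less date
    return "refetch" if changed else "unchanged"
```

A spider would call classify(*head(url), last_visit) with a timezone-aware
last_visit. Note that the decision uses only the status and Last-Modified;
a Content-Language header plays no part in it.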

> > Yes, but...
> > There are enough web admins who don't want to know anything about their
> > servers. That means the users can't put charset info in the header.
> > META tag is the only thing that remains, at least for text/html.
> > Disclaimer: just describing current practice...
> 
> The charset must be in the HTTP header and inside the doc in
> <META HTTP-EQUIV=Content-Type ...>.  This META should be as near to the
> beginning of the doc as possible, as this is a catch-22 situation: how does
> the program read the charset if it does not know the charset in the first
> place?

Well, since text/html without a charset parameter is Latin 1, anything before
META is Latin 1. Unless everything is what META says it is. :)
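
One way out of the catch-22 can be sketched in Python: read the start of the
document as Latin 1 (which cannot fail, since every byte maps to a character),
look for a charset in a META tag, then decode the whole document with whatever
was found. This is a rough sketch under assumptions: sniff_charset() and
decode_html() are hypothetical names, the 1024-byte prefix is arbitrary, the
pattern is deliberately loose, and letting the HTTP header win over META is an
assumption of this sketch.

```python
import re

# Loose pattern: any META tag carrying "charset=..." in one of its attributes.
META_CHARSET = re.compile(
    r"""<meta[^>]*charset\s*=\s*["']?([\w.:-]+)""",
    re.IGNORECASE,
)


def sniff_charset(body, header_charset=None, prefix=1024):
    """Pick a charset for an HTML byte string."""
    if header_charset:
        return header_charset.lower()  # assumption: the HTTP header wins
    # Everything before META is read as Latin 1, the text/html default.
    start = body[:prefix].decode("iso-8859-1")
    match = META_CHARSET.search(start)
    return match.group(1).lower() if match else "iso-8859-1"


def decode_html(body, header_charset=None):
    """Decode the whole document with the sniffed charset."""
    charset = sniff_charset(body, header_charset)
    try:
        return body.decode(charset, errors="replace")
    except LookupError:
        # Unknown charset label: fall back to the default.
        return body.decode("iso-8859-1")
```

This only works when the charset named in META is ASCII-compatible up to the
tag itself, which is why the tag needs to sit as near the beginning of the
document as possible.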

-- 
They work 24 hours a day and 256 days a year  --  root@fly.cc.fer.hr

dave@srce.hr
dave@fly.cc.fer.hr

Received on Friday, 28 February 1997 04:56:46 UTC