- From: Irene Vatton <Irene.Vatton@inrialpes.fr>
- Date: Fri, 17 Feb 2006 17:25:59 +0100
- To: Michael Teichgräber <mt@wmipf.in-berlin.de>
- Cc: www-amaya@w3.org
On Wednesday 15 February 2006 14:20, Michael Teichgräber wrote:
> Hi,
>
> when loading some html files, just converted to utf-8, into Amaya I
> realized that non-ascii characters were displayed broken (i.e. as if
> watched in iso-8859-1 mode), and the Document Information window said
> "Charset: Unknown".
>
> The pages were sent to Amaya by a CGI program with a Content-Type of
> "text/html; charset=utf-8", so I wondered what could be wrong.
>
> It seems to be caused by the existence of another header
> "Cache-Control: no-cache", that is sent by the CGI program too. If
> that header is present, the charset information seems to be ignored,
> even if Amaya's cache is disabled. If, however, I leave this header
> out, Amaya recognizes the character coding as utf-8, and the display
> is fine.
>
> I suppose the charset info shouldn't be discarded in that case.
> Perhaps it is not available anymore after the `no-cache' information
> has been processed? Both Amaya 8.5 and 9.4 show this behaviour.
>
> Regards,
>
> Michael

Amaya implements the standard policy:
1. Charset given in HTTP header
2. If not defined Charset given in xml declaration
3. If not defined Charset given in meta
4. If not defined use default Charset (iso-8859-1 for HTML documents,
   utf-8 for XHTML documents).

If the document is declared XHTML and is served with text/html, the result
could depend on the Web client.
We know that some browsers try to guess the real document encoding. Amaya
doesn't do that, as its goal is to help people to generate correct pages.

-- Irène.
-----
Irène Vatton                     INRIA Rhône-Alpes
INRIA ZIRST                      e-mail: Irene.Vatton@inria.fr
655 avenue de l'Europe           Tel.: +33 4 76 61 53 61
Montbonnot                       Fax:  +33 4 76 61 52 07
38334 Saint Ismier Cedex - France
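
The four-step precedence described in the reply can be sketched roughly as follows. This is only an illustration in Python, not Amaya's actual code: the function name `resolve_charset`, its parameters, and the regular expressions are all assumptions made for the example, and only the first 1024 bytes of the body are inspected.

```python
import re

# Characters allowed in a charset label for this sketch (assumption).
_LABEL = r'([\w.:-]+)'


def resolve_charset(http_content_type, body, is_xhtml):
    """Return (charset, source) using the precedence from the message.

    http_content_type: value of the Content-Type header, or None.
    body: the document as bytes.
    is_xhtml: True if the document is treated as XHTML.
    All three parameters are hypothetical names for this sketch.
    """
    # 1. Charset given in the HTTP header.
    if http_content_type:
        m = re.search(r'charset\s*=\s*"?' + _LABEL, http_content_type, re.I)
        if m:
            return m.group(1).lower(), "http"

    # Only look at the start of the document for declarations.
    head = body[:1024].decode("ascii", errors="replace")

    # 2. Otherwise, encoding given in the XML declaration.
    m = re.match(r'\s*<\?xml[^>]*\bencoding\s*=\s*["\']' + _LABEL, head)
    if m:
        return m.group(1).lower(), "xml"

    # 3. Otherwise, charset given in a meta element.
    m = re.search(r'<meta[^>]*\bcharset\s*=\s*["\']?' + _LABEL, head, re.I)
    if m:
        return m.group(1).lower(), "meta"

    # 4. Otherwise, the default: utf-8 for XHTML, iso-8859-1 for HTML.
    return ("utf-8" if is_xhtml else "iso-8859-1"), "default"
```

With a header of `text/html; charset=utf-8`, step 1 should already decide the result, so under this policy an additional `Cache-Control` header would not be expected to change the detected charset.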
Received on Friday, 17 February 2006 16:27:26 UTC