Re: Character encoding detection for external scripts from Bjoern Hoehrmann on 2009-09-03 (www-archive@w3.org from September 2009)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Fri, 04 Sep 2009 00:06:45 +0200
To: "Anne van Kesteren" <annevk@opera.com>
Cc: www-archive@w3.org
Message-ID: <spe0a5lthb2iit951a90q2p85uqhmlmarv@hive.bjoern.hoehrmann.de>

* Anne van Kesteren wrote:
>On Sun, 30 Aug 2009 07:34:07 +0200, Bjoern Hoehrmann <derhoermi@gmx.net>  
>wrote:
>> [...]
>
>Any chance you can upload the page used for testing as well?

It's not really meant for public consumption, but I could send it to you
offlist if you like. If you are asking because of the bogus results from
Opera: looking at it again, it seems I randomized the URLs of the HTML
host documents but not those of the external scripts, so Opera caches
the scripts and apparently also caches the encoding it detected for
them. So you get

  HTML in ISO-8859-2 + JS without label -> iso-8859-2 (put into cache)
  HTML in ISO-8859-5 + JS without label -> iso-8859-2 (got from cache)
  HTML in UTF-8      + JS without label -> iso-8859-2 (got from cache)

Problems of this kind are one of the reasons why it is such a bad idea to
make the encoding of one resource dependent on the encoding of another.
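
A rough sketch of what I assume is going on, in Python. The class and
function names are made up for illustration, and falling back to the
host document's encoding for an unlabelled script is my guess at the
behaviour, not Opera's actual implementation:

```python
class ScriptCache:
    """Hypothetical cache that remembers the encoding chosen per script URL."""

    def __init__(self):
        self._encoding_by_url = {}

    def encoding_for(self, script_url, declared_encoding, document_encoding):
        # An explicit label on the script always wins.
        if declared_encoding is not None:
            return declared_encoding
        # Assumed behaviour: a cached script keeps the encoding that was
        # chosen the first time it was loaded, whatever document asks now.
        if script_url in self._encoding_by_url:
            return self._encoding_by_url[script_url]
        # First load without a label: inherit from the host document
        # and remember that choice in the cache.
        self._encoding_by_url[script_url] = document_encoding
        return document_encoding


def run(script_urls):
    """Load one unlabelled script from each of three host documents."""
    cache = ScriptCache()
    document_encodings = ["iso-8859-2", "iso-8859-5", "utf-8"]
    return [
        cache.encoding_for(url, None, doc_encoding)
        for url, doc_encoding in zip(script_urls, document_encodings)
    ]


# Same script URL every time, as in my test page.
same_url = run(["/test.js"] * 3)

# Script URL randomized per request, so the cache never hits.
random_urls = run(["/test.js?r=%d" % i for i in range(3)])

print(same_url)
print(random_urls)
```

With the same script URL all three loads report iso-8859-2; with
randomized script URLs each load follows its host document instead.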
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/

Received on Thursday, 3 September 2009 22:07:31 UTC