- From: Martin Duerst <duerst@it.aoyama.ac.jp>
- Date: Mon, 27 Nov 2006 16:15:16 +0900
- To: "Jukka K. Korpela" <jkorpela@cs.tut.fi>, "'Unicode'" <unicode@unicode.org>
- Cc: www-international@w3.org
Hello Jukka,

Many thanks for your detailed checks. The page has been fixed in the meantime, but some comments below.

At 17:17 06/11/23, Jukka K. Korpela wrote:
>On Wed, 22 Nov 2006, Martin Duerst wrote:
>> Yes. The W3C site has quite a lot of these, too, even if they are
>> fortunately usually limited to single characters such as the copyright
>> sign. Here's an example:
>> http://www.w3.org/2001/Annotea/User/Papers.html
>
>That page is a somewhat different case. There's more than the copyright
>sign that is wrong there, namely the registered sign and two occurrences
>of e with acute (in the name "José"), too. Moreover, the page says
>  <?xml version="1.0" encoding="iso-8859-1"?>
>_and_
>  <meta http-equiv="content-type"
>        content="application/xhtml+xml; charset=UTF-8" />
>but what really matters is the HTTP header
>  Content-Type: text/html; charset=iso-8859-1
>
>If you manually change the encoding used by a browser to UTF-8, the és
>become right and the two other non-ASCII characters become a little less
>obscured by extra characters before them. There _is_ a "double UTF-8"
>involved, too, but the primary problem is that the declared encoding
>is not the one actually used on the page.

Well, put another way, that page is on its way to yet more "double UTF-8"
encoding. It gets downloaded as iso-8859-1 and uploaded as UTF-8. Every
time that's done, potentially another layer of "double UTF-8" is added
(or, to be precise, we move from "double UTF-8" to "triple UTF-8", and
so on). If different parts have been added at different stages, then
they will be more or less over-encoded.

Regards,    Martin.

#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp
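The download/upload round trip described above can be sketched in a few lines of Python (an illustration added for clarity, not part of the original mail): each cycle decodes the page's UTF-8 bytes as iso-8859-1 and re-encodes the result as UTF-8, adding one layer of over-encoding.

```python
# Sketch of the "double UTF-8" round trip: a page stored as UTF-8 is
# downloaded under a declared charset of iso-8859-1, then re-uploaded
# as UTF-8. Each cycle adds one layer of over-encoding.

def reupload_as_utf8(raw: bytes) -> bytes:
    """Bytes misread as iso-8859-1 on download, re-saved as UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")

single = "é".encode("utf-8")       # b'\xc3\xa9' -- correct UTF-8
double = reupload_as_utf8(single)  # b'\xc3\x83\xc2\xa9' -- "double UTF-8"
triple = reupload_as_utf8(double)  # one more cycle -- "triple UTF-8"

print(double.decode("utf-8"))      # Ã© -- what the browser now shows
print(len(single), len(double), len(triple))  # 2 4 8 -- bytes double each cycle
```

Viewed as UTF-8, the double-encoded é renders as "Ã©", which matches the extra characters Jukka observed before the non-ASCII characters on the page.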
Received on Monday, 27 November 2006 23:03:32 UTC