- From: Karl Dubost <karl@w3.org>
- Date: Mon, 13 Nov 2006 22:22:09 +0900
- To: Paul Arenson <paul@tokyoprogressive.org>
- Cc: public-evangelist@w3.org
On 13 Nov 2006, at 10:50, Paul Arenson wrote:
> UNSUCCESSFUL EXAMPLE (Looks ok on desktop but not on server)
> http://tokyoprogressive.org/why.html
>
> CODE
> <meta content="text/html; charset=UTF-8" http-equiv="content-type">
This page is not in utf-8, though; it is in shift-jis.
Either save your page as utf-8, or change the encoding
information to
<META HTTP-EQUIV="Content-Type" CONTENT="text/html;charset=Shift_JIS">
> SUCCESSFUL EXAMPLE ONE (JAPANESE COMES OUT RIGHT)
> http://www.tokyoprogressive.org/index/weblog/print/april-entries/
Yes, this page really is utf-8. Not valid, but utf-8.
http://validator.w3.org/check?uri=http%3A%2F%2Fwww.tokyoprogressive.org%2Findex%2Fweblog%2Fprint%2Fapril-entries%2F
> This was made via EXPRESSION ENGINE
>
> I note I have both xml: lang and uft-8.
xml:lang doesn't influence the display of the page. It is there, for
example, to trigger the right accent when the text is passed through
a voice browser, to help translation engines (not sure they
implement it though), or to help a spelling checker choose the right
dictionary.
I would recommend that you stick to utf-8; it would help keep
the way you serve your pages consistent.
A cool plug-in could be developed and added to LogValidator.
http://www.w3.org/QA/Tools/LogValidator/
Given a list of URIs, it would create a table with the columns
uri server_encoding meta_encoding guessed_encoding
Would someone on the list like to do that?
http://www.w3.org/QA/Tools/LogValidator/Manual-Modules
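To make the idea of that table concrete, here is a rough sketch in Python (LogValidator itself is written in Perl, so this is not a LogValidator module). All function names are made up for illustration, the guessing step is a crude "first candidate that decodes cleanly" heuristic limited to a few Japanese encodings, and fetch() is only one assumed way of getting the header and the bytes.

```python
import re
from urllib.request import urlopen

def header_charset(content_type):
    """Charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None

def meta_charset(body):
    """Charset declared in a <meta> tag near the top of the page, or None."""
    head = body[:2048].decode("ascii", errors="replace")
    match = re.search(r'<meta[^>]*charset\s*=\s*["\']?([^\s;"\'>]+)',
                      head, re.IGNORECASE)
    return match.group(1).lower() if match else None

# Assumption: only these encodings are tried, in this order.
CANDIDATES = ("utf-8", "shift_jis", "euc-jp")

def guessed_charset(body):
    """Crude guess: the first candidate that decodes the bytes without error."""
    candidates = CANDIDATES
    if b"\x1b" in body:
        # ISO-2022-JP is 7-bit and switches character sets with ESC sequences.
        candidates = ("iso-2022-jp",) + CANDIDATES
    for name in candidates:
        try:
            body.decode(name)
        except UnicodeDecodeError:
            continue
        return name
    return None

def encoding_row(uri, content_type, body):
    """One table row: uri, server_encoding, meta_encoding, guessed_encoding."""
    return (uri, header_charset(content_type), meta_charset(body),
            guessed_charset(body))

def format_table(rows):
    """Space-separated table with the column names from the message."""
    lines = ["uri server_encoding meta_encoding guessed_encoding"]
    for row in rows:
        lines.append(" ".join(cell if cell else "-" for cell in row))
    return "\n".join(lines)

def fetch(uri):
    """Return (Content-Type header value, raw body bytes) for a URI."""
    with urlopen(uri) as response:
        return response.headers.get("Content-Type"), response.read()
```

With a list of URIs, something like format_table(encoding_row(u, *fetch(u)) for u in uris) would print the table. Rows where the three encoding columns disagree are the pages worth looking at.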
> I THOUGHT I did this in UFT-8, but no.
> Mozilla even says it is UFT-8, but as you can see the code is
> western.
> In other words, why does it work?
Because browsers try to display broken pages (invalid, wrong
encoding, etc.), people who develop Web pages do not know that
they have done something wrong, and they do not fix it. IMHO it is a
mistake on the browsers' part.
It is cool to try to recover and display the page, but it is wrong to
do silent recovery, because we never enter a cycle which helps everyone
to fix things and have a better experience.
> SUCCESSUL EXAMPLE FOUR (most bizarre?)
> I even forgot to add the meta tag!!!
> http://tokyoprogressive.org/
By default the server sends encoding information, which usually has
priority over the information contained in the file.
The encoding declared in a file is only a hint, and the browser _should_ follow
what the server says.
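That precedence can be sketched in a few lines of Python. This is only an illustration of the rule as described here (HTTP header first, meta declaration second), not what any particular browser does; the function names are invented, and real browsers add their own guessing on top.

```python
import re

def header_charset(content_type):
    """Charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    match = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type, re.IGNORECASE)
    return match.group(1).lower() if match else None

def meta_charset(body):
    """Charset declared in a <meta> tag near the top of the page, or None."""
    head = body[:2048].decode("ascii", errors="replace")
    match = re.search(r'<meta[^>]*charset\s*=\s*["\']?([^\s;"\'>]+)',
                      head, re.IGNORECASE)
    return match.group(1).lower() if match else None

def effective_charset(content_type, body):
    """Return (charset, source): what the server says wins over the file."""
    from_header = header_charset(content_type)
    if from_header:
        return from_header, "http-header"
    from_meta = meta_charset(body)
    if from_meta:
        return from_meta, "meta"
    # Nothing declared anywhere: the browser is left to guess.
    return None, "undeclared"
```

A page without any meta tag can therefore still display correctly, as long as the charset in the server's Content-Type header happens to match the bytes in the file.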
> Make a page in several encodings
> http://tokyoprogressive.org/a.html
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
> <html>
> <head>
> <meta content="text/html; charset=ISO-2022-JP"
> LOOKS OK ONLINE
It doesn't look ok for me.
But your server seems to be configured in a strange way:
GET /a.html HTTP/1.1[CRLF]
Host: tokyoprogressive.org[CRLF]
Connection: close[CRLF]
Accept-Encoding: gzip[CRLF]
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[CRLF]
Accept-Language: fr,en;q=0.9,ja;q=0.9,de;q=0.8,es;q=0.7,it;q=0.7,nl;q=0.6,sv;q=0.5,nb;q=0.5,da;q=0.4,fi;q=0.3,pt;q=0.3,zh-Hans;q=0.2,zh-Hant;q=0.1,ko;q=0.1[CRLF]
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7[CRLF]
User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.0.7) Gecko/20060911 Camino/1.0.3 Web-Sniffer/1.0.24[CRLF]
Referer: http://web-sniffer.net/[CRLF]
[CRLF]
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7[CRLF]
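Note that Accept-Charset is sent by the client: it asks for iso-8859-1
first, then utf-8, and then anything. What your server answers in the
charset of its Content-Type header is what counts, and maybe one of the
sources of your problems is there.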
1. Convert all your pages to one encoding only: utf-8.
2. Change the configuration of your server to send only utf-8.
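For step 1, a minimal Python sketch of the conversion could look like the following. It assumes you already know the real encoding of each file (for example shift_jis for why.html); the function names are invented for the example, and the regular expression only handles a simple meta declaration.

```python
import re
from pathlib import Path

# Matches the charset value inside a <meta ... charset=...> declaration.
META_CHARSET = re.compile(r'(<meta[^>]*charset\s*=\s*["\']?)([^\s;"\'>]+)',
                          re.IGNORECASE)

def convert_page(data, source_encoding):
    """Decode bytes from source_encoding, fix the meta tag, re-encode as utf-8."""
    # Raises UnicodeDecodeError if source_encoding is not the real encoding.
    text = data.decode(source_encoding)
    text = META_CHARSET.sub(lambda m: m.group(1) + "utf-8", text)
    return text.encode("utf-8")

def convert_file(path, source_encoding):
    """Rewrite one file in place as utf-8 (keep a backup before trying this)."""
    file_path = Path(path)
    file_path.write_bytes(convert_page(file_path.read_bytes(), source_encoding))
```

Step 2 depends on the server software, which is not known here. If it happens to be Apache (an assumption), the AddDefaultCharset directive is the usual place to set the charset sent in the Content-Type header.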
--
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager, QA Activity Lead
QA Weblog - http://www.w3.org/QA/
*** Be Strict To Be Cool ***
Received on Monday, 13 November 2006 13:22:39 UTC