MORE TESTING

I do not think it is the server, because I just tested two more files.
One was created earlier and is called testz. The other, testzz, I
created just now in Mozilla using UTF-8.
I uploaded both to two different servers, and both
came out wrong.


Is my Mozilla corrupted?

http://tokyoprogressive.org/testz.html
http://tokyoprogressive.org.uk/testz.html

http://tokyoprogressive.org/testzz.html
http://tokyoprogressive.org.uk/testzz.html


Going to bed, it is midnight here.  Good night, and thanks.


__/__/__/__/__/__/__/__/__/__/
Paul Arenson

EMAIL
paul@tokyoprogressive.org

PHONE &VOICE MAIL
1-617-379-0761 (U.S.)
090-4173-3873 (Japan)
paularenson (Skype)
__/__/__/__/__/__/__/__/__/__/





On Nov 13, 2006, at 11:40 PM, Greg Swaney wrote:

> I did a lot of poking and changing character sets on your account  
> on Sunday and it never showed the characters how they were supposed
> to be shown. What did w3 say?
>
> Paul Arenson wrote:
>> Hi Greg
>> Further to my Sunday post about files I create in various  
>> encodings using Mozilla looking ok on my desktop but not on the  
>> server, I wrote to w3.org and they advised me, but it is way over  
>> my head.
>> What I am guessing is that files created by Expression Engine
>> are output in Unicode (UTF-8), and that something on the server
>> (the database?)
>> tells the server to do something to the encoding. Anyway, when I
>> create a UTF-8 file on my desktop, it is served differently on the
>> site.
>> I still use Expression Engine, but also use my own pages.
>> Maybe I should contact the guy who set up expression engine for me?
>> I am totally lost....though perhaps it is simple?
>> Thanks!
>> paul
>> See below from the web person --> public-evangelist@w3.org
>> thanks
>> Begin forwarded message:
>>> Resent-From: public-evangelist@w3.org
>>> From: Karl Dubost <karl@w3.org>
>>> Date: November 13, 2006 10:22:09 PM JST
>>> To: Paul Arenson <paul@tokyoprogressive.org>
>>> Cc: public-evangelist@w3.org
>>> Subject: Re: japanese encoding nightmare
>>>
>>>
>>>
>>> On 13 Nov 2006, at 10:50, Paul Arenson wrote:
>>>> UNSUCCESSFUL EXAMPLE (Looks ok on desktop but not on server)
>>>> http://tokyoprogressive.org/why.html
>>>>
>>>> CODE
>>>>  <meta content="text/html; charset=UTF-8" http-equiv="content-type">
>>>
>>> but this page is not in utf-8 but in shift-jis
>>>
>>> Either you have to save your page as utf-8 or to change the  
>>> encoding information to
>>> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=Shift_JIS">
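
The mismatch described above (a page labelled UTF-8 whose bytes are really
Shift_JIS) can be checked mechanically. What follows is a minimal Python
sketch, assuming the page bytes are already in hand. The helper name
`decodes_as` is hypothetical, and a clean strict decode only shows that the
bytes are consistent with a label, not that the label is the intended one.

```python
def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if data decodes cleanly under the named encoding."""
    try:
        data.decode(encoding, errors="strict")
        return True
    except (UnicodeDecodeError, LookupError):
        # LookupError covers encoding names Python does not know.
        return False


# Sample Japanese text saved as Shift_JIS (an illustrative stand-in
# for the real page, which is not reproduced here).
page = "日本語".encode("shift_jis")

print(decodes_as(page, "utf-8"))      # the label from the meta tag
print(decodes_as(page, "shift_jis"))  # the encoding the bytes are in
```

If the first check fails and the second passes, either the file needs to be
re-saved as UTF-8 or the declaration needs to name Shift_JIS.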
>>>
>>>
>>>> SUCCESSFUL EXAMPLE ONE (JAPANESE COMES OUT RIGHT)
>>>> http://www.tokyoprogressive.org/index/weblog/print/april-entries/
>>>
>>> Yes, the page is correctly utf-8. Not valid, but utf-8.
>>> http://validator.w3.org/check?uri=http%3A%2F%2Fwww.tokyoprogressive.org%2Findex%2Fweblog%2Fprint%2Fapril-entries%2F
>>>
>>>> This was made via EXPRESSION ENGINE
>>>>
>>>> I note I have both xml:lang and utf-8.
>>>
>>> xml:lang doesn't influence the display of the page. It is there,
>>> for example, to trigger the right accent when passing the text
>>> through a voice browser, to help translation engines (not sure
>>> they implement it though), or to help a spelling checker choose
>>> the right dictionary.
>>>
>>> I would recommend that you stick to utf-8, it would help to keep  
>>> consistency in the way you serve the pages.
>>>
>>> A cool plug-in that could be developed and added to LogValidator:
>>> http://www.w3.org/QA/Tools/LogValidator/
>>>
>>> Given a list of URIs, create a table with
>>> uri   server_encoding   meta_encoding    guessed_encoding
>>>
>>> Would someone on the list like to do that?
>>> http://www.w3.org/QA/Tools/LogValidator/Manual-Modules
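
One row of the table proposed above could be computed roughly as follows.
This is only a sketch in Python, not a LogValidator module: fetching is left
out, so the Content-Type header value and the body bytes are passed in. The
function names, the candidate list, and the example URI are all assumptions,
and the "guess" is a crude heuristic (first encoding that decodes cleanly).

```python
import re
from email.message import Message

CANDIDATES = ("utf-8", "shift_jis", "euc_jp")  # assumed candidate list


def header_charset(content_type):
    """Charset parameter of a Content-Type header value, or None."""
    if not content_type:
        return None
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()  # lower-cased by the library


def meta_charset(body: bytes):
    """Charset named in the first 1024 bytes of the markup, or None."""
    m = re.search(rb'charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)', body[:1024], re.I)
    return m.group(1).decode("ascii").lower() if m else None


def guess_charset(body: bytes):
    """First candidate under which the bytes decode cleanly, or None."""
    if b"\x1b$" in body:  # escape sequence that introduces ISO-2022-JP kanji
        return "iso-2022-jp"
    for enc in CANDIDATES:
        try:
            body.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None


def encoding_row(uri, content_type, body):
    """Return (uri, server_encoding, meta_encoding, guessed_encoding)."""
    return (uri, header_charset(content_type), meta_charset(body), guess_charset(body))


# Made-up page: labelled UTF-8 in the markup, served as ISO-8859-1,
# but actually saved as Shift_JIS.
body = (b'<meta content="text/html; charset=UTF-8" http-equiv="content-type">'
        + "日本語".encode("shift_jis"))
row = encoding_row("http://example.org/page.html",
                   "text/html; charset=ISO-8859-1", body)
print("uri\tserver_encoding\tmeta_encoding\tguessed_encoding")
print("\t".join(str(v) for v in row))
```

A row in which the three encoding columns disagree is the kind of page that
looks fine on the desktop and wrong once served.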
>>>
>>>
>>>
>>>> I THOUGHT I did this in UTF-8, but no.
>>>> Mozilla even says it is UTF-8, but as you can see the code is
>>>> western.
>>>> In other words, why does it work?
>>>
>>> Because browsers try to display broken pages (invalid, wrong
>>> encoding, etc.), people who develop Web pages do not know
>>> that they have done something wrong, and they do not fix it. IMHO
>>> it is a mistake on the browsers' part.
>>> It is cool to try to recover and display the page, but it is
>>> wrong to do silent recovery, because then we never enter a cycle that
>>> helps everyone fix things and have a better experience.
>>>
>>>> SUCCESSFUL EXAMPLE FOUR (most bizarre?)
>>>> I even forgot to add the meta tag!!!
>>>> http://tokyoprogressive.org/
>>>
>>> By default the server sends encoding information, which usually has
>>> priority over the information contained in the file.
>>> The encoding in a file is a guess, and the browser _should_
>>> follow what the server says.
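
The precedence described here matches HTML 4.01, which ranks the charset
parameter of the HTTP Content-Type header above a META declaration inside
the document. A small Python sketch of that ordering follows; the function
name `effective_charset` and the fallback value are assumptions, and real
browsers apply their own heuristics when neither source names a charset.

```python
from email.message import Message


def effective_charset(content_type, meta_charset, fallback="windows-1252"):
    """Pick the charset by precedence: HTTP header, then META, then fallback.

    The fallback is an assumption for illustration; browsers differ.
    """
    msg = Message()
    if content_type:
        msg["Content-Type"] = content_type
    # get_content_charset() returns None when no charset parameter is present.
    return msg.get_content_charset() or (meta_charset or "").lower() or fallback


# The header wins even when the file itself says something else.
print(effective_charset("text/html; charset=ISO-8859-1", "UTF-8"))
# With no charset in the header, the META declaration is used.
print(effective_charset("text/html", "UTF-8"))
```

This is why a page with no meta tag at all can still display correctly: the
header alone is enough when it happens to match the bytes.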
>>>
>>>
>>>> Make a page in several  encodings
>>>> http://tokyoprogressive.org/a.html
>>>> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
>>>> <html>
>>>> <head>
>>>>   <meta content="text/html; charset=ISO-2022-JP"
>>>> LOOKS OK ONLINE
>>>
>>> doesn't look ok for me.
>>>
>>> but your server is configured in a strange way
>>>
>>> GET /a.html HTTP/1.1[CRLF]
>>> Host: tokyoprogressive.org[CRLF]
>>> Connection: close[CRLF]
>>> Accept-Encoding: gzip[CRLF]
>>> Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5[CRLF]
>>> Accept-Language: fr,en;q=0.9,ja;q=0.9,de;q=0.8,es;q=0.7,it;q=0.7,nl;q=0.6,sv;q=0.5,nb;q=0.5,da;q=0.4,fi;q=0.3,pt;q=0.3,zh-Hans;q=0.2,zh-Hant;q=0.1,ko;q=0.1[CRLF]
>>> Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7[CRLF]
>>> User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-US; rv:1.8.0.7) Gecko/20060911 Camino/1.0.3 Web-Sniffer/1.0.24[CRLF]
>>> Referer: http://web-sniffer.net/[CRLF]
>>> [CRLF]
>>>
>>>
>>> Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7[CRLF]
>>>
>>> You serve first iso-8859-1 and then utf-8 and then anything.  
>>> Maybe one of the sources of your problems is there.
>>>
>>> 1. Convert all your pages to one encoding only: utf-8.
>>> 2. Change the configuration of your server to send only utf-8.
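
Step 1 above can be sketched in Python as follows, assuming the current
encoding of each page is known. The function name `to_utf8` is hypothetical.
It decodes the bytes, relabels the first charset declaration it finds, and
re-encodes as UTF-8; it does not touch the server configuration from step 2.

```python
import re


def to_utf8(data: bytes, source_encoding: str) -> bytes:
    """Re-encode page bytes as UTF-8 and relabel the first charset declaration."""
    # Raises UnicodeDecodeError if the bytes are not in source_encoding.
    text = data.decode(source_encoding)
    text = re.sub(
        r'(charset\s*=\s*["\']?)[A-Za-z0-9_.:-]+',
        r"\g<1>UTF-8",
        text,
        count=1,
        flags=re.IGNORECASE,
    )
    return text.encode("utf-8")


# Made-up Shift_JIS page with a matching declaration.
old = ('<meta content="text/html; charset=Shift_JIS" http-equiv="content-type">'
       '日本語').encode("shift_jis")
new = to_utf8(old, "shift_jis")
print(new.decode("utf-8"))
```

Decoding strictly before re-encoding means a page saved in some other
encoding fails loudly instead of being silently corrupted.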
>>>
>>>
>>>
>>>
>>>
>>> -- 
>>> Karl Dubost - http://www.w3.org/People/karl/
>>> W3C Conformance Manager, QA Activity Lead
>>>   QA Weblog - http://www.w3.org/QA/
>>>      *** Be Strict To Be Cool ***
>>>
>>>
>>>
>>>
>
> -- 
> Greg Swaney
> NEXCESS.NET Internet Solutions
> http://nexcess.net
> 304 1/2 S. State St.
> Ann Arbor, MI 48104
> 1.866.NEXCESS

Received on Monday, 13 November 2006 15:06:35 UTC