
RE: Re: Mojibake on my Web pages

From: Richard Ishida <ishida@w3.org>
Date: Tue, 30 Sep 2003 08:28:17 +0100
To: "'Asmus Freytag'" <asmusf@ix.netcom.com>, <www-international@w3.org>
Cc: <dewell@adelphia.net>
Message-ID: <001e01c38724$6aedd610$6401a8c0@w3c40upc3ma3j2>

Doug, I'm wondering whether your company uses Apache servers and just
upgraded them.  Apparently the latest upgrade serves all files by
default as ISO 8859-1 unless otherwise specified - which may account for
your mojibake.
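
The effect can be reproduced offline. Below is a minimal Python sketch, assuming the browser treats the ISO 8859-1 label as windows-1252 (as browsers commonly do); the helper name misdecode is made up for illustration and is not part of any library.

```python
def misdecode(text, declared="cp1252"):
    """Encode text as UTF-8, then decode the bytes using the wrongly declared charset."""
    return text.encode("utf-8").decode(declared)


bullet = "\u2022"  # U+2022 BULLET, three bytes in UTF-8
garbled = misdecode(bullet)

# Show the code points of the garbled text (one per UTF-8 byte).
print([hex(ord(c)) for c in garbled])

# Reversing the mistake recovers the original character.
restored = garbled.encode("cp1252").decode("utf-8")
print(restored == bullet)
```

One character turns into three because each byte of the UTF-8 sequence is read as a separate single-byte character.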

I'm no expert on these matters, but I believe that on an Apache server
one can set the default to utf-8 for all files.  Alternatively, you can
serve files in a given directory and its subdirectories as utf-8 (or any
other encoding) by adding a file called .htaccess to the directory in
question.  The .htaccess file should contain

AddType 'text/html; charset=UTF-8' html

This tells the server to serve every file with an html extension as
text/html with the encoding utf-8.
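
To confirm which encoding a server declares, one can look at the charset parameter of the Content-Type header it sends. Here is a small Python sketch using only the standard library; the function name declared_charset is hypothetical, and the header values are sample strings rather than output captured from a real server.

```python
from email.message import Message


def declared_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the charset in lower case, or None.
    return msg.get_content_charset()


print(declared_charset("text/html; charset=UTF-8"))
print(declared_charset("text/html"))
```

Against a live server, the headers object returned by urllib.request.urlopen(url) offers the same get_content_charset() method, so the check could be run on the real response instead of a sample string.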

You can do this on a file by file basis too. For example, if there was a
file in the directory you wanted to serve as iso-8859-1, you could add
the following to the .htaccess file.

<Files ~ "Overview\.html">
ForceType 'text/html; charset=ISO-8859-1'
</Files>

For additional information, and other servers, see

Hope this is of some help,

Richard Ishida

contact info: http://www.w3.org/People/Ishida/ 


See the W3C Internationalization FAQ page

> -----Original Message-----
> From: www-international-request@w3.org 
> [mailto:www-international-request@w3.org] On Behalf Of Asmus Freytag
> Sent: 24 September 2003 18:19
> To: www-international@w3.org
> Subject: Fwd: Re: Mojibake on my Web pages
> This issue has been raised on the unicode@unicode.org list.
> A./
> >From: "Doug Ewell" <dewell@adelphia.net>
> >To: "Unicode Mailing List" <unicode@unicode.org>
> >Subject: Re: Mojibake on my Web pages
> >Date: Wed, 24 Sep 2003 08:32:42 -0700
> >
> >Stefan Persson <alsjebegrijptwatikbedoel at yahoo dot se> wrote:
> >
> > > Is there no way to force the browsers to use the encoding as
> > > specified in the documents instead of that specified by the
> > > server?  I'm having this problem myself with a different server,
> > > and would like to find a solution to it.
> >
> >I can always visit View | Encoding and change the setting to UTF-8
> >on a one-time basis.  But as soon as the page is refreshed, it
> >reverts to whatever the server specifies.
> >
> >I don't know if there's a way to teach IE that a given URL should
> >*always* be overridden to UTF-8, but even if there was, that would
> >only help me and those who know the secret.  It should work for
> >everybody.
> >
> > > It is very irritating that the HTTP header overrules the <meta>
> > > tag, since it seems that the error is more often in the HTTP
> > > header than in the <meta> tag.
> >
> >Indeed.  You'd think if the author (or software) included a <meta>
> >tag AND an explicit declaration in the XML header, he (or it) knew
> >what he (or it) was doing and the tag(s) should be honored.
> >
> >Apologies to the list if this is getting OT.
> >
> >-Doug Ewell
> >  Fullerton, California
> >  http://users.adelphia.net/~dewell/
> ---------------------------------------------
> >To: "Unicode Mailing List" <unicode@unicode.org>
> >Subject: Mojibake on my Web pages
> >
> >Apologies in advance to anyone who visits my Web site and sees
> >garbage characters, a.k.a. "mojibake."  It isn't my fault.
> >
> >Adelphia is currently having a character-set problem with their HTTP
> >servers.  Apparently they are serving all pages as ISO 8859-1 even if
> >they are marked as being encoded in another character set, such as
> >UTF-8.  So, instead of seeing U+2022 BULLET on my page, for example,
> >you'll see:
> >
> >     •
> >
> >If you manually change the encoding in your browser to UTF-8, or
> >download the page and display it as a local file, everything looks
> >fine because Adelphia's server is no longer calling the shots.  Their
> >tech support people acknowledge that the problem is at their end and
> >said they would look into it.
> >
> >I understand that having the "Unicode Encoded" logo on my page next
> >to these garbage characters may not reflect well on Unicode,
> >especially to newbies.  I'm considering putting a disclaimer at the
> >top of my pages, but I'm waiting to see how quickly they solve the
> >problem.
> >
> >-Doug Ewell
> >  Fullerton, California
> >  http://users.adelphia.net/~dewell/
Received on Tuesday, 30 September 2003 03:28:40 UTC
