- From: Richard Ishida <ishida@w3.org>
- Date: Thu, 25 Sep 2003 11:43:28 +0100
- To: "'Tristan Nitot'" <tristan@nitot.com>, <public-evangelist@w3.org>
> -----Original Message----- > From: public-evangelist-request@w3.org > [mailto:public-evangelist-request@w3.org] On Behalf Of Tristan Nitot > Sent: 25 September 2003 07:51 > To: public-evangelist@w3.org > Subject: Re: [SumsaultRT #212] iso-8859-1 vs. utf-8 > > > > > > Mark Stosberg wrote: > > >>>provided that you serve them as text/html (but this applies > >>>trivially to application/xhtml+xml), you can add (or amend an > >>>existing entry) the following directive: > >>>AddType text/html;charset=iso-8859-1 html > >>> > >>> > > > >Thanks for the responses. So both "iso-8859-1" and "utf-8" have been > >recommended for us as the default character encoding type. > Is there a > >particular rule of thumb about which to use, or is the > central issue: > >"either one is better than none"? > > > > > each directive has good and bad side. > > UTF-8 is quite universal, but you'll have to use html > entities (such as > "é" for "é") instead of accented (non-ascii) characters. This > makes accented text painful to read and edit. > using 8859-1 allows you to insert accented characters in your html, > provided they belong to the latin-1 charset . This is the Hmm. I think you somehow have this the wrong way round. UTF-8 means you have no need to use character entities, since it covers the whole Unicode repertoire. As you say, its because ISO 8859-1 only covers characters used in Western European languages (plus a few other simple Latin langs), that if you wanted to represent, say, the Czech c caron, then you'd have to use a Numeric Character Reference or entity. For an example, see http://people.w3.org/rishida/scripts/samples/wrapping-no-dhtml.html (10 scripts on one page and no NCRs or entities other than &) You do, of course, need an editor that supports utf-8, but that's not difficult to find these days. You are right, though, when you say that it's better not to use character entities, but rather actual codes. > solution that > I use, because I write in French and in English only. If I > wanted to use > characters outside of the iso-8859-1 charset, (e.g. for > czech) I'd have > to use html entities or switch to UTF-8. > > > --Tristan > PS: I'm not an I18n guru, so feel free to correct me if I'm wrong. > > -- > Contributeur Mozilla et OpenWebGroup > http://mozilla.org/ : Efficiency, safety and liberty for browsing. > http://openweb.eu.org/ : pour apprendre les standards. > http://standblog.com/ : un blog sur les standards. > http://pompage.net/ : de saines lectures à propos des standards. ============ Richard Ishida W3C contact info: http://www.w3.org/People/Ishida/ http://www.w3.org/International/ http://www.w3.org/International/geo/ See the W3C Internationalization FAQ page http://www.w3.org/International/questions.html
Received on Thursday, 25 September 2003 06:44:09 UTC