RE: [SumsaultRT #212] iso-8859-1 vs. utf-8

Yep.  I think I alluded to this further down in my message.  However,
I'd like to encourage the view that it's a much better plan to try to
find a utf-8 capable editor than to just fall back on the entities.

In my experience, many English speakers don't think this is a big deal,
but as Tristan said, it can seriously affect readability and
maintainability of the source in a language like French (not to mention
Chinese or Russian, or even Czech [see below]). As Tristan said, using
&eacute; is to be avoided if at all possible.

RI


Here's an example of Czech text where the accented characters are
written as numeric character references (NCRs).  It's almost impossible
to read.

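Jako efektivn&#x11B;j&#x161;&#xED; se n&#xE1;m jev&#xED;
po&#x159;&#xE1;d&#xE1;n&#xED; tzv. Road Show prost&#x159;ednictv&#xED;m
na&#x161;ich autorizovan&#xFD;ch dealer&#x16F; v &#x10C;ech&#xE1;ch a
na Morav&#x11B;, kter&#xE9; prob&#x11B;hnou v pr&#x16F;b&#x11B;hu
z&#xE1;&#x159;&#xED; a &#x159;&#xED;jna.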
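
To see how quickly NCRs swamp the text, here is a minimal Python sketch
that escapes a short phrase from the passage and then reverses the
escaping.  The helper name to_ncr is my own invention for illustration,
and the choice of hexadecimal NCRs is an assumption; decimal ones would
be just as valid.

```python
import html


def to_ncr(text: str) -> str:
    """Replace every non-ASCII character with a hexadecimal NCR.

    Hypothetical helper for illustration only; ASCII characters pass
    through untouched.
    """
    return "".join(ch if ord(ch) < 128 else f"&#x{ord(ch):X};" for ch in text)


readable = "pořádání tzv. Road Show"
escaped = to_ncr(readable)
print(escaped)  # po&#x159;&#xE1;d&#xE1;n&#xED; tzv. Road Show

# html.unescape (standard library) turns the NCRs back into characters.
assert html.unescape(escaped) == readable
```

An eight-letter word grows to nearly thirty characters of markup, which
is the maintainability problem in a nutshell.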



============
Richard Ishida
W3C

contact info: http://www.w3.org/People/Ishida/ 

http://www.w3.org/International/ 
http://www.w3.org/International/geo/ 

See the W3C Internationalization FAQ page
http://www.w3.org/International/questions.html



> -----Original Message-----
> From: Karl Dubost [mailto:karl@la-grange.net] 
> Sent: 25 September 2003 16:58
> To: ishida@w3.org
> Cc: public-evangelist@w3.org; 'Tristan Nitot'
> Subject: Re: [SumsaultRT #212] iso-8859-1 vs. utf-8
> 
> 
> 
> On Thursday, 25 Sep 2003, at 06:43 America/Montreal, Richard Ishida
> wrote:
> >>
> >> UTF-8 is quite universal, but you'll have to use html entities (such
> >> as "&eacute;" for "é") instead of accented (non-ascii) characters.
> >> This
> 
> >
> > Hmm.  I think you somehow have this the wrong way round.  UTF-8 means
> > you have no need to use character entities, since it covers the whole
> > Unicode repertoire.  As you say, it's because ISO 8859-1 only covers
> 
> :) Let's clear this up a bit. Both of you are right, in some contexts.
> 
> * If you have an editor (authoring tool) which can NOT input utf-8 in
> your text and you still want to use utf-8 for your document, you can
> use this low-tech method:
> 	é -> &eacute; for example
> That way you will have only us-ascii characters in your document, and
> us-ascii is a subset of utf-8.
> 
> * If you have an editor which can input utf-8, just type your accents.
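
Both routes can be checked at the byte level.  The following is only a
rough Python sketch; the sample word and variable names are my own
assumptions, not anything from Karl's message.  It shows that the
entity form has identical bytes in us-ascii, ISO 8859-1 and utf-8,
whereas the typed accent is stored differently in the two encodings.

```python
import html

# Illustrative sample: the French word "été" written both ways.
entity_form = "&eacute;t&eacute;"  # low-tech method: us-ascii characters only
typed_form = "été"                 # typed directly in a utf-8 capable editor

# ASCII-only text yields the same bytes in all three encodings.
assert (
    entity_form.encode("ascii")
    == entity_form.encode("utf-8")
    == entity_form.encode("iso-8859-1")
)

# The literal é is one byte in ISO 8859-1 but two bytes in UTF-8.
print(typed_form.encode("iso-8859-1"))  # b'\xe9t\xe9'
print(typed_form.encode("utf-8"))       # b'\xc3\xa9t\xc3\xa9'

# Once the entities are resolved, both forms give the same text.
assert html.unescape(entity_form) == typed_form
```

So the entity form is safe whatever encoding is declared, at the price
of readability, while the typed form depends on the declared encoding
matching the bytes actually saved.
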
> 
> 
> BTW, it would be good if someone on the mailing list made a list of
> all editors and their support of utf-8.
> 
> 
> 
> 

Received on Thursday, 25 September 2003 12:30:34 UTC