RE: [SumsaultRT #212] iso-8859-1 vs. utf-8

There are similar utilities for Windows users.  Uniconv is one that
comes to mind.

There's also a simple trick that tends to be easily available to most
Windows users but in my experience not well known: use MS Word to read
in the file as **encoded text** (not Word format !!), then save it back
out again. You are offered a very large choice of encodings to save it
in, one of which is utf-8.

Note that you can do a similar thing with Notepad on XP, but it's not a
good idea since it adds a byte order mark to the beginning of a utf-8
file which then causes problems in some browsers.

(Of course, it's much easier just to edit in or save out as utf-8 from
your editor ;)

hth
RI

============
Richard Ishida
W3C

contact info: http://www.w3.org/People/Ishida/ 

http://www.w3.org/International/ 
http://www.w3.org/International/geo/ 

See the W3C Internationalization FAQ page
http://www.w3.org/International/questions.html



> -----Original Message-----
> From: public-evangelist-request@w3.org 
> [mailto:public-evangelist-request@w3.org] On Behalf Of Karl Dubost
> Sent: 25 September 2003 18:39
> To: public-evangelist@w3.org
> Subject: Re: [SumsaultRT #212] iso-8859-1 vs. utf-8
> 
> 
> 
> 
> Le jeudi, 25 sep 2003, à 12:29 America/Montreal, Richard 
> Ishida a écrit 
> :
> > Yep.  I think I alluded to this further down in my message. 
>   However,
> > I'd like to encourage the mode of thought that it's a much 
> better plan 
> > to try and find a utf-8 capable editor than to just fall 
> back on the 
> > entities.
> 
> Another solution:
> 
> * iconv
> iconv is a small unix program (works on Linux, BSD, Mac OS X) 
> http://www.gnu.org/software/libiconv/
> 
http://www.gnu.org/software/libiconv/documentation/libiconv/iconv.1.html

For example to convert from iso-8859-1 to utf-8

karl% iconv -f ISO-8859-1 -t UTF-8 fileold.html > filenew.html

If you want to convert a lot of files, create a small scripts like that.

==================
#!/bin/sh
#
# Shell Script
# to convert a file from Latin 1 to utf-8

WEBDIR=`find  /home/mywebsites/files/ -name '*.html'`

echo "********** Converting File *******"
for i in $WEBDIR;
do
     iconv -f ISO-8859-1 -t UTF-8 $i > $i.new
     mv $i.new $i
done
==================

--
Karl Dubost - http://www.w3.org/People/karl/
W3C Conformance Manager
*** Be Strict To Be Cool ***

Received on Thursday, 25 September 2003 13:55:24 UTC