- From: Jungshik Shin <jshin@i18nl10n.com>
- Date: Thu, 6 Nov 2003 00:41:39 +0900 (KST)
- To: Tex Texin <tex@i18nguy.com>
- Cc: public-i18n-geo@w3.org
On Wed, 5 Nov 2003, Tex Texin wrote:
> 2) yes, the characters will display differently, depending on encoding and font
> of the editor.
> Maybe we should use a graphic to show the mistreatment(s).
> Also, they are being mistreated as characters, but we should refer to them as
> bytes since they are not representing characters.
Yeah, it may be a good idea to give a couple (or a few) images with
misidentified encodings.
> 3) For the faq we shouldn't use scripts that look "something like..." or have
> too many version dependencies. So we can't use the sed script.
I should have given a complete version, but I guess not many people
would be interested in it. So, let's forget about it as you suggested.
> If it is not safe and reliable we shouldn't put it in the faq at all.
With Perl 5.8 already widely deployed, the following
is safe and reliable. I wrote about Perl 5.6 (or earlier) just for the
sake of completeness.
perl -p -i.bak -e '(1 == $.) && s/^\x{FEFF}//' filename
With '-i.bak' instead of '-i', the original will be backed up
as filename.bak (Martin used '-i~'.)
Jungshik
Received on Wednesday, 5 November 2003 10:44:54 UTC