
Re: New FAQ: Removing UTF-8 BOM

From: Martin Duerst <duerst@w3.org>
Date: Thu, 06 Nov 2003 01:46:29 -0500
Message-Id: <4.2.0.58.J.20031106014325.05944348@localhost>
To: Tex Texin <tex@i18nguy.com>, Jungshik Shin <jshin@i18nl10n.com>
Cc: Deborah Cawkwell <deborah.cawkwell@bbc.co.uk>, public-i18n-geo@w3.org

At 09:56 03/11/05 -0500, Tex Texin wrote:

>Hi Jungshik,
>
>1) yes, utf-16 is pairs of bytes, utf-32 is quadruplets.
>2) yes, the characters will display differently, depending on encoding and
>font of the editor.
>Maybe we should use a graphic to show the mistreatment(s).

Yes, I think a screenshot or two would be good.


>3) For the faq we shouldn't use scripts that look "something like..." or have
>too many version dependencies. So we can't use the sed script.
>Also, thanks for pointing out the problem with the perl script in your other
>mail.
>If it is not safe and reliable we shouldn't put it in the faq at all.

I agree that version dependencies are a bad idea, but we should
try our best to make sure we have a script. A lot of people copy/paste
and use scripts, but they won't write a script by themselves.
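
For illustration, a minimal sketch of what a version-independent script
could look like, here in Python. It is not the script from the FAQ draft;
the function names are made up for this example, and it assumes the file
is small enough to read into memory and really is UTF-8:

```python
import codecs
import sys


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (bytes EF BB BF), if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def strip_bom_from_file(path: str) -> bool:
    """Rewrite the file in place without its BOM.

    Returns True if a BOM was found and removed, False otherwise.
    Hypothetical helper: reads the whole file into memory.
    """
    with open(path, "rb") as f:
        data = f.read()
    stripped = strip_utf8_bom(data)
    if len(stripped) == len(data):
        return False  # no BOM, leave the file untouched
    with open(path, "wb") as f:
        f.write(stripped)
    return True


if __name__ == "__main__":
    # Usage (assumed): python strip_bom.py file1.html file2.html ...
    for name in sys.argv[1:]:
        removed = strip_bom_from_file(name)
        print(f"{name}: {'BOM removed' if removed else 'no BOM found'}")
```

Working on raw bytes rather than decoded text keeps the rest of the file
byte-for-byte unchanged, which is probably what people copying the script
would expect.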

One more comment (for the original FAQ): I think the background section
should be shorter, or be moved to after the answer. People want to
see the answer to their question quickly.

Regards,   Martin.
Received on Thursday, 6 November 2003 06:57:02 GMT
