Re: Proposed addition to Display problems caused by the UTF-8 BOM

At 00:59 07/07/26, Addison Phillips wrote:

>So I would tend to replace the bit above thusly:
>
>--
>Some applications, such as text editors, look for the BOM as a signature indicating the use of a Unicode encoding. These applications, such as Windows Notepad, will automatically add a UTF-8 BOM to any file you save as UTF-8 so that they can detect it later. Browsers, however, don't look for the BOM and Web pages always need to declare the character encoding explicitly at the top of the file or in the HTTP header, making a BOM unnecessary (and, as noted above, sometimes harmful).
>--

I think this is a good direction, but I'm a bit worried by
"such as text editors". This implies that all or most text editors
silently add a BOM, which is not true. I would change
"such as text editors" to "such as some text editors".

Also, the "Browsers, however," is a bit of a problem, because
it's written as a counterpoint to editors. So I'd rewrite that
part a bit, too.
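
To make the "signature" idea concrete, here is a minimal sketch in Python
of the kind of check such an application might do. The function names are
hypothetical, chosen for illustration only; this is not taken from Notepad
or from any browser.

```python
import codecs

# codecs.BOM_UTF8 is the three-byte sequence EF BB BF.

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the bytes without a leading UTF-8 BOM, if one is present."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data
```

A check like this only needs the first three bytes, which is why it can be
quicker than scanning a long file for non-ASCII characters.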

Regards,    Martin.


>Just a thought.
>
>Addison
>
>Richard Ishida wrote:
>> Chaps,
>> I propose to add the following paragraph to http://www.w3.org/International/questions/qa-utf8-bom in the section By the Way:
>> "Applications that look at the text to work out the character encoding can tell straight away that the text is encoded in UTF-8 if they find a BOM at the beginning.  This can save time if the only non-ASCII characters occur a long way down the file (such as a copyright symbol in text at the very end).  Web pages, however, ought to declare the character encoding explicitly at the top of the file or in the HTTP header, so a BOM should not be necessary."
>> Unless I hear any objections, I will make the change, unannounced, in a couple of days' time.
>> Cheers,
>> RI
>> 
>> ============
>> Richard Ishida
>> Internationalization Lead
>> W3C (World Wide Web Consortium)
>>  
>> http://www.w3.org/People/Ishida/
>> http://www.w3.org/International/
>> http://people.w3.org/rishida/blog/
>> http://www.flickr.com/photos/ishida/
>>  
>> 
>
>
>
>-- 
>Addison Phillips
>Globalization Architect -- Yahoo! Inc.
>Chair -- W3C Internationalization Core WG
>
>Internationalization is an architecture.
>It is not a feature.
>


#-#-#  Martin J. Du"rst, Assoc. Professor, Aoyama Gakuin University
#-#-#  http://www.sw.it.aoyama.ac.jp       mailto:duerst@it.aoyama.ac.jp     

Received on Friday, 27 July 2007 07:33:39 UTC