Re: Strange advice re BOM and UTF-8

Hi Chris,

On Dec 6, 2006, at 23:35 , Chris Lilley wrote:
> I was surprised to see, on the W3C DTD validator, the following  
> advice:
>
>   The Unicode Byte-Order Mark (BOM) in UTF-8 encoded files is known to
>   cause problems for some text editors and older browsers. You may
>   want to consider avoiding its use until it is better supported.
>
> This is odd because the use of a BOM with UTF-8 files is
>
> a) standards compliant, to Unicode and to XML and to CSS
> b) common practice
> c) allows text editors to auto-detect the encoding of a plain text
> document.
>
> I believe therefore that the advice is incorrect and indeed
> potentially damaging.

I am not an expert, so all my knowledge about UTF-8 with a BOM comes
from hearsay and some documentation I have read. The picture I had so
far was that the BOM for UTF-8 is not really necessary (it is only a
signature, not an indication of byte order, isn't it?), and indeed
sometimes (although perhaps more and more rarely) harmful because of
implementations that do not understand the mark.
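
To make the "signature" point concrete, here is a minimal Python sketch
of what I understand the situation to be. The helper name
strip_utf8_bom and the sample bytes are made up for illustration; only
codecs.BOM_UTF8 and the "utf-8-sig" codec come from the Python standard
library. It shows that the UTF-8 BOM is always the same three bytes
(EF BB BF) and that a decoder which does not know about it keeps U+FEFF
as a stray first character.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 signature (EF BB BF), if present.

    Hypothetical helper, for illustration only.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Made-up sample documents: the same text with and without the signature.
without_bom = "<?xml version='1.0'?>".encode("utf-8")
with_bom = codecs.BOM_UTF8 + without_bom

# A decoder that treats the input as plain UTF-8 keeps the mark as U+FEFF.
naive = with_bom.decode("utf-8")

# The "utf-8-sig" codec drops the signature when it is present.
aware = with_bom.decode("utf-8-sig")
```

In this sketch, naive starts with U+FEFF, which is the kind of leftover
character that an implementation unaware of the mark may end up
displaying or choking on, while aware and strip_utf8_bom(with_bom) both
give back the document without it.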

Docs I know include:
http://www.w3.org/International/questions/qa-utf8-bom
http://unicode.org/unicode/faq/utf_bom.html#BOM
Both seem to point towards cautious usage of a BOM for UTF-8, or no
usage at all.

Do you have other references worth reading on the topic?

Thank you.

-- 
olivier

Received on Wednesday, 6 December 2006 15:09:45 UTC