Re: For review: The byte-order mark (BOM) in HTML

"To communicate which byte order was in use, U+FEFF (the byte-order 
mark) was used at the start of the stream as magic number that is not 
logically part of the text the stream represents."

I'd say "...as a magic number...".


"You should also be aware that, although ASCII is a subset of UTF-8, a 
file that starts with a BOM is no longer ASCII-compatible."

As I think was already remarked on the list, the intended meaning of the 
phrase "ASCII-compatible" is not obvious.

I _think_ this refers to the (often desirable) property of UTF-8 that 
characters from the US-ASCII range are encoded in UTF-8 in a way that is 
byte-for-byte identical to US-ASCII encoding. I think it would be better 
to say that directly, somehow.

For example:

"UTF-8 without a BOM has the property that characters from the US-ASCII 
range are encoded byte-for-byte the same way as by the US-ASCII 
encoding. Adding a BOM inserts additional bytes, so this is no longer true."
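
To make the point concrete, here is a rough Python sketch of the property 
I mean. It is only an illustration, not a proposal for the article text: 
the sample string and the helper name decodes_as_ascii are my own 
inventions, and "utf-8-sig" is simply the name of Python's codec that 
writes a BOM on encoding.

```python
import codecs

# Sample text containing only US-ASCII characters (illustrative assumption).
text = "Hello, BOM"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")          # UTF-8 without a BOM
utf8_bom_bytes = text.encode("utf-8-sig")  # UTF-8 with a leading BOM

# Without a BOM, the UTF-8 bytes are identical to the US-ASCII bytes.
same_without_bom = utf8_bytes == ascii_bytes

# With a BOM, three extra bytes (EF BB BF) precede the text.
same_with_bom = utf8_bom_bytes == ascii_bytes
starts_with_bom = utf8_bom_bytes.startswith(codecs.BOM_UTF8)


def decodes_as_ascii(data: bytes) -> bool:
    """Hypothetical helper: True if a strict US-ASCII decoder accepts data."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


print(same_without_bom, same_with_bom, starts_with_bom)
print(decodes_as_ascii(utf8_bytes), decodes_as_ascii(utf8_bom_bytes))
```

The BOM bytes are all outside the US-ASCII range, so a strict US-ASCII 
consumer would reject the file with a BOM while accepting the one without.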

Received on Tuesday, 18 December 2012 18:40:02 UTC