- From: Albert Lunde <atlunde@panix.com>
- Date: Tue, 18 Dec 2012 12:39:43 -0600
- To: www International <www-international@w3.org>
"To communicate which byte order was in use, U+FEFF (the byte-order mark) was used at the start of the stream as magic number that is not logically part of the text the stream represents." I'd say .."as a magic number".. "You should also be aware that, although ASCII is a subset of UTF-8, a file that starts with a BOM is no longer ASCII-compatible." As I think was remarked on the list, the intended meaning of the phrase "ASCII-compatible" is not too obvious. I _think_ this refers to the (often desirable) property of UTF-8 that characters from the US-ASCII range are encoded in UTF-8 in a way that is byte-for-byte identical to US-ASCII encoding. I think it would be better to say that directly, somehow. For example: "UTF-8 without a BOM has the property that characters from the US-ASCII range are encoded byte-for-byte the same way as by the US-ASCII encoding. Adding a BOM inserts additional bytes, so this is no longer true."
Received on Tuesday, 18 December 2012 18:40:02 UTC