Re: For review: The byte-order mark (BOM) in HTML from John Cowan on 2012-12-20 (www-international@w3.org from October to December 2012)

From: John Cowan <cowan@mercury.ccil.org>
Date: Wed, 19 Dec 2012 23:13:26 -0500
To: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
Cc: Albert Lunde <atlunde@panix.com>, www International <www-international@w3.org>
Message-ID: <20121220041326.GC21024@mercury.ccil.org>

Leif Halvard Silli scripsit:

> It seems impossible to improve the text unless Richard clarifies what 
> use the text has in mind. 

I agree that it's wrongly worded, but I believe the intent is clear.
Here's my revision:

"The UTF-8 encoding without a BOM has the property that a document
which contains only characters from the US-ASCII range is encoded
byte-for-byte the same way as the same document encoded using the
US-ASCII encoding.  Such a document can be processed either as UTF-8 or
as US-ASCII.  Adding a BOM inserts additional non-ASCII bytes, so this
is no longer true."

I believe that statement is correct, complete, and useful.
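
A minimal sketch of the property stated above, in Python.  The sample
string and the variable names are illustrative assumptions, not part of
the proposed wording.  Python's "utf-8-sig" codec is used here simply as
a convenient way to get UTF-8 with a BOM prepended.

```python
import codecs

# Illustrative sample: a document containing only US-ASCII characters.
text = "Hello, world"

plain = text.encode("utf-8")          # UTF-8 without a BOM
with_bom = text.encode("utf-8-sig")   # UTF-8 with a BOM prepended

# Without a BOM, the UTF-8 bytes equal the US-ASCII bytes, so the
# document can be decoded either way.
same_as_ascii = plain == text.encode("ascii")
decodes_as_ascii = plain.decode("ascii") == text

# With a BOM, the first three bytes are EF BB BF, all outside the
# US-ASCII range, so a strict US-ASCII decoder rejects the document.
starts_with_bom = with_bom.startswith(codecs.BOM_UTF8)
try:
    with_bom.decode("ascii")
    bom_form_is_ascii = True
except UnicodeDecodeError:
    bom_form_is_ascii = False

print(same_as_ascii, decodes_as_ascii, starts_with_bom, bom_form_is_ascii)
```

The BOM-prefixed form is still valid UTF-8; it is only the
interchangeability with US-ASCII that is lost.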

-- 
You annoy me, Rattray!  You disgust me!         John Cowan
You irritate me unspeakably!  Thank Heaven,     cowan@ccil.org
I am a man of equable temper, or I should       http://www.ccil.org/~cowan
scarcely be able to contain myself before
your mocking visage.            --Stalky imitating Macrea

Received on Thursday, 20 December 2012 04:13:49 UTC