Re: Comments on draft-yergeau-rfc2279bis-00.txt

Patrik Fältström wrote:

> What I hear on this list is that the consensus is that BOM SHOULD NOT be 
> used. I would like it to be MUST NOT be used in Internet protocols, 
> which leads to tagged UTF-8 text be illegal if the BOM exists in the text.


That would violate the Unicode standard. If UTF-8 is clearly indicated by some charset label, then an initial byte sequence EF BB BF must be interpreted as the character U+FEFF ZERO WIDTH NO-BREAK SPACE (ZWNBSP). Since that character is not very useful at the beginning of a text, a receiver can usually ignore it.
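
To illustrate, here is a minimal Python sketch of that reading. The function name and the strip_initial_zwnbsp flag are hypothetical and come from neither the draft nor the Unicode standard. It assumes the text has already been labelled as UTF-8, decodes the bytes as-is so that a leading EF BB BF becomes U+FEFF, and then optionally drops that one initial character:

```python
# The three bytes under discussion: the UTF-8 encoding of U+FEFF.
BOM = b"\xef\xbb\xbf"


def decode_labelled_utf8(data: bytes, strip_initial_zwnbsp: bool = True) -> str:
    """Decode bytes that a charset label has declared to be UTF-8.

    Hypothetical helper for illustration only.
    """
    # Python's plain "utf-8" codec does not treat EF BB BF specially:
    # it decodes to the ordinary character U+FEFF.
    text = data.decode("utf-8")

    # Receiver's choice: ignore a single leading U+FEFF, since it is
    # rarely useful at the start of a text. Later occurrences are kept.
    if strip_initial_zwnbsp and text.startswith("\ufeff"):
        return text[1:]
    return text
```

A sender that never emits the sequence and a receiver that tolerates it this way can interoperate without either one rejecting the text as illegal.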

Personally, I find François' text very clear. It acknowledges existing, reasonable and useful practice.

Best regards,
markus


-- 
Opinions expressed here may not reflect my company's positions unless otherwise noted.

Received on Thursday, 17 October 2002 12:14:11 UTC