Re: Comments on draft-yergeau-rfc2279bis-00.txt

From: Martin Duerst <duerst@w3.org>
Date: Fri, 18 Oct 2002 09:48:07 +0900
To: Markus Scherer <markus.scherer@jtcsv.com>, charsets <ietf-charsets@iana.org>
Message-id: <4.2.0.58.J.20021018094719.04d58450@localhost>

At 09:12 02/10/17 -0700, Markus Scherer wrote:
>Patrik Fältström wrote:
>
>>What I hear on this list is that the consensus is that BOM SHOULD NOT be 
>>used. I would like it to be MUST NOT be used in Internet protocols, which 
>>leads to tagged UTF-8 text be illegal if the BOM exists in the text.
>
>
>That would violate the Unicode standard.

Hello Markus,

Can you give the details of why and how (in terms e.g. of conformance
clauses in the Unicode Standard)?

Regards,     Martin.


>If UTF-8 is clearly indicated with some charset label, then an initial 
>sequence of ef bb bf must be interpreted as the character U+feff ZWNBSP. 
>Since that is not a very useful character at the beginning of a text, it 
>can usually be ignored.
>
>Personally, I find François' text very clear. It acknowledges existing, 
>reasonable and useful practice.
>
>Best regards,
>markus
>
>
>--
>Opinions expressed here may not reflect my company's positions unless 
>otherwise noted.
>
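
The behaviour Markus describes in the quoted text (an initial ef bb bf in text labelled as UTF-8 decodes to the character U+FEFF, which a receiver may then choose to ignore) can be illustrated with a minimal Python sketch. This is only an illustration using Python's standard codecs, not something proposed in the thread; the sample payload "hello" is an assumption made for the example.

```python
import codecs

# Hypothetical sample payload: the bytes ef bb bf followed by "hello" in UTF-8.
data = codecs.BOM_UTF8 + "hello".encode("utf-8")

# The plain "utf-8" codec keeps the initial sequence as the character U+FEFF.
strict = data.decode("utf-8")

# The "utf-8-sig" codec treats an initial ef bb bf as a signature and drops it.
lenient = data.decode("utf-8-sig")

print(repr(strict))
print(repr(lenient))
```

Which of the two readings a protocol should require is exactly the question under discussion; the sketch only shows that both are mechanically straightforward.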
Received on Thursday, 17 October 2002 21:46:47 GMT
