Re: Requesting a revision of RFC3023

From: MURATA Makoto <murata@hokkaido.email.ne.jp>
Date: Mon, 22 Sep 2003 00:37:18 +0900
To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Cc: ietf-xml-mime@imc.org, WWW-Tag <www-tag@w3.org>
Message-Id: <20030922002118.5076.MURATA@hokkaido.email.ne.jp>

> By Unicode signature, I'm guessing you mean the BOM? That problem 
> seems to have been easily dealt with by simply deciding to allow it 
> in UTF-8. It doesn't appear to have caused any problems in practice 
> today.

In the case of XML, I think you are right.  In general, however, see


> I don't know what problems you refer to with "representation of 
> non-BMP characters". UTF-8 precisely specifies how these characters 
> are represented. There's no issue here. Did you mean something else?

Quite a few implementations use 6 bytes (rather than 4 bytes) to represent 
non-BMP characters.  See


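To make the 6-byte versus 4-byte difference concrete, here is a minimal sketch in Python. It assumes the 6-byte form arises from encoding each UTF-16 surrogate of the pair as its own 3-byte sequence (the form that Unicode Technical Report #26 calls CESU-8); the message itself does not say which implementations are meant. The function names are hypothetical and chosen only for this illustration.

```python
def utf8_bytes(code_point: int) -> bytes:
    """Standard UTF-8: a non-BMP code point becomes one 4-byte sequence."""
    return chr(code_point).encode("utf-8")


def _three_byte_sequence(unit: int) -> bytes:
    """Apply the 3-byte UTF-8 bit layout to a single 16-bit value."""
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])


def surrogate_pair_bytes(code_point: int) -> bytes:
    """Assumed 6-byte form: split the code point into a UTF-16 surrogate
    pair, then encode each surrogate separately as 3 bytes."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only non-BMP code points have a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # high (leading) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trailing) surrogate
    return _three_byte_sequence(high) + _three_byte_sequence(low)


if __name__ == "__main__":
    cp = 0x10000  # first code point outside the BMP
    print(utf8_bytes(cp).hex(" "))            # 4 bytes
    print(surrogate_pair_bytes(cp).hex(" "))  # 6 bytes
```

A strict UTF-8 decoder, such as the one in Python 3, rejects the 6-byte form because it contains encoded surrogates, so the two representations are not interchangeable.
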
MURATA Makoto <murata@hokkaido.email.ne.jp>
