- From: MURATA Makoto <murata@hokkaido.email.ne.jp>
- Date: Mon, 22 Sep 2003 00:37:18 +0900
- To: Elliotte Rusty Harold <elharo@metalab.unc.edu>
- Cc: ietf-xml-mime@imc.org, WWW-Tag <www-tag@w3.org>
> By Unicode signature, I'm guessing you mean the BOM? That problem
> seems to have been easily dealt with by simply deciding to allow it
> in UTF-8. It doesn't appear to have caused any problems in practice
> today.

In the case of XML, I think you are right. In general, however, see
http://www.ietf.org/internet-drafts/draft-yergeau-rfc2279bis-05.txt

> I don't know what you problems you refer to with "representation of
> non-BMP characters". UTF-8 precisely specifies how these characters
> are represented. There's no issue here. Did you mean something else?

Quite a few implementations use 6 bytes (rather than 4 bytes) to
represent non-BMP characters. See
http://www.unicode.org/reports/tr26/

-- 
MURATA Makoto <murata@hokkaido.email.ne.jp>
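
To make the 6-byte versus 4-byte difference concrete, here is a minimal Python sketch. It assumes the 6-byte form is the one described in the report linked above (each UTF-16 surrogate encoded separately as a 3-byte sequence). The function names utf8_bytes, cesu8_bytes and _three_byte are hypothetical helpers for illustration, not taken from any library.

```python
def _three_byte(unit: int) -> bytes:
    """Encode one 16-bit code unit with the 3-byte UTF-8 bit pattern."""
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])


def utf8_bytes(ch: str) -> bytes:
    """Standard UTF-8: a non-BMP character takes 4 bytes."""
    return ch.encode("utf-8")


def cesu8_bytes(ch: str) -> bytes:
    """6-byte form (assumed here to be the TR26 scheme): split a non-BMP
    character into its UTF-16 surrogate pair, then encode each surrogate
    as its own 3-byte sequence."""
    cp = ord(ch)
    if cp < 0x10000:
        # BMP characters are encoded the same way in both schemes.
        return ch.encode("utf-8")
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)    # high (lead) surrogate
    low = 0xDC00 | (v & 0x3FF)   # low (trail) surrogate
    return _three_byte(high) + _three_byte(low)


if __name__ == "__main__":
    ch = "\U00010400"  # a character outside the BMP
    print(utf8_bytes(ch).hex(" "))
    print(cesu8_bytes(ch).hex(" "))
```

A strict UTF-8 decoder rejects the 6-byte form because it contains encoded surrogates, which is why the two representations do not interoperate.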
Received on Sunday, 21 September 2003 11:42:10 UTC