- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Thu, 6 Feb 1997 12:26:26 +0100 (MET)
- To: Misha Wolf <misha.wolf@reuters.com>
- cc: Rob Pike <rob@plan9.bell-labs.com>, Unicode <unicode@unicode.org>, www-international <www-international@w3.org>
On Wed, 5 Feb 1997, Misha Wolf wrote:

> Rob Pike wrote:
> >I believe what you're supposed
> >to say is charset=UNICODE-1-1-UTF-8.
>
> Both MS and NS have recently moved to "UTF-8".

Rob - Maybe you are assuming that UTF-8 is a general method to encode 4-byte quantities. This is not the case. UTF stands for UCS transfer (or transform or whatever) format. And UCS is the Universal Character Set, aka Unicode/ISO 10646.

Also please note that RFC 2044 defines the "charset" tag UTF-8. However, there is one problem in that draft (due to the slow RFC process last year), namely that RFC 2044 is written relative to Unicode 1.1, whereas everyone agrees that "UTF-8" indeed should be used for Unicode 2.0 and upwards.

Regards,
Martin.
Received on Thursday, 6 February 1997 06:26:10 UTC