- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Wed, 5 Feb 1997 17:14:38 +0100 (MET)
- To: Misha Wolf <misha.wolf@reuters.com>
- cc: Unicore <unicore@unicode.org>, Unicode <unicode@unicode.org>, www-international <www-international@w3.org>, Search <search@mccmedia.com>, ISO10646 <iso10646@listproc.hcf.jhu.edu>, http-wg@cuckoo.hpl.hp.com
On Wed, 5 Feb 1997, Misha Wolf wrote:

> I think it very unlikely that plain 16-bit Unicode will be adopted by
> browsers in the next year or two.

Why not? It is more compact for East Asia (apart from the fact that
compression can be used anyway). I might understand if you said that it
might not be adopted by content providers. But for browsers, supporting
UCS-2/UTF-16 in addition to UTF-8 is an extremely small addition, so I
don't even see why there is any discussion about it.

> The two encoding schemes which will be widely used to encode Unicode
> Web pages are:
>
> 1. UTF-8 (see <http://www.reuters.com/unicode/iuc10/x-utf8.html>).
> 2. Numeric Character References (see <http://www.reuters.com/unicode/iuc10/x-ncr.html>).
>
> The second scheme is intriguing as it does not require the use of any
> octets over 127 decimal (7F hex). Accordingly, it is legal to label
> such a file as, e.g., US-ASCII, ISO-8859-1, X-SJIS, or any other
> "charset" which has ASCII as a subset.

It is not very harmful to label such pages ISO-8859-1 or whatever, but
strictly speaking, it is not legal! If there are alternatives for
labeling, the most restrictive label should be used. If a page is
labeled us-ascii, you know that it will pass through 7-bit mail;
otherwise, you don't.

I don't see much future popularity for purely NCR-coded documents. They
are more valuable for cases where you want to add a character or two
from a script not supported by the local encoding in use, e.g. a Kanji
or two in a German document.

Regards,   Martin.
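[Editorial note: a small illustrative sketch in Python, not part of the original message, of the two points discussed above: the octet counts behind the "more compact for East Asia" remark, and how a Numeric Character Reference rendering uses only octets below 0x80, which is why an ASCII-superset "charset" label is at least not harmful. The sample string is chosen purely for illustration.]

```python
# Hypothetical sample text: three Kanji (BMP characters).
text = "\u65E5\u672C\u8A9E"

# UTF-16 stores each of these characters in 2 octets; UTF-8 needs 3.
print(len(text.encode("utf-16-be")))   # 6 octets
print(len(text.encode("utf-8")))       # 9 octets

# Numeric Character References: the resulting file contains only
# ASCII octets, so a label such as us-ascii or ISO-8859-1 will not
# break transport, even if it is not the most accurate description.
ncr = "".join("&#%d;" % ord(c) for c in text)
print(ncr)                              # &#26085;&#26412;&#35486;
print(all(ord(c) < 0x80 for c in ncr))  # True
```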
Received on Wednesday, 5 February 1997 11:15:32 UTC