Re: Translated IUC10 Web pages: Experimental Results

On Wed, 5 Feb 1997, Misha Wolf wrote:

> I think it very unlikely that plain 16-bit Unicode will be adopted by 
> browsers in the next year or two.

Why not? It is more compact for East Asian text (apart from the fact
that compression can be used anyway). I could understand if you said
that it might not be adopted by content providers. But for browsers,
supporting UCS-2/UTF-16 in addition to UTF-8 is an extremely small
addition, so I don't even see why there is discussion about it.
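To make the compactness point concrete, here is a rough sketch in
modern terms (Python as a stand-in; the sample strings are just
illustrations, not from the original discussion). CJK characters take
3 octets each in UTF-8 but only 2 in UCS-2/UTF-16, while ASCII text
doubles in size under UTF-16:

```python
# Byte counts for a short Japanese string vs. a plain ASCII one.
jp = "日本語のテキスト"  # 8 CJK/kana characters
en = "ASCII text"        # 10 ASCII characters

print(len(jp.encode("utf-8")))     # 24 octets: 3 per character
print(len(jp.encode("utf-16-be"))) # 16 octets: 2 per character
print(len(en.encode("utf-8")))     # 10 octets: 1 per character
print(len(en.encode("utf-16-be"))) # 20 octets: 2 per character
```

So for predominantly East Asian pages, the 16-bit form is noticeably
smaller than UTF-8 even before any transport compression.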


>The two encoding schemes which will 
> be widely used to encode Unicode Web pages are:
> 
>    1.  UTF-8 (see <http://www.reuters.com/unicode/iuc10/x-utf8.html>).
>    2.  Numeric Character References (see <http://www.reuters.com/unicode/iuc10/x-ncr.html>).
> 
> The second scheme is intriguing as it does not require the use of any 
> octets over 127 decimal (7F hex).  Accordingly, it is legal to label 
> such a file as, eg, US-ASCII, ISO-8859-1, X-SJIS, or any other "charset" 
> which has ASCII as a subset.

It is not very harmful to label such pages ISO-8859-1 or whatever.
But strictly speaking, it is not legal! If there are alternatives
for labeling, the most restrictive label should be used. If it's
labeled us-ascii, you know that it's going to pass through 7-bit
mail. Otherwise, you don't.
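The 7-bit point can be checked mechanically (a sketch in Python; the
sample strings are my own, chosen only to illustrate): the NCR form of
a string survives an us-ascii round trip, while the raw form does not.

```python
raw = "Grüße"            # contains octets above 0x7F in ISO-8859-1
ncr = "Gr&#252;&#223;e"  # same text as numeric character references

# The NCR form is pure ASCII, so it is safe for 7-bit mail transport.
print(ncr.encode("ascii"))

# The raw form needs an 8-bit-clean transport and a correct label.
try:
    raw.encode("ascii")
except UnicodeEncodeError:
    print("raw form is not 7-bit safe")
```

Hence the most restrictive true label (us-ascii) carries real
information that a looser label like ISO-8859-1 throws away.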

I don't see much future popularity for purely NCR-coded documents.
NCRs are more valuable for cases where you want to add a character
or two from a script not supported by the local encoding used,
e.g. a Kanji or two in a German document.
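That mixed use case can be sketched as follows (Python as a modern
stand-in; the helper name and the German sample sentence are my own
invention): escape only the characters that the local encoding cannot
represent, and leave everything else alone.

```python
def to_ncr_fallback(text, encoding="iso-8859-1"):
    """Replace characters outside `encoding` with numeric character
    references, leaving representable characters untouched."""
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)            # local encoding covers it
        except UnicodeEncodeError:
            out.append("&#%d;" % ord(ch))  # fall back to an NCR
    return "".join(out)

# A German document with a single Kanji (U+5B57, decimal 23383):
print(to_ncr_fallback("Das Zeichen 字 bedeutet 'Schriftzeichen'."))
# → Das Zeichen &#23383; bedeutet 'Schriftzeichen'.
```

(Current Python offers the same behaviour built in via
`text.encode("iso-8859-1", "xmlcharrefreplace")`.)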

Regards,	Martin.

Received on Wednesday, 5 February 1997 11:15:32 UTC