Re: revised "generic syntax" internet draft

Harald.T.Alvestrand@uninett.no
Wed, 16 Apr 1997 10:35:16 +0200


From: Harald.T.Alvestrand@uninett.no
To: John C Klensin <klensin@mci.net>
Cc: Dan Oscarsson <Dan.Oscarsson@trab.se>, uri@bunyip.com,
Subject: Re: revised "generic syntax" internet draft 
In-Reply-To: Your message of "Tue, 15 Apr 1997 11:55:43 EDT."
             <SIMEON.9704151143.E@tp7.Jck.com> 
Date: Wed, 16 Apr 1997 10:35:16 +0200
Message-Id: <20902.861179716@munken.uninett.no>

Factoid:

UTF-8 is not user-friendly for 8859-1 users: the standard UTF-8 encoding
of the 8859-1 charset inserts one extra octet in front of each character
in the upper half of the table (80-FF), and also changes the second octet
for the 4 uppermost columns (C0-FF) of the 8859-1 character table.

So "Grøtavær" (my wife's hometown) becomes "GrÃ¸tavÃ¦r" if you "forget"
to convert the UTF-8 back into 8859-1 and just dump it to an 8859-1 screen.

=F8 = 1111 1000 -> 11000011 10111000 = C3 B8 = Ã¸ (A with tilde + cedilla)
=E6 = 1110 0110 -> 11000011 10100110 = C3 A6 = Ã¦ (A with tilde + broken bar)
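
For anyone who wants to check the bit-shuffling, here is a minimal Python
sketch. The helper names (latin1_octet_to_utf8, misread_utf8_as_latin1) are
made up for illustration and are not from any draft or library; the only
assumption is the usual two-octet UTF-8 form 110xxxxx 10xxxxxx for code
points 80-FF.

```python
def latin1_octet_to_utf8(octet: int) -> bytes:
    """Encode one 8859-1 octet (0x00-0xFF) as UTF-8 by hand."""
    if octet < 0x80:
        # US-ASCII range: unchanged, still a single octet.
        return bytes([octet])
    # Two-octet form: 110xxxxx 10xxxxxx
    lead = 0xC0 | (octet >> 6)      # top 2 bits -> C2 or C3
    trail = 0x80 | (octet & 0x3F)   # low 6 bits -> 80..BF
    return bytes([lead, trail])


def misread_utf8_as_latin1(text: str) -> str:
    """Show what UTF-8 octets look like when dumped to an 8859-1 screen."""
    return text.encode("utf-8").decode("latin-1")


# Hand-rolled encoding agrees with Python's own codec for the two letters.
assert latin1_octet_to_utf8(0xF8) == "\u00f8".encode("utf-8") == b"\xc3\xb8"
assert latin1_octet_to_utf8(0xE6) == "\u00e6".encode("utf-8") == b"\xc3\xa6"

town = "Gr\u00f8tav\u00e6r"
garbled = misread_utf8_as_latin1(town)
print(garbled)
```

The lead octet is C2 for 80-BF (second octet unchanged) and C3 for C0-FF
(second octet is the original minus 40 hex), which is why only the 4
uppermost columns get their "last character" changed.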

(If some text has > 5% A-with-accents, it's probably UTF-8 encoded 8859-1....)
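
That rule of thumb can be sketched in Python as below. This is only an
illustration: the function names are hypothetical, the 5% threshold is just
the figure quoted above, and "A-with-accents" is taken to mean the two
characters that UTF-8 lead octets C2 and C3 display as in 8859-1 (A with
circumflex and A with tilde).

```python
# Characters that UTF-8 lead octets C2 and C3 show up as on an 8859-1 screen.
SUSPECT_CHARS = "\u00c2\u00c3"


def looks_like_utf8_read_as_latin1(text: str, threshold: float = 0.05) -> bool:
    """Heuristic only: True if more than `threshold` of the characters
    are A-circumflex or A-tilde."""
    if not text:
        return False
    suspects = sum(1 for ch in text if ch in SUSPECT_CHARS)
    return suspects / len(text) > threshold


def try_repair(text: str) -> str:
    """Undo the mistake if the text round-trips cleanly; otherwise
    return it untouched rather than guess."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


garbled = "Gr\u00c3\u00b8tav\u00c3\u00a6r"
if looks_like_utf8_read_as_latin1(garbled):
    print(try_repair(garbled))
```

Genuine 8859-1 text with many such letters (Portuguese, say) could trip the
check, so the repair step only accepts a result that decodes as valid UTF-8.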

                           Harald A