Re: revised "generic syntax" internet draft


UTF-8 is not user-friendly when viewed as 8859-1; the standard coding
for putting the 8859-1 charset into UTF-8 inserts one octet in front of
each character in the upper half of the table (80-FF), and also changes
the character's own octet for the 4 uppermost columns (C0-FF) of the
8859-1 character table.

So "Grøtavær" (my wife's hometown) becomes "GrÃ¸tavÃ¦r" if you "forget"
to put the UTF-8 back into 8859-1 and just dump it to an 8859-1 screen.

ø=F8 = 1111 1000 -> 11000011 10111000 = C3 B8 = Ã¸ (A with tilde + cedilla)
æ=E6 = 1110 0110 -> 11000011 10100110 = C3 A6 = Ã¦ (A with tilde + broken bar)
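
The bit shuffling above can be checked mechanically. The following is only
a minimal Python sketch; the function names latin1_octet_to_utf8 and
dump_to_latin1_screen are made up for illustration and come from no draft
or library.

```python
def latin1_octet_to_utf8(octet: int) -> bytes:
    """Encode one 8859-1 octet (0x00-0xFF) as UTF-8 by explicit bit shuffling."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("not an 8859-1 octet")
    if octet < 0x80:
        return bytes([octet])          # ASCII passes through unchanged
    lead = 0xC0 | (octet >> 6)         # 110000xx: always C2 or C3 for 8859-1
    trail = 0x80 | (octet & 0x3F)      # 10xxxxxx: low six bits of the original
    return bytes([lead, trail])


def dump_to_latin1_screen(text: str) -> str:
    """Show what UTF-8 output looks like if every octet is read as 8859-1."""
    return text.encode("utf-8").decode("latin-1")


if __name__ == "__main__":
    for ch in "\u00f8\u00e6":          # o-slash and ae ligature
        print(ch, latin1_octet_to_utf8(ord(ch)).hex(" ").upper())
    print(dump_to_latin1_screen("Gr\u00f8tav\u00e6r"))
```

For octets 80-BF the lead octet is C2 and the trailing octet equals the
original; for C0-FF the lead octet is C3 and the trailing octet is the
original minus 40 hex, which is the "changed" octet in the 4 uppermost
columns.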

(If some text has > 5% A-with-accents, it's probably UTF-8 encoded 8859-1....)
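
That rule of thumb could be sketched roughly as below. This is an
illustration only: the 5% threshold is the figure from the remark above,
counting just A-circumflex and A-tilde (octets C2 and C3) is my assumption
about which "A-with-accents" matter, and the function names are
hypothetical.

```python
# Octets C2 and C3 shown as 8859-1: A-circumflex and A-tilde.
UTF8_LEAD_AS_LATIN1 = "\u00c2\u00c3"


def a_with_accent_ratio(text: str) -> float:
    """Fraction of characters that are A-circumflex or A-tilde."""
    if not text:
        return 0.0
    return sum(ch in UTF8_LEAD_AS_LATIN1 for ch in text) / len(text)


def looks_like_undecoded_utf8(text: str, threshold: float = 0.05) -> bool:
    """Guess whether text is UTF-8 encoded 8859-1 that was never decoded."""
    return a_with_accent_ratio(text) > threshold


def repair(text: str) -> str:
    """Try to undo the damage; return the text unchanged if it does not fit."""
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    garbled = "Gr\u00c3\u00b8tav\u00c3\u00a6r"
    if looks_like_undecoded_utf8(garbled):
        print(repair(garbled))
```

A heuristic like this will misfire on short strings and on text that
really does contain many of these letters (Portuguese, for instance), so
it is a hint, not a test.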

                           Harald A

Received on Wednesday, 16 April 1997 04:34:44 UTC