- From: <Harald.T.Alvestrand@uninett.no>
- Date: Wed, 16 Apr 1997 10:35:16 +0200
- To: John C Klensin <klensin@mci.net>
- Cc: Dan Oscarsson <Dan.Oscarsson@trab.se>, uri@bunyip.com, fielding@kiwi.ICS.UCI.EDU
Factoid: UTF-8 is not user-friendly in 8859-1. The UTF-8 encoding of the 8859-1 charset inserts one octet in front of each non-ASCII character, and also changes the character's own octet for the 4 uppermost columns of the 8859-1 character table.

So "Grøtavær" (my wife's hometown) becomes "GrÃ¸tavÃ¦r" if you "forget" to convert the UTF-8 back into 8859-1 and just dump it to an 8859-1 screen.

ø = F8 = 1111 1000 -> 11000011 10111000 = C3 B8 = Ã¸ (A with tilde + cedilla)
æ = E6 = 1110 0110 -> 11000011 10100110 = C3 A6 = Ã¦ (A with tilde + broken bar)

(If some text has > 5% A-with-accents, it's probably UTF-8 encoded 8859-1....)

Harald A
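The effect described in the message can be reproduced with a few lines of Python. This is only a sketch: the function names `dump_utf8_as_latin1`, `a_accent_ratio` and `probably_utf8_in_latin1` are made up for illustration, and the 5% threshold is simply the rule of thumb from the message, not a tested detector.

```python
def dump_utf8_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then misread the octets as ISO 8859-1."""
    return text.encode("utf-8").decode("latin-1")


def a_accent_ratio(text: str) -> float:
    """Fraction of characters that are A-circumflex (C2) or A-tilde (C3).

    These are the lead octets UTF-8 uses for the 8859-1 range 80..FF,
    as they appear when shown on an 8859-1 screen.
    """
    if not text:
        return 0.0
    return sum(ch in "\u00c2\u00c3" for ch in text) / len(text)


def probably_utf8_in_latin1(text: str, threshold: float = 0.05) -> bool:
    """Rule of thumb from the message: more than 5% A-with-accents."""
    return a_accent_ratio(text) > threshold


if __name__ == "__main__":
    town = "Gr\u00f8tav\u00e6r"            # Grøtavær
    garbled = dump_utf8_as_latin1(town)
    print(garbled)                          # GrÃ¸tavÃ¦r
    print(town.encode("utf-8").hex(" "))    # octets, including c3 b8 and c3 a6
    print(probably_utf8_in_latin1(garbled))
    # Reversing the mistake recovers the original text.
    print(garbled.encode("latin-1").decode("utf-8"))
```

The heuristic is deliberately crude: ordinary text that happens to contain many Ã or Â characters would trigger it, and very short strings can give misleading ratios.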
Received on Wednesday, 16 April 1997 04:34:44 UTC