- From: Martin J. Duerst <mduerst@ifi.unizh.ch>
- Date: Fri, 07 Feb 1997 14:29:40 +0100 (MET)
- To: Dan Oscarsson <Dan.Oscarsson@trab.se>
- Cc: unicore@Unicode.ORG, ietf-charsets@INNOSOFT.COM, David Goldsmith <goldsmith@apple.com>
On Fri, 7 Feb 1997 Dan Oscarsson wrote:

> > > But even if it is restricted to UCS it would work fine to use:
> > >
> > > Content-Type: text/plain; charset=UTF-7
> > > Content-Transfer-Encoding: 8bit
> > >
> > > and only encode characters that can not be represented by 8 bits.
> >
> > Would work fine, eh? Who's going to figure out what the 8-bit
> > characters are, and how?
>
> If it is UCS, the 8-bit characters are in UCS, of course! (i.e. iso 8859-1)

Dan - You came up with the idea to call ISO-8859-1 UCS-1. Please note
that this is purely your idea, and that there is no standard that uses
such a terminology or foresees such usage. And people with a wide global
perspective don't think about iso-8859-1 as something that could be
called "of course".

In the same vein, we could call ASCII UCS-1 (well, to be exact it would
have to be UCS-0.875 :-), but that is definitely not intended.

Also note that UCS stands for Universal Character Set. Apart from the
question of whether we are the only writing species in the universe,
UCS-2 (the BMP) is pretty much universal. ISO-8859-1 definitely does
not deserve to be called universal.

> > And then also something like
> > Content-Type: text/plain; charset=iso-2022-jp
> > Content-Transfer-Encoding: 8bit
> > would have to mean something (because iso-2022-jp is a pure
> > 7-bit encoding). Very strange indeed!
>
> It is perfectly ok to use the above, even though there are no
> 8-bit characters. You are allowed to specify
> Content-Transfer-Encoding: 8bit
> even if no 8-bit codes are used.

If there are really no 8-bit characters, then that's not a major
problem. It's not what MIME suggests to do, but it is acceptable.
It is also acceptable for "charset=UTF-7". What is not acceptable
is to suddenly try to stuff 8-bit data into character encodings
("charset"s) that are purely 7-bit, as you have proposed above.
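To make the "purely 7-bit" point concrete, here is a minimal sketch in Python. It is not part of the original exchange; the helper name `is_seven_bit` and the sample strings are made up for illustration, and it relies only on the `utf-7`, `iso2022_jp` and `iso-8859-1` codecs from Python's standard library. It encodes text containing non-ASCII characters and checks whether any byte with the high bit set appears in the result.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if no byte in data has the high (8th) bit set."""
    return all(b < 0x80 for b in data)


# Hypothetical sample texts, each containing non-ASCII characters.
samples = {
    "utf-7": "Dürst naïve 日本語",
    "iso2022_jp": "日本語のテキスト",
}

# Encode each sample with its charset and report whether the output is 7-bit.
encoded = {codec: text.encode(codec) for codec, text in samples.items()}
for codec, raw in encoded.items():
    print(codec, raw, "7-bit only:", is_seven_bit(raw))

# For contrast: the same kind of text in ISO-8859-1 does need 8-bit bytes.
latin1 = "Dürst".encode("iso-8859-1")
print("iso-8859-1", latin1, "7-bit only:", is_seven_bit(latin1))
```

Since both encoders already represent every character of their repertoire with bytes below 0x80, neither charset definition assigns a meaning to a byte of 0x80 or above; any such meaning would have to come from a convention outside the charset.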
If somebody thinks that we need a UCS-7 form that is compatible
(more or less) with iso-8859-1, then with equal legitimacy, there
are many other legacy encodings that could use such a combination.
But as UTF-7 already encodes all of UCS-2, and the relevant portions
of UCS-4, there is no need for such strange combinations.

Regards,
Martin.
Received on Friday, 7 February 1997 05:32:17 UTC