- From: Markus Scherer <markus.scherer@jtcsv.com>
- Date: Wed, 28 Aug 2002 11:35:10 -0700
- To: charsets <ietf-charsets@iana.org>
I looked up some names in the IANA charset names list, and I suspect that some of them are really only repertoires (collections) of abstract characters. Without a specified encoding scheme, they would not qualify as charsets. I wonder whether they were intended to be (or are in fact) used with particular encoding schemes.

Who registered these?
Is an encoding scheme implied with each of these? If so, which one?
Are they actually used? How, and with which encoding scheme(s)?

One could of course imagine using UTF-16/BE/LE, UTF-8, or some other encoding scheme with each of the names, but is this intended or actually done?

For the following names, the registration texts say "... subset of Unicode" and refer to ISO 10646 character collections.

  ISO-10646-UCS-Basic
  ISO-10646-Unicode-Latin1

For the following names, the registration texts refer to IBM GCSGIDs (Graphic Character Set Global Identifiers), which are IDs for repertoires without any implied encoding. There are no IBM CCSIDs (charset numbers) with these numeric values. An exception is number 1276, which does exist as both a CCSID number and a GCSGID number. As a GCSGID, it matches the registration text referring to Cyrillic/Greek, while CCSID 1276 is "Adobe Standard".

  ISO-Unicode-IBM-1261
  ISO-Unicode-IBM-1268
  ISO-Unicode-IBM-1276
  ISO-Unicode-IBM-1264
  ISO-Unicode-IBM-1265

Thanks and best regards,
markus
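
P.S. To illustrate why a bare repertoire is ambiguous as a charset label, here is a minimal Python sketch. It encodes one character from the Latin-1 range under three Unicode encoding schemes and shows that the resulting octets differ. The helper name `encode_under_schemes` and the choice of schemes are my own assumptions for illustration; nothing here reflects what the registrations actually specify.

```python
# Hypothetical helper for illustration only; the registrations discussed
# above do not say which encoding scheme (if any) is intended.
SCHEMES = ("utf-8", "utf-16-be", "utf-16-le")


def encode_under_schemes(text, schemes=SCHEMES):
    """Return {scheme: hex string of octets} for the same abstract characters."""
    return {scheme: text.encode(scheme).hex() for scheme in schemes}


if __name__ == "__main__":
    # U+00E9 (LATIN SMALL LETTER E WITH ACUTE) lies in the Latin-1 range.
    for scheme, octets in encode_under_schemes("\u00e9").items():
        print(f"{scheme:10} {octets}")
```

The same abstract character comes out as c3a9, 00e9, or e900 depending on the scheme, so a name that identifies only the repertoire does not tell a receiver how to interpret the bytes.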
Received on Wednesday, 28 August 2002 14:35:29 UTC