- From: Bruce Lilly <blilly@verizon.net>
- Date: Thu, 29 Jan 2004 18:15:37 -0500
- To: ietf-charsets@iana.org
Several recent additions to the charset registry illustrate a number of issues. The specific entries I refer to are:

  Name: Amiga-1251
  MIBenum: 2104
  Source: See (http://www.amiga.ultranet.ru/Amiga-1251.html)
  Alias: Ami1251
  Alias: Amiga1251
  Alias: Ami-1251
  (Aliases are provided for historical reasons and should not be used)

  Name: KOI7-switched
  MIBenum: 2105
  Source: See <http://www.iana.org/assignments/charset-reg/KOI7-switched>
  Aliases: None

  Name: OSD_EBCDIC_DF04_15
  MIBenum: 115
  Source: Fujitsu-Siemens standard mainframe EBCDIC encoding
          Please see: <http://www.iana.org/assignments/charset-reg/OSD-EBCDIC-DF04-15>
  Alias: None

  Name: OSD_EBCDIC_DF03_IRV
  MIBenum: 116
  Source: Fujitsu-Siemens standard mainframe EBCDIC encoding
          Please see: <http://www.iana.org/assignments/charset-reg/OSD-EBCDIC-DF03-IRV>
  Alias: None

  Name: OSD_EBCDIC_DF04_1
  MIBenum: 117
  Source: Fujitsu-Siemens standard mainframe EBCDIC encoding
          Please see: <http://www.iana.org/assignments/charset-reg/OSD-EBCDIC-DF04-1>
  Alias: None

Also relevant is the following excerpt from the registry:

  The value space for MIBenum values has been divided into three
  regions. The first region (3-999) consists of coded character sets
  that have been standardized by some standard setting organization.
  This region is intended for standards that do not have subset
  implementations. The second region (1000-1999) is for the Unicode and
  ISO/IEC 10646 coded character sets together with a specification of a
  (set of) sub-repertoires that may occur. The third region (>1999) is
  intended for vendor specific coded character sets.

  Assigned MIB enum Numbers
  -------------------------
  0-2         Reserved
  3-999       Set By Standards Organizations
  1000-1999   Unicode / 10646
  2000-2999   Vendor

One issue is that the MIBenum values assigned to these charsets do not seem to be consistent with the description above and with the reference information at the indicated URIs.
It appears that the last three are in fact vendor charsets and therefore should have MIBenum values in the 2000 to 2999 range. Conversely, it is not clear why KOI7-switched has been assigned a Vendor MIBenum value, nor which vendor might be responsible.

Another issue is that the three OSD_EBCDIC_DF* charsets give no indication in the source documents as to whether or not the charsets are suitable for use with MIME text. Such an indication is supposed to be part of the registration (RFC 2978 section 5). A related issue is the fact that the registry itself provides no such indication for any charsets, which is at best highly inconvenient for implementors.

None of the charsets above have been provided with an alias beginning with "cs" for use with the printer MIB as discussed in section 2.3 of RFC 2978. If that were consistently done, there would be no charset with a confusing Alias: None line in the registry.

How can we minimize these issues in the future?

I believe that use of RFC 2978 (or a successor) as a checklist during the review process would help.

I believe that the addition to the registration template of a brief history of the charset origin (originator and affiliation) would help in determining whether a particular charset is a Vendor charset or Set By [a] Standards Organization[s].

Finally, inclusion of a "MIME-text" field in the registry with a yes/no value would not only be a boon to implementors of applications which use charsets in a MIME context, but would prompt IANA to obtain a statement of MIME text compatibility if it is lacking in the registration application.
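
As an illustration of the range check discussed above, here is a minimal Python sketch that maps a MIBenum value to the region named in the quoted registry table. The names REGIONS, ENTRIES, and mibenum_region are hypothetical, chosen only for this example; the numeric ranges and the five entries are copied from the text above. It reports the region each value falls in and makes no judgment about which region is correct.

```python
# Ranges copied from the "Assigned MIB enum Numbers" table quoted above.
REGIONS = [
    (0, 2, "Reserved"),
    (3, 999, "Set By Standards Organizations"),
    (1000, 1999, "Unicode / 10646"),
    (2000, 2999, "Vendor"),
]

# The five registry entries listed in this message (name -> MIBenum).
ENTRIES = {
    "Amiga-1251": 2104,
    "KOI7-switched": 2105,
    "OSD_EBCDIC_DF04_15": 115,
    "OSD_EBCDIC_DF03_IRV": 116,
    "OSD_EBCDIC_DF04_1": 117,
}


def mibenum_region(value):
    """Return the label of the table row containing value.

    Values outside every row (negative, or above 2999) get a
    placeholder label; the table does not say how to treat them.
    """
    for low, high, label in REGIONS:
        if low <= value <= high:
            return label
    return "Outside the tabulated ranges"


if __name__ == "__main__":
    for name, mibenum in ENTRIES.items():
        print(f"{name}: {mibenum} -> {mibenum_region(mibenum)}")
```

Run as a script, this lists the three OSD_EBCDIC_DF* entries under "Set By Standards Organizations" and the other two under "Vendor", which is the assignment questioned above.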
Received on Thursday, 29 January 2004 18:17:28 UTC