RE: Registration of a new charset

> I'm the reviewer...

> 2 pieces of information I like to have in a registration:

> - Suitability for MIME text encoding: (Yes/No)
>   (I think yes - it has CR and LF in the obvious places)

Yes, it does have these in the right places, but that's not sufficient for
MIME. The other requirement is that NUL (octet value 0x00) not be used.
Unfortunately, in GSM 0x00 is where the at-sign character lives.
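
For what it's worth, the two conditions can be sketched as a quick check. This
is only an illustration: the table lists just the three positions mentioned in
this thread (not the full GSM default alphabet), and the function name is made
up for the example.

```python
# Partial view of the GSM 7-bit default alphabet: only the positions
# discussed in this thread. This is not the complete table.
GSM_PARTIAL = {0x00: "@", 0x0A: "\n", 0x0D: "\r"}


def mime_text_problems(table):
    """Return reasons a code table looks unsuitable for MIME text.

    Checks only the two conditions raised here: CR and LF sit at their
    usual positions, and octet 0x00 is not assigned to a character.
    """
    problems = []
    if table.get(0x0D) != "\r":
        problems.append("CR is not at 0x0D")
    if table.get(0x0A) != "\n":
        problems.append("LF is not at 0x0A")
    if 0x00 in table:
        problems.append(f"octet 0x00 is assigned to {table[0x00]!r}")
    return problems


print(mime_text_problems(GSM_PARTIAL))
```

CR and LF pass, so the only complaint reported is the at-sign at 0x00.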

> - Whether a mapping to Unicode exists, and if so, where.
>   (is that character at 0x09 a C-cedilla or a C with-hook?

Another question with this particular character is whether it is an upper or
lower case C. It looks like an upper case C to me in the chart, but I question
the wisdom of having an upper case C-cedilla but no lower case C-cedilla.

>    Upper or lower case? details are out to get you...but "no" is a fine
>    answer...)

> A small detail is that the GSM default charset (yes, I read the reference -
> ETSI has a sensible distribution policy) is a 7-bit character set; it is
> "obvious but worth stating" that when this character set is used, the
> character is carried in an 8-bit byte, with the character in the lower 7
> bits, and the 8th bit is zero (unlike SMS, which jams 8 characters in 7
> bytes).

Agreed, although there is apparently similar 7-bit squeezing going on in
other contexts. Go figure.
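
To make the difference concrete, here is a rough sketch of the two layouts:
one septet per octet with the 8th bit zero, versus SMS-style packing of 8
characters into 7 bytes. The function names are made up for the example, and
the low-bits-first ordering is my understanding of how SMS packs septets, so
treat that part as an assumption.

```python
def one_per_octet(septets):
    """Carry each 7-bit character in its own octet, 8th bit zero."""
    if any(s > 0x7F for s in septets):
        raise ValueError("not a 7-bit value")
    return bytes(septets)


def pack_septets(septets):
    """Pack 7-bit values back to back, low bits first (assumed SMS order)."""
    acc = 0      # bit accumulator
    nbits = 0    # number of valid bits currently held in acc
    out = bytearray()
    for s in septets:
        acc |= (s & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            out.append(acc & 0xFF)
            acc >>= 8
            nbits -= 8
    if nbits:
        out.append(acc & 0xFF)  # final partial octet, zero-padded
    return bytes(out)


eight = [0x68, 0x65, 0x6C, 0x6C, 0x6F, 0x68, 0x65, 0x6C]
print(len(one_per_octet(eight)), len(pack_septets(eight)))
```

For eight characters the first form takes 8 octets and the packed form takes 7.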


				Ned

Received on Wednesday, 26 September 2001 16:57:21 UTC