Re: Registration of new charset SCSU from Markus Scherer on 2000-06-13 (ietf-charsets@w3.org from April to June 2000)

From: Markus Scherer <markus.scherer@jtcsv.com>
Date: Tue, 13 Jun 2000 16:52:00 -0700
To: charsets <ietf-charsets@iana.org>
Message-id: <3946C920.DBD9D952@jtcsv.com>

Harald Tveit Alvestrand wrote:
> - the "charset" SCSU is, as far as I can see, the combination of the CCS
>    UNICODE/ISO 10646 with the CES of UTR #6.
>    This should be expressed clearly in the "Published specification" section;
>    as currently written, it sounds like you're registering the CES only,
>    which is a no-no.

This is a good way to put it. I will update my proposal.

> - I would like to add under "Additional information":
>    SCSU is completely useless for applications that require a canonical
>    representation of text. This is an intentional part of its design.

Well, I will try to find a somewhat nicer way to say this... :-)

You are right. SCSU is not intended as an encoding for internal processing. It is intended as an encoding of Unicode that is more compact than UTF-8 or UTF-16 and that is useful in files (though beware of searching the encoded form) and especially in protocols.

For example, the XML parser that I know always converts everything from the document charset into UTF-16 before it does anything else, so it does not care whether the document encoding is canonical.
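
To make the compactness claim concrete, here is a rough Python sketch that compares encoded sizes for a short Cyrillic string. Python has no built-in SCSU codec, so the helper scsu_one_window is a hypothetical, deliberately incomplete encoder: it assumes the whole text fits in printable ASCII plus a single predefined dynamic window (window 2 at U+0400, selected with the SC2 tag byte 0x12, as I understand UTR #6). It is not a general SCSU implementation.

```python
# Sketch only: covers text that fits printable ASCII plus ONE dynamic window.
# The window number and base below are assumptions taken from the UTR #6
# default window table (window 2 starts at U+0400, Cyrillic).
SC0 = 0x10  # tag "select dynamic window 0"; SC1..SC7 are the next seven values


def scsu_one_window(text, window=2, base=0x0400):
    """Encode text in SCSU single-byte mode using one predefined window."""
    out = bytearray([SC0 + window])  # switch to the chosen window once
    for ch in text:
        cp = ord(ch)
        if 0x20 <= cp <= 0x7F:
            out.append(cp)  # ASCII passes through unchanged
        elif base <= cp < base + 0x80:
            out.append(0x80 + (cp - base))  # one byte per in-window character
        else:
            raise ValueError(f"U+{cp:04X} is outside this sketch's coverage")
    return bytes(out)


text = "Москва"
sizes = {
    "utf-8": len(text.encode("utf-8")),
    "utf-16": len(text.encode("utf-16-be")),  # without a byte order mark
    "scsu": len(scsu_one_window(text)),
}
print(sizes)
```

A full encoder is free to choose other tag sequences for the same text (for instance, quoting characters one at a time or switching to Unicode mode), so different encoders can emit different bytes for identical input. That is the non-canonical property raised in the quoted comment above.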

> I don't like it much as a general purpose tool, but it may find a market
> niche somewhere.

That's fair.

What is the next step? Do I need to update and resend the proposal? Should I wait for a few days to give more people time to respond?

markus

Received on Tuesday, 13 June 2000 19:57:32 UTC