Re: Does the Charset Registry still function?

You're right, I should have written "some questions".

The situation is that we have some 700ish mapping tables in ICU, and
many others in CDRA. While we do not need to register all of them,
there is a substantial number that we will need to register. The
problem is that many of them differ only to a small degree, in a
fashion that may or may not be permitted as a single registration
according to the RFC. So we need to get some clarity on what the RFC
requires before registering any of them.

A tight definition of a 'charset' would have *every* difference in
mappings count, including roundtrips, fallbacks, reverse fallbacks,
and extensions. A loose definition might only count roundtrip
mappings, and allow extensions (e.g. windows-1252 before and after the
Euro addition counting as the same registration, even though different
results are returned in mapping). The set of registrations would be
very different depending on which definition we use (or something
in between).
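
To make the two ends of that spectrum concrete, here is a minimal Python
sketch. It is only an illustration under stated assumptions, not how ICU
or CDRA actually compare tables: the Entry type, the function names, and
the tiny sample tables are all hypothetical, and "loose" is modelled
simply as "the roundtrip mappings of one table are a subset of the
other's".

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One line of a (hypothetical, simplified) mapping table."""
    byte: int        # code page byte value
    codepoint: int   # Unicode code point
    kind: str        # "roundtrip", "fallback", or "reverse_fallback"


def roundtrip_pairs(table):
    """Return only the (byte, codepoint) pairs marked as roundtrip."""
    return {(e.byte, e.codepoint) for e in table if e.kind == "roundtrip"}


def same_charset_tight(a, b):
    """Tight definition: every difference counts, whatever its kind."""
    return set(a) == set(b)


def same_charset_loose(a, b):
    """Loose definition (assumed reading): ignore fallbacks, and allow one
    table to be a pure extension of the other's roundtrip mappings."""
    ra, rb = roundtrip_pairs(a), roundtrip_pairs(b)
    return ra <= rb or rb <= ra


# Tiny illustrative tables, not real ICU data.
before_euro = {
    Entry(0x41, 0x0041, "roundtrip"),
    Entry(0xA3, 0x00A3, "roundtrip"),
}
# Extension: a previously unassigned byte gains a roundtrip mapping.
after_euro = before_euro | {Entry(0x80, 0x20AC, "roundtrip")}
# Same roundtrips, plus a fallback from fullwidth A to byte 0x41.
with_fallback = before_euro | {Entry(0x41, 0xFF21, "fallback")}
# Conflict: an existing byte is given a different roundtrip mapping.
remapped = {
    Entry(0x41, 0x0041, "roundtrip"),
    Entry(0xA3, 0x20AC, "roundtrip"),
}

for name, other in [("after_euro", after_euro),
                    ("with_fallback", with_fallback),
                    ("remapped", remapped)]:
    print(name,
          "tight:", same_charset_tight(before_euro, other),
          "loose:", same_charset_loose(before_euro, other))
```

Under the tight test all three variants would need their own
registration; under this loose test only the remapped table would.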

Despite your recommendation, I am still rather reluctant to bombard
the list with all the different possible registrations. For everyone's
sake, it would really help to get some clarity first! That is why we
boiled down the questions to very simple examples. Each of those
examples actually illustrates a known problem, but is *much* easier to
comprehend than a multipage mapping table. If you think it would
facilitate matters, I could put out a separate email for each of the
questions.

Mark Davis

P.S. For detailed comparisons of code page mappings, see
http://oss.software.ibm.com/icu/charset/roundtripIndex.html

________
mark.davis@jtcsv.com
IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
(408) 256-3148
fax: (408) 256-0799

----- Original Message ----- 
From: "Martin Duerst" <duerst@w3.org>
To: "Mark Davis" <mark.davis@jtcsv.com>; <ietf-charsets@iana.org>
Sent: Thursday, May 01, 2003 11:53
Subject: Re: Does the Charset Registry still function?


> Hello Mark,
>
> At 13:59 03/04/30 -0700, Mark Davis wrote:
> >I sent in a question on April 22 to the registrar, and as yet see no
> >answer.
>
> 'a question' is a very big understatement. It was a long list of
> questions, with additional material outside your mail.
> Also, please be aware of the fact that 'the registrar' is
> just a secretarial function, they don't have the necessary
> technical knowledge to answer such questions.
>
> I had a look at the questions, and the outside material, and
> I think that all your questions make sense, and should be answered
> by this list. But I was just overwhelmed by the number of questions,
> and had no idea where to start. (also, I was on a trip) So I gave
> up.
>
> While I think that most of the questions you ask make sense, I think
> you should be aware that the answers may not be straightforward.
> Many of these questions just haven't been considered up to now.
>
> Also, you should be aware that, at least between updates to the
> relevant registration documents (RFC), this list (and similar ones)
> work in some sense similar to (UK or US) case law. I.e. if you want
> an answer to a question, you have to bring a case that exhibits this
> question (i.e. an actual registration). While this is not a
> hard-and-fast condition for discussion on this list, it clearly makes
> discussion easier.
>
>
> >IBM is in the position of needing to register perhaps hundreds of
> >charsets, and we need better information before we start; otherwise
> >we could end up with either redundant registrations or missing
> >charsets.
> >
> >BTW, the only mail archive I could find was on
> >http://lists.w3.org/Archives/Public/ietf-charsets/2003AprJun/subject.html,
> >but that archive is full of spam. Is there a better source?
>
> Sorry about that. I thought this had been dealt with, apparently
> it hasn't. I'll try to have it fixed.
>
> Regards,    Martin.
>
>
>
> >Mark
> >
> >==========
> >Previous Message:
> >==========
> >
> >Date: Tue, 22 Apr 2003 11:32:06 -0700
> >From: Mark Davis <mark.davis@jtcsv.com>
> >To: ietf-charsets@iana.org
> >Message-id: <00c801c308fd$7a88a730$7900a8c0@DAVIS1>
> >Subject: Charset Identity and Registration Questions
> >
> >We have a number of questions about the application of RFC 2978
> >that are important for resolving which charsets IBM should register.
> >Since this is much easier with a formatted document instead of
> >plaintext email, there is a formatted document with the questions at
> >the following address:
> >
> >http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/charset_questions.html
> >....
>
>

Received on Friday, 2 May 2003 17:31:43 UTC