W3C home > Mailing lists > Public > ietf-charsets@w3.org > July to September 2000

Re: Last Call: IANA Charset Registration Procedures to BCP

From: <ned.freed@innosoft.com>
Date: Sat, 08 Jul 2000 07:31:53 -0700 (PDT)
To: "Martin J. Duerst" <duerst@w3.org>
Cc: ietf-charsets@iana.org
Message-id: <01JRILU3OO7M0006D9@mauve.mrochek.com>
> It looks like by chance I'm just barely in time for this last call.

> Some comments, none major showstoppers:

Good to hear. I'm basically looking at this as "do the ones that are
OK to handle as last call comments". Stuff that requires discussion or
a new last call needs to wait for the next time around.

> - The first sentence on page 3 has some strange repetition:
>    'to allow charsets' in the first line; 'to be used as charsets' at the end.


> - Section 2.5: 'A given CES is typically associated with a single CCS':
>    For the UTF-8 example, this is definitely true (and helpful to be
>    pointed out). For others CESs, I think it's wrong. The classical
>    example is what I would call here the 'identity 8-bit CES', which
>    is used in the whole ISO-8859-X series and many other cases.

I'll change it from "typically" to "sometimes".

> - Section 3.1, 'All registered charsets MUST note whether or not they
>    are suitable for use in MIME text.' and Sec. 6, 'Suitability for
>    use in Mime text:': As recent discussion has shown, this may
>    benefit from some careful clarification, but I guess this already
>    is on your todo list.

It is on my to-do list, but it isn't something I want to address in this
registration procedure update.

> - 3.1 'constructed as a composition of a CCS and a CES...': CCS
>    appears three times in singular. There are many charsets where
>    a CES combines more than one CCS. Same problem again in the
>    registration template.

Good point. I'll make them plural.

> - 'All registered charsets MUST be specified in a *stable*':
>    What about extensions, such as for ISO 10646? What about
>    variants, such as for Shift_JIS (vendor extensions as well
>    as mapping variants, for the later see
>    e.g. http://www.w3.org/TR/japanese-xml/).

What about them? The requirement is for a stable specification and nothing
more. Updating the specification is OK as long as whatever rules that
specification gives for updating are followed. (We have a detailed set of
such rules for UTF-8, for example.)

The primary point here is that we don't want a definition based on a pointer to
a document which then vanishes, leaving us with a name and nothing else.

> - 3.3 talks about names, and a primary name. The registry uses
>    the terms 'name', 'alias', and also 'preferred MIME name'.
>    It would be very helpful if the terminology were unified.

That's an issue for the registry, not for the present document.

> - 3.5 'ONLY if it adds significant value': I have heard from
>    some people that this strict requirement has lead to some
>    contraproductive effects, in particular things being used
>    with x- prefixes for years. I think it would make sense to
>    tone this down a bit.

If you were talking about MIME types I might agree. But this is charsets, and
we've seen no reluctance in this space to register everything under the sun.

And perhaps more to the point, we had a lot of trouble reconciling the views of
people who feel that charset registration should be restricted whenever
possible and people who feel that registering everything that comes along is a
good idea. This language is the carefully crafted result -- I don't think it
would be appropriate for me to change it based on your say-so.

In any case, if you feel this is worth pursuing I would suggest that you
bring up specific examples of where this language has caused harm and see if
a consensus can be reach to change it.

> - 3.6: The requirement of documenting the mapping to ISO 10646
>    where possible is great. It may be worth for the IETF to look
>    at formats to do this in a machine-readable way. For an example,
>    please have a look at http://www.unicode.org/unicode/reports/tr22/.

This is beyond the scope of this document.

> - Section 4.1: 'The intent of the public posting is to solicit
>    comments and feedback on the definition of the charset and
>    the name choosen for it over a two week period.': At the
>    end of the sentence, the references are not exactly clear.
>    The name choosen for the definition of the charset?
>    The name choosen for a two week period?
>    This can easily be improved, I guess.


> - 4.2: 'Decisions made by the reviewer must be posted to the ietf-
>    charsets mailing within 14 days.': Within 14 days of the decision?
>    That would be pretty easy; nobody can prove to the reviewer that
>    he made a decision three months ago if he forgot to post it and
>    claims that it took him three month to actually take the decision :-).

Well, it is pretty obvious that the intent is two weeks after submission to the
reviewer. But I'll clean up the language.

> - 5. The text about the 'Assigned Numbers' RFC reads hopelessly
>    outdated in the age of the web, and rather strange given that
>    the last such RFC was issued in 1994.

This is outside the scope of this document.

> - 5. Given some recent explanations by Harald about how to coordinate
>    charset registration and RFC publication, it seems to be a good
>    idea to explain that roughly, so that further registrants don't
>    start from the wrong end.

This will have to wait for the next revision.

> - 6. 'A URL to a specification' -> 'A URI to a specification'.


> - [ISO-8859]: Lots of updates here, please see e.g.
>    http://www.iso.ch/isob/switch-engine-cate.pl?searchtype=general&KEYWORDS=
> http://www.iso.ch/isob/switch-engine-cate.pl?searchtype=general&KEYWORDS=8859


> - [ISO-10646]: Please update this to refer to the 2000 version!

The ISO catalogue doesn't list it yet so I didn't update it. If I
get a reference before publication I'll handle it as an author edit.

Received on Saturday, 8 July 2000 17:37:45 UTC

This archive was generated by hypermail 2.3.1 : Tuesday, 6 January 2015 21:52:17 UTC