
Re: 'x-' prefix on charset names

From: Dan Chiba <dan.chiba@oracle.com>
Date: Tue, 22 Oct 2002 11:33:01 -0700
Message-ID: <3DB599DD.427952F2@oracle.com>
To: Martin Duerst <duerst@w3.org>
CC: www-international@w3.org

Hello Martin, 

Thank you very much for your clarification. Could you comment 
in a little more detail, please? 

There is no doubt that unregistered names have only limited use, 
and that other options are encouraged. That said, if one did need 
to choose option c, would W3C suggest using a raw unregistered 
name rather than a name prefixed with x-? 

Major specifications such as HTTP and XML do not completely 
prohibit using unregistered or arbitrary names. I was not sure 
whether the passage I cited implies that option c1 is recommended 
over c2, after the preferred options a and b. 

 c1. Use an arbitrary charset name without an x- prefix
 c2. Use the 'x-' convention
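
To make the distinction between the options concrete, here is a minimal
sketch that sorts a charset name into "registered", c2 ('x-' prefix), or
c1 (raw unregistered name). The function name and the REGISTERED_SAMPLE
set are my own illustrative assumptions; the set is only a tiny subset of
names that appear in the IANA registry, not the registry itself.

```python
# Illustrative subset only (assumption): the real IANA registry is far larger.
REGISTERED_SAMPLE = {"utf-8", "iso-8859-1", "shift_jis", "euc-jp"}


def classify_charset_name(name, registered=REGISTERED_SAMPLE):
    """Classify a charset name against a set of registered names.

    Charset names are compared case-insensitively.
    """
    key = name.strip().lower()
    if key in registered:
        return "registered"              # options a / b
    if key.startswith("x-"):
        return "unregistered-x-prefix"   # option c2
    return "unregistered-raw"            # option c1


if __name__ == "__main__":
    for label in ("UTF-8", "x-hiero-test-1", "hiero-test-1"):
        print(label, "->", classify_charset_name(label))
```

A real check would need the full registry (including aliases) rather than
a hard-coded sample.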


Martin Duerst wrote:
> Hello Dan,
> Many thanks for your question.
> At 14:11 02/10/21 -0700, Dan Chiba wrote:
> >Hello,
> >
> >I have a question regarding the 'x-' convention used to
> >indicate that a charset is not registered at the IANA registry.
> >Is it prohibited to use an unregistered charset at one's own risk?
> >
> >According to the latest CharMod paper, the convention is
> >discouraged as follows (Excerpt from Section 3.6.2):
> >
> >   [S] The 'x-' convention for unregistered character encoding
> >   names SHOULD NOT be used, having led to abuse in the past.
> >   ('x-' was used for character encodings that were widely used,
> >   even long after there was an official registration.)
> >
> >My question is about the intent of this. If an unregistered
> >charset is used, one is forced to avoid the convention
> >for compliance. I think there are good reasons to avoid it, but
> >what options should one take instead?
> >
> >Among the following viable alternatives that I can think of, I
> >understand that W3C recommends options a and b.
> >
> >  a. Use a registered charset instead (May or may not be feasible)
> >  b. Get the charset registered (May take time)
> >  c. Use the unregistered charset (Need bilateral agreement)
> >
> >It is not clear to me if W3C intends to prohibit option c. Could
> >somebody clarify the intent, please?
> I think your reading of what the Character Model says is correct.
> Option c) is not completely prohibited, but I think the cases
> where it could be used are very limited. I can imagine the
> following:
> - Some researchers are working on an encoding for Egyptian Hieroglyphs.
>    They want to work out the details before registering. So they
>    create something like x-hiero-test-1, x-hiero-test-2, and so on.
>    Once they think they know what they need, they register it, and
>    use the registered name.
> - A company wants to test their software with dummy data, and dummy
>    'charset's, e.g. to check how they can upgrade their software to
>    deal with new 'charset's. In this case, using x-dummy-1,... would
>    come in handy.
> There may be other, similar cases. But in general, go for a) or b).
> b) may indeed take some time, but it can be as short as two weeks
> (a minimum period of 2 weeks is necessary to give everybody a
> chance to comment).
> Regards,   Martin.
Received on Tuesday, 22 October 2002 14:34:41 UTC
