RE: [WSTF] My view... [long]

I agree that charset has no place in a locale. I think we're beating the
proverbial dead horse here.

There may be a place for charset for interoperability purposes. The question
is whether we can make a sensible compromise (something that meets some kind
of need) or if charset negotiation is beyond the scope of our current
efforts. I merely proposed it as "one of the things we might need to
exchange". We should decide how to deal with it.
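
To make "one of the things we might need to exchange" concrete, here is a
rough Python sketch that pulls the charset out of a POSIX-style identifier
so it can travel as a separate field. The function name and the dictionary
keys are hypothetical, invented for illustration; the only assumption about
the input is the usual language[_territory][.codeset][@modifier] shape.

```python
def split_locale(tag):
    """Split a POSIX-style locale string into base, codeset and modifier.

    Hypothetical helper for illustration only. Assumes the shape
    language[_territory][.codeset][@modifier]; no validation is done.
    """
    # The modifier, if present, follows the first "@".
    rest, _, modifier = tag.partition("@")
    # The codeset (charset), if present, follows the first ".".
    base, _, codeset = rest.partition(".")
    return {
        "base": base,                  # language and territory only
        "codeset": codeset or None,    # charset, kept apart from the locale
        "modifier": modifier or None,
    }


parts = split_locale("zh-CN.GBK@pinyin")
print(parts["base"], parts["codeset"], parts["modifier"])
```

A receiver could then use the base for cultural conventions and treat the
codeset, if it is sent at all, as an independent hint.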

It may be that "charset considered harmful" would be a better focus for our
efforts here. Another possibility would be to make a more definitive
statement about the use of Unicode in Web services. If Web services are to
insulate their callers from the implementation details of the back-end,
there should probably be a more definitive statement about how to do that.
Sort of a "best practices when dealing with charsets".

If we did it that way, the problems cited elsewhere would still exist, but
the solution would be implementation-defined in a way that hides the details
from "public view".

In any event, we need a) some usage scenarios to show what the problems
might be and b) some proposed requirements for dealing with them. If
contributors (including myself) can be found, then we can make progress on
this issue. Otherwise it is a random aside... but I suspect that this issue
will have to be dealt with at some point. Better for us to be prepared for
it.
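
One candidate scenario, sketched in Python: the same three strings sorted
with a "binary" collation come out in a different order depending on the
charset. This assumes "binary" means comparing the encoded bytes, which is
my reading of the case quoted below; the function name and sample strings
are made up for illustration.

```python
def binary_sort(strings, charset):
    """Sort strings by their encoded byte sequences in the given charset.

    Illustrative only: models a "binary" collation as plain byte-wise
    comparison of the encoded form.
    """
    return sorted(strings, key=lambda s: s.encode(charset))


words = ["中", "文", "啊"]

# GBK lays out its common hanzi roughly by pinyin, while UTF-8 byte order
# follows Unicode code point order, so the two results disagree.
gbk_order = binary_sort(words, "gbk")
utf8_order = binary_sort(words, "utf-8")

print(gbk_order)
print(utf8_order)
print(gbk_order != utf8_order)
```

If the caller sorts under GBK and the service behind it sorts under UTF-8,
the two sides can disagree on order without either one reporting an error.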

thanks,

Addison

> -----Original Message-----
> From: Mark Davis [mailto:mark.davis@jtcsv.com]
> Sent: Monday, February 03, 2003 12:02 PM
> To: Addison Phillips [wM]; Paul Deuter; Martin Duerst
> Cc: public-i18n-ws@w3.org; debasish@us.ibm.com
> Subject: Re: [WSTF] My view... [long]
>
>
> I still don't see the need for the charset; and it adds more opportunity for
> things to go wrong. In any event, if your POSIX implementation receives a
> locale tag without a charset, it must have a backup; the same happens if it
> receives a locale tag with an unsupported combination, like en_US@SJIS.
> Moreover, since the IANA charset name is not unambiguous, as we all know,
> you have the issue that a locale id might be misidentified.
>
> The charset is really an orthogonal attribute, with nothing to do with
> cultural conventions.
>
> Mark
> ________
> mark.davis@jtcsv.com
> IBM, MS 50-2/B11, 5600 Cottle Rd, SJ CA 95193
> (408) 256-3148
> fax: (408) 256-0799
>
> ----- Original Message -----
> From: "Martin Duerst" <duerst@w3.org>
> To: "Addison Phillips [wM]" <aphillips@webmethods.com>; "Paul Deuter"
> <PaulD@plumtree.com>
> Cc: "Mark Davis" <mark.davis@jtcsv.com>; <public-i18n-ws@w3.org>;
> <debasish@us.ibm.com>
> Sent: Friday, January 31, 2003 11:32
> Subject: Re: [WSTF] My view... [long]
>
>
> > At 09:46 03/01/31 -0800, Addison Phillips [wM] wrote:
> >
> > >Okay, so my Java program instantiates the locale "zh-CN"... was anyone
> > >hurt by this? The charset isn't actually a locale member in Java and has
> > >no meaning in Java. It might be used to affect a byte-oriented interaction
> > >(probably in a very negative way).
> > >
> > >What about a POSIX system? It's running "zh-CN.GBK@pinyin". Does it hurt
> > >to omit some of that information when invoking remote Web services? What
> > >happens if that invoked Web service is (a wrapper around) another POSIX
> > >program? Does the loss of information affect the outcome?
> > >
> > >I'm not sure, but it might if the collation is "binary" and the charset is
> > >changed from GBK to UTF-8.
> >
> > Hello Addison,
> >
> > Would it be possible for you to describe some usage scenarios
> > where the answer to your questions 'does it hurt?' is 'yes'?
> >
> > I think if we have concrete usage scenarios, that will help
> > move our document forward and focus the discussion.
> >
> > Regards,    Martin.
> >

Received on Monday, 3 February 2003 19:19:09 UTC