Re: character encoding ISO 2022 from Ian Ibbotson on 2002-03-13 (www-zig@w3.org from March 2002)

From: Ian Ibbotson <ian.ibbotson@k-int.com>
Date: 13 Mar 2002 10:46:44 +0000
To: Pieter Van Lierop <pvanlierop@geac.fr>
Cc: zig <www-zig@w3.org>
Message-Id: <1016016404.2456.15.camel@lynx.internal.k-int.com>

Well, I'd avoided commenting because this whole area seems so
complicated and there already seem to be many alternatives... But...
Adam from Index Data and I were lucky enough to spend some time working
in Thailand on cp874 (Thai) character set based systems last week and,
in the end, I think we both thought that a simple sequence of proposed
character set names, and a response containing the one to use, might be
by far the simplest and most flexible approach (given that in our local
situation we had to negotiate to 8-bit cp874).

I guess this is nice, because sending "UTF-8" as the only entry in a
list and getting back "UTF-8" as a response is reasonably close to the
init option bit?
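
To make the idea concrete, here is a rough sketch in Python of the
"list of names in, one name back" exchange. It is only an illustration:
choose_charset and normalize are made-up names, not part of Z39.50 or
of any toolkit, and I am assuming Python's codecs registry is good
enough for treating spellings like "UTF-8" and "utf8" as the same
character set.

```python
import codecs


def normalize(name):
    """Return Python's canonical codec name, or None if it is unknown."""
    try:
        return codecs.lookup(name).name
    except LookupError:
        return None


def choose_charset(proposed, supported):
    """Return the first proposed name the target supports, else None.

    `proposed` is the origin's list in order of preference; `supported`
    is whatever the target is able to handle. Both are hypothetical
    inputs for this sketch.
    """
    supported_canonical = {normalize(s) for s in supported} - {None}
    for name in proposed:
        if normalize(name) in supported_canonical:
            return name  # echo the origin's own spelling back
    return None


# An origin offering only UTF-8 to a target that handles it gets
# "UTF-8" back, which is the case that resembles the init option bit.
answer = choose_charset(["UTF-8"], ["utf8", "cp874"])
```

An origin that would prefer UTF-8 but can live with cp874 simply sends
both names in that order, and a target that only speaks cp874 answers
"cp874". If nothing matches, the sketch returns None and the two sides
are no worse off than before.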

Ian.

On Wed, 2002-03-13 at 10:28, Pieter Van Lierop wrote:
> Please forgive my ignorance but what is ISO 2022 exactly?
> 
> The choice in the character set negotiation is between:
> ISO2022
> ISO10646
> Private
> 
> ISO2022, as I understand it, is an encapsulation of all classic 7-bit and
> 8-bit character sets.
> How many applications use ISO2022?
> How do I say I send ASCII, or Latin-1?
> 
> Wouldn't it be better, instead of ISO 2022, to make an (extensible) list of
> the character sets used? We could give them OIDs.
> I think we need the following:
> 
> ASCII
> Extended ASCII
> ANSI
> ALA
> Latin-1
> Extended-Latin
> ...
> 
> probably a few more
> 
> 
> Pieter van Lierop
> 
> > -----Original Message-----
> > From: LeVan,Ralph [mailto:levan@oclc.org]
> > Sent: Tuesday, 12 March 2002 20:59
> > To: zig
> > Subject: RE: back to character encoding
> > 
> > 
> > Yes!  It should apply to the OctetString version of Term.
> > (It does in my server.)
> > 
> > Ralph
> > 
> > > -----Original Message-----
> > > From: Ray Denenberg [mailto:rden@loc.gov]
> > > Sent: Tuesday, March 12, 2002 9:56 AM
> > > To: zig
> > > Subject: Re: back to character encoding
> > > 
> > > 
> > > Pieter Van Lierop wrote:
> > > 
> > > > I agree. But I would suggest to explicitly include Search Term
> > > > when it is defined as OCTET STRING.
> > > 
> > > I understand your concern, Pieter, but I don't see any easy way to
> > > accomplish this with the existing character set negotiation
> > > definition, short of amending it (again!). If there is popular
> > > support for doing this, folks need to speak up.
> > > 
> > > --Ray
> > > 
> > > 
> > 
> 
> 
-- 
Ian Ibbotson (ian.ibbotson@k-int.com)
Knowledge Integration Ltd
Sheffield Science & Technology Parks
Cooper Buildings
Arundel Street
Sheffield
S1 2NS
http://www.k-int.com

Received on Wednesday, 13 March 2002 05:47:16 UTC