Re: Z39.50 character encoding

From: Johan Zeeman <joe.zeeman@tlcdelivers.com>
Date: Tue, 26 Feb 2002 09:40:28 -0500
Message-ID: <00f301c1bed3$88892230$9539910c@unicity.tlcdelivers.com>
To: "Pieter Van Lierop" <pvanlierop@geac.fr>, <a.sanders@mcc.ac.uk>, "zig" <www-zig@w3.org>
But we have more "sophisticated" character set negotiation and hardly
anybody has implemented it.  I think we need to stop crafting perfect
solutions that end up being so complex that only the most dedicated few will
be prepared to implement them (witness Explain, GRS, attribute
architecture).  An option bit is simple, easy to understand, and addresses
80% or more of the need.  Those who require more sophistication can use
existing character set negotiation.

I agree, however, that an option bit cannot constrain the character set of
returned objects.  They are, after all, external to the protocol.  I'm not
sure I understand what the issue is with regard to the character set of
retrieved records.  Could it come from a requirement to be able to ask for
MARC21 records in Unicode rather than in MARC-8?

I vote for the option bit, but without the language about retrieved data.
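
To make the distinction concrete, here is a rough sketch of how an
implementation might treat such a bit: it governs protocol-level strings
(search terms, scan terms, diagnostic text) and leaves retrieved records
alone. All names here (Session, utf8_negotiated, decode_protocol_string,
pass_record_through) are hypothetical and not taken from the standard, and
the cp1252 fallback is only an assumption standing in for whatever legacy
encoding the two sides would otherwise presume.

```python
from dataclasses import dataclass


@dataclass
class Session:
    # Hypothetical flag: True when origin and target both set the
    # proposed utf-8 option bit during Init.
    utf8_negotiated: bool


def decode_protocol_string(raw: bytes, session: Session,
                           legacy_encoding: str = "cp1252") -> str:
    """Decode a protocol-level string such as a search or scan term.

    The legacy default (cp1252) is an assumption for illustration only.
    """
    encoding = "utf-8" if session.utf8_negotiated else legacy_encoding
    return raw.decode(encoding)


def pass_record_through(record: bytes, session: Session) -> bytes:
    """Retrieved records are external to the protocol.

    The option bit is deliberately ignored here: the record syntax (OID)
    decides the record's character set, so the bytes are left untouched.
    """
    return record


if __name__ == "__main__":
    utf8_session = Session(utf8_negotiated=True)
    legacy_session = Session(utf8_negotiated=False)
    print(decode_protocol_string("café".encode("utf-8"), utf8_session))
    print(decode_protocol_string("café".encode("cp1252"), legacy_session))
```

The point of the second function is only that the bit never touches record
bytes; how a client asks for Unicode MARC21 is a separate question.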

j.

[second time sending this - first time went to Pieter only!]

----- Original Message -----
From: "Pieter Van Lierop" <pvanlierop@geac.fr>
To: <a.sanders@mcc.ac.uk>; "zig" <www-zig@w3.org>
Sent: Tuesday, February 26, 2002 4:55 AM
Subject: RE: Z39.50 character encoding


> I think the content of a record is determined by the OID, not by Z39.50. As far
> as I know, record syntaxes come with their own character sets (UNIMARC with
> Extended Latin, MARC21 with ALA). This is probably not always true, but I
> really think that we in the ZIG should not involve ourselves with the returned
> records unless they are private to Z39.50, like SUTRS.
>
> Geac is very much interested in this discussion. Currently we assume that
> all clients use Windows and therefore send ANSI characters. We also assume
> that the client prefers to receive ANSI (in Scan, for example). Of course
> this is a bad solution, but so far it works.
>
> I don't think a new attribute is a good solution because the problem is
> everywhere: ImplementationId/ImplementationName, Scan, CloseReason,
> Additional Diagnostic Information, ...
> I don't think an option bit is a good solution either. I really think we
> need a more sophisticated form of character set negotiation.
>
> We could discuss this further offline with the people who have an interest in
> the character set discussion.
>
>
> Pieter van Lierop
> Geac
>
> > -----Original Message-----
> > From: Ashley Sanders [mailto:zzaascs@irwell.mimas.ac.uk]
> > Sent: Tuesday, 26 February 2002 10:12
> > To: zig
> > Subject: Re: Z39.50 character encoding
> >
> >
> > Ray Denenberg wrote:
> >
> > > (a) Assign an option bit for utf-8 encoding.
> > > (b) Define an attribute for the encoding of a
> > > search term.
> > > (c) Do both.
> >
> > I guess I prefer an option bit. However...
> >
> > > Option bit
> > > If this bit is negotiated it would pertain to
> > > retrieved data as well as the search term.
> >
> > ...if a utf-8 option is negotiated between origin
> > and target, is it possible for me to subsequently
> > return a UKMARC (or other national format) record?
> > MARC21 is not a problem as you can have Unicode
> > MARC21 records, but UKMARC records "use an
> > extended ASCII (8 bit) character set". There
> > is no MARC leader byte in UKMARC to indicate
> > an alternative unicode/utf-8 encoding.
> >
> > Would a request for a particular record syntax
> > override any utf-8 option bit set at Init time?
> >
> > Ashley.
> >
> > --
> > Ashley Sanders                                a.sanders@mcc.ac.uk
> > COPAC: A public bibliographic database from MIMAS, funded by JISC
> >              http://copac.ac.uk/ - copac@mimas.ac.uk
> >
Received on Tuesday, 26 February 2002 09:44:04 UTC
