Re: Z39.50 character encoding

From: Mike Taylor <mike@tecc.co.uk>
Date: Wed, 27 Feb 2002 15:01:48 GMT
Message-Id: <200202271501.PAA29712@wells.tecc.co.uk>
To: joe.zeeman@tlcdelivers.com
CC: www-zig@w3.org
> Date: Wed, 27 Feb 2002 08:52:41 -0500
> From: "Johan Zeeman" <joe.zeeman@tlcdelivers.com>
> 
> The problem is that MARC21 currently permits records to be encoded
> using either of 2 mutually incompatible character sets.  You can
> tell from inspection of the record leader what character set is
> being used.

OK, that's fine.  So if a Z39.50 server returns a MARC21 record, then
in providing the MARC21 OID for its EXTERNAL, it's telling the client
(among other things) that the record is in one or other of those two
mutually incompatible character sets, and that it has to inspect the
record leader to find out which one.

That's a part of what the MARC21 OID _means_.
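
As a rough illustration of that inspection step, here is a minimal
Python sketch.  It assumes the MARC21 convention that leader position
09 (the tenth byte of the 24-byte leader) carries the character coding
scheme: blank for MARC-8, 'a' for UCS/Unicode.  The function name is
made up for the example.

```python
LEADER_LENGTH = 24          # a MARC21 leader is always 24 bytes
CODING_SCHEME_OFFSET = 9    # leader position 09: character coding scheme


def marc21_character_set(record: bytes) -> str:
    """Return which of the two MARC21 character sets a record claims to use."""
    if len(record) < LEADER_LENGTH:
        raise ValueError("record is shorter than a MARC21 leader")
    # Slice rather than index so we compare bytes with bytes.
    flag = record[CODING_SCHEME_OFFSET:CODING_SCHEME_OFFSET + 1]
    if flag == b" ":
        return "MARC-8"
    if flag == b"a":
        return "UCS/Unicode"
    raise ValueError(f"unrecognised coding scheme flag {flag!r}")
```

A client would call this on the raw octets it pulled out of the
EXTERNAL, before trying to decode any field data.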

> But Z39.50 provides no mechanism to ask for MARC records to be
> delivered with a specific character encoding (and neither does
> anything else, for that matter).

Well, we could use eSpec to allow a client to say to its server,
"Please give me MARC21 records in the first of the two possible
character sets".  Then the server would serve up a record which,
hopefully, did just that.  But the client would still need to check
the record leader in order to know (just as it needs to check the
EXTERNAL's OID to know whether it's got a MARC21 record at all, as
opposed to a SUTRS record or something).

> I agree with Ray that the OID for the record syntax does not imply a
> character set.

This is only half true.  Character set is _one_ of the many things
that a particular record syntax may specify, along with transfer
syntax, etc.  Or, in MARC21's (hopefully pathological) case, the MARC21
OID restricts which character sets may be used, and says how to tell
which of the options is active.

Bottom line: the EXTERNAL's OID says what kind of object the record
is.  Some kinds of object (e.g. USMARC) specify a character set, and
others (GRS-1) do not.  Those which do, we must respect.  Those which
don't, we are at liberty to mess with: we can think about how we want
to treat those.
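
To make the bottom line concrete, here is a small Python sketch of a
client deciding what to do based on the EXTERNAL's OID.  The OID
values are my recollection of the Z39.50 record syntax registry and
should be treated as assumptions, as should the function and table
names; only the two syntaxes mentioned above are listed.

```python
# Assumed OID values from the Z39.50 record syntax registry.
MARC21_OID = "1.2.840.10003.5.10"    # USMARC / MARC21
GRS1_OID = "1.2.840.10003.5.105"     # GRS-1

# True if the record syntax itself pins down (or restricts) the
# character set; False if it leaves the question open.
SYNTAX_FIXES_CHARSET = {
    MARC21_OID: True,
    GRS1_OID: False,
}


def charset_policy(oid: str) -> str:
    """Say how a client should treat character encoding for this OID."""
    fixes = SYNTAX_FIXES_CHARSET.get(oid)
    if fixes is None:
        return "unknown"      # not a syntax this sketch knows about
    if fixes:
        return "respect"      # the syntax decides; inspect the record
    return "negotiable"       # free to agree on something else
```

For MARC21 the "respect" branch is where the leader check comes in;
for GRS-1 the "negotiable" branch is where any new mechanism would go.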

 _/|_	 _______________________________________________________________
/o ) \/  Mike Taylor   <mike@miketaylor.org.uk>   www.miketaylor.org.uk
)_v__/\  "So this guy in the 'No Smoking' compartment asks me if I
	 mind him smoking, and I say 'Not at all, do you mind if I
	 vomit?'" -- Tony Campolo.
Received on Wednesday, 27 February 2002 10:02:20 UTC