Re: character encoding assumptions and approaches

Pieter Van Lierop wrote:

> 1. I still think that the Z39.50 protocol should not bother with the
> contents of anything that is not defined in the Z39.50 protocol. For example
> a MARC record. From the point of view of a MARC record, Z39.50 is only a
> transport mechanism. The MARC syntaxes have their own committees, standards,
> protocols, traditions, national standards, international standards: we
> should not bother with that.

Sorry, Pieter, I don't follow. I'm not sure what "bother" means in this context.
Z39.50 should "enable", not "bother".  The protocol should facilitate
enforcement of the rules.


> 2. The character set agreement that we are discussing does not only imply to
> the search term, but to all fields defined as "International String". Is
> this correct or not?

No. The current thread of this discussion focuses on MARC records, and they go
as EXTERNAL.  The agreement we're discussing is that if UTF-8 is negotiated, and
if a server has a record to transfer that is or includes text (i.e.
characters), and if the negotiated UTF-8 has not been overridden for that
record, then the server will transfer it in UTF-8.
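
To make the three conditions concrete, here is a minimal sketch of the rule as
a predicate. The function and parameter names are hypothetical, invented for
illustration only; nothing here is defined by Z39.50, and how an override would
actually be signalled is left open.

```python
def must_send_utf8(utf8_negotiated: bool,
                   record_has_text: bool,
                   overridden_for_record: bool) -> bool:
    """Return True when the agreement obliges the server to send UTF-8.

    All three parameters are illustrative assumptions:
      utf8_negotiated       -- UTF-8 was agreed during negotiation
      record_has_text       -- the record is, or includes, character data
      overridden_for_record -- the negotiated UTF-8 was overridden for
                               this particular record
    """
    return utf8_negotiated and record_has_text and not overridden_for_record


# A MARC record with text, UTF-8 negotiated, no override: send UTF-8.
print(must_send_utf8(True, True, False))

# Same record, but the negotiated UTF-8 was overridden: the rule does not apply.
print(must_send_utf8(True, True, True))
```

When the predicate is False, the agreement simply says nothing about the
record's encoding; it does not imply any particular alternative.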

>
> This means that, amongst others, the following fields are to be considered:
> ImplementationId, ImplementationName, ResultSetName/ResultSetId,
> DatabaseName, AdditionalInfo (in a diagnostic), ElementSetName, DisplayTerm
> (in Scan)

For those we have called "message strings" -- additionalInfo in a diagnostic,
elementSetName -- I would say yes. For those we have called "name strings", I
don't think it matters.

> Actually, the Term (in Search and Scan) is generally considered to be an
> OCTET STRING. I believe that most client applications send it as an OCTET
> STRING.
> Does that mean that when the client application sends Term as an OCTET
> STRING, the character set agreement does *not* apply?

I don't know. It's a good question and we need to give it some thought.


--Ray

Received on Wednesday, 6 March 2002 13:28:16 UTC