Re: character encoding assumptions and approaches

 Matthew, I appreciate your thoughtful comments.

Matthew Dovey wrote:

> ....We have got into a muddle over syntax, schema, character set,
> functional subsets (best term I can come up with for the different
> flavours of MARC21 such as authority, bibliographic, holdings etc.) and
> probably language negotiation (which is distinct but different from
> character set) for retrieving records and perhaps we should re-engineer
> this from scratch with the benefit of hindsight for a Z39.50 Version 4.

I think we can re-engineer it in version 3; that's not what version 4 is
about.  When we've discussed "version 4", it's been about things that we
cannot reasonably extend version 3 to do, like support a completely
different data model, certain query functions, etc.

We're not talking here about any new functionality; the functionality is
there. People just don't like the way it's been engineered.

We could define a "compSpec2" (with associated option bit).  We don't need
selectAlternativeSyntax, dbSpec, or even schema, espec, variant, etc.  We'd
like to specify a character encoding, maybe language, and perhaps a few
additional simple things.

But let's back up and remember what we're really trying to solve.

The pressing issue is the search term. Nobody has expressed an urgent need
to have the encoding problem solved for records.  We had thought that the
option bit would be a good way to solve both problems at the same time. Now
the discussion has taken a number of turns, but the most recent thinking
seems to be that the utf-8 bit, elegant as it seemed, may be too simplistic
because we don't understand what it would apply to (though I think we could
work that out). If we back out of the utf-8 bit though, we're back where we
started, without a way to know how a search term is encoded, so we'll need
to define an attribute (which we decided we didn't need if we have the utf-8
bit).
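
To make the term problem concrete, here is a minimal sketch (not Z39.50
code; the sample term and the two encodings are chosen only for
illustration). It shows that a server receiving nothing but bytes cannot
tell which encoding the client used:

```python
# Illustration only: the sample term and both encodings are assumptions.
term = "Müller"

utf8_bytes = term.encode("utf-8")      # b'M\xc3\xbcller'
latin1_bytes = term.encode("latin-1")  # b'M\xfcller'

# The same term arrives as different byte strings.
same_bytes = utf8_bytes == latin1_bytes  # False

# Guessing wrong in one direction silently yields a different term...
misread = utf8_bytes.decode("latin-1")   # 'MÃ¼ller'

# ...and in the other direction fails outright.
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
```

Either an attribute on the term or a negotiated option bit would tell the
server which decoding to apply.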

I think we have the following alternatives:
1. Just define an attribute and defer the retrieval problem.
2. Define an attribute, and either define a compspec-2 or develop the oid
approach, for retrieval.
3. Just define an option bit, which applies only to the term (defer the
retrieval problem).
4. Define an option bit for the search term, as well as for retrieval, and
a "compspec-2" or oid to override for retrieval.

And there may be other alternatives that I've missed.
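
As a rough sketch of how alternative 4 might behave (all names here are
hypothetical and not taken from the standard): the negotiated bit sets a
session-wide default for both terms and records, and a per-request
override, standing in for "compspec-2" or an oid, applies to retrieval
only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Session:
    # Hypothetical: outcome of negotiating the utf-8 option bit.
    utf8_bit: bool


def term_encoding(session: Session) -> Optional[str]:
    """Encoding of search terms; None means 'not known from the protocol'."""
    return "utf-8" if session.utf8_bit else None


def record_encoding(session: Session,
                    override: Optional[str] = None) -> Optional[str]:
    """Encoding of retrieved records; a per-request override wins."""
    if override is not None:
        return override
    return "utf-8" if session.utf8_bit else None
```

Alternative 3 would be the same sketch with record_encoding left out;
alternatives 1 and 2 would replace the session-wide bit with a value
carried on each term.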

I'd like to add, comparing the compspec-2 and oid approaches: aside from the
question of whether the oid approach is shortsighted (as I think it is), it
isn't going to be any easier to develop or implement than compspec-2.

--Ray

Received on Wednesday, 6 March 2002 15:34:50 UTC