Record Source schema -- feedback needed from Ray Denenberg on 2000-09-25 (www-zig@w3.org from September 2000)

From: Ray Denenberg <rden@loc.gov>
Date: Mon, 25 Sep 2000 16:03:32 -0400
To: "ZIG List (at W3C)" <www-zig@w3.org>
Message-ID: <39CFAF94.E3A8CCBC@rs8.loc.gov>

I need feedback on the Record Source schema.
http://lcweb.loc.gov/z3950/agency/defns/recsrc.html

I asked for feedback a month or so ago and got none. I will interpret a lack of
response now to mean that nobody cares about it any longer.

If you do care about it but think that further discussion and development should
be deferred, say, until the December ZIG meeting, when we will have a chance to
discuss related issues -- distributed searching and revising the Z39.50 URL --
please say so.

The issue I've raised, and want to reiterate, is this:

Suppose two (or more) URIs are supplied by a surrogate record. Does this mean:

(a) there are multiple URIs available to access this record (i.e. they all
identify the same source record, at the same server); or

(b) the intermediary decided that two (or more) records on *different* source
servers are duplicates (leaving aside for now what we mean by "duplicate"; see
the discussion at the bottom of this note) and so is supplying, within the
single surrogate record, URIs to these duplicate records on different servers.

The difference is that in (a) you give the client a choice of how to access a
given record, while in (b) you give the client the choice of where to access the
record from.

Clearly we want to accommodate case (a), so the question is: do we also want to
accommodate case (b)?

If not -- no problem, multiple URIs would always be interpreted to mean (a).
But if we do want to accommodate (b), there is currently no way to distinguish
the two cases: when there are multiple URIs, how do you know whether this means
(a) or (b) (or a combination)?

This isn't a major technical problem and can be handled with a minor enhancement
to the schema (which would be appropriate to do now since it hasn't been
finalized) but I want to hear from someone, first, that this is a desired
feature.
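
To make the distinction concrete, here is a rough sketch in Python of one way
the enhancement might look. It is not taken from the Record Source schema; the
names SourceRecord, SurrogateRecord and interpret, and the example URIs, are
all hypothetical. The assumption is simply that URIs are grouped by source
record: several URIs inside one group is case (a), several groups is case (b).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourceRecord:
    """Hypothetical: one source record on one server.

    Several entries in `uris` are alternative ways to reach the same
    record at the same server, i.e. case (a).
    """
    uris: List[str]


@dataclass
class SurrogateRecord:
    """Hypothetical: the surrogate record supplied by the intermediary.

    Several entries in `sources` mean the intermediary considers records
    on different servers to be duplicates, i.e. case (b).
    """
    sources: List[SourceRecord] = field(default_factory=list)


def interpret(record: SurrogateRecord) -> str:
    """Report whether the record reflects case (a), (b), or a combination."""
    multiple_sources = len(record.sources) > 1
    multiple_uris = any(len(s.uris) > 1 for s in record.sources)
    if multiple_sources and multiple_uris:
        return "combination"
    if multiple_sources:
        return "b"
    if multiple_uris:
        return "a"
    return "single"


# Placeholder URIs for illustration only; they do not point at real servers.
case_a = SurrogateRecord([
    SourceRecord(["z39.50r://one.example.org/db?id=1",
                  "http://one.example.org/records/1"]),
])
case_b = SurrogateRecord([
    SourceRecord(["z39.50r://one.example.org/db?id=1"]),
    SourceRecord(["z39.50r://two.example.org/db?id=77"]),
])

print(interpret(case_a))
print(interpret(case_b))
```

With a flat list of URIs, case_a and case_b above would look the same to the
client (two URIs each); the grouping is what tells them apart.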

Now to the question of duplicate detection itself.  There are two different ways
that duplicate detection can happen: (1) unilaterally by the intermediary, and
(2) by the intermediary in response to a Duplicate Detection request.  The
second raises questions for the Duplicate Detection service:  Currently, the
Duplicate Detection service assumes point-to-point, single server. What are the
implications for the Duplicate Detection service where the server is an
intermediary? It may be as simple as defining an appropriate Cluster syntax,
transparent to the Duplicate Detection service, but it would take some study.

In the "unilateral" case: if we develop a Cluster syntax, can we make it part
of the schema, so that duplicate and cluster information can be returned even
though a Duplicate Detection request was never issued?



--
Ray Denenberg
Library of Congress
rden@loc.gov
202-707-5795

Received on Monday, 25 September 2000 16:03:05 UTC