
Re: Requesting XML records via Z39.50

From: Edward C. Zimmermann <edz@bsn.com>
Date: Tue, 27 Jan 2009 13:38:51 +0100
To: "Ray Denenberg, Library of Congress" <rden@loc.gov>, Archie Warnock <warnock@awcubed.com>
Cc: www-zig@w3.org
Message-Id: <20090127082301.M95703@nonmonotonic.net>

On Mon, 26 Jan 2009 14:24:24 -0500, Archie Warnock wrote
> Ray Denenberg, Library of Congress wrote:
> > I would like to revisit the implementor agreement on "Requesting XML
> > Records",  http://www.loc.gov/z3950/agency/agree/request-xml.html, as it
> > has been many years since it we've discussed it, and it does seem to
> > warrant some clarification.
> 
> And note that the link in that page
> 
> (http://www.loc.gov/z3950/agency/zing/srw/records.html) is no longer 
> valid.
> 
> > Briefly,  to retrieve records according to a specific XML schema using
> > Z39.50 (if you DON'T want to use compSpec):
> > 1. XML is specified as the record syntax,  specifically 'xml-b':
> > 1.2.840.10003.5.112.
> > 2. The schema identifier is specified as the element set name.
> 
> Somehow I missed the original Implementor's Agreement and Isite has been
> happily chugging along without it.  We don't use compSpec in Isite 
> but the majority of uses are homogeneous enough that we haven't had 
> to (nor been asked to) rely on the agreement.  The old XML OID 
> (we've been using
> 1.2.840.10003.5.109.10) is sufficient for us - I just return the

Given that we share a lot of genes with Isite/Isearch (just heavily
mutated over the years into a new species), I've been using these as
well:
  {sgmlRecordSyntax,                           "1.2.840.10003.5.109.9"},
  {XmlRecordSyntax,                            "1.2.840.10003.5.109.10"},
  {applicationXMLRecordSyntax,                 "1.2.840.10003.5.109.11"},
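
For anyone who wants to play with these, here is a minimal sketch (in
Python, not our actual code) of the same table with a reverse lookup. Only
the three names and OIDs come from the table above; the function name is a
hypothetical one for illustration.

```python
# Illustrative only: the names and OIDs are copied from the table above.
RECORD_SYNTAX_OIDS = {
    "sgmlRecordSyntax": "1.2.840.10003.5.109.9",
    "XmlRecordSyntax": "1.2.840.10003.5.109.10",
    "applicationXMLRecordSyntax": "1.2.840.10003.5.109.11",
}

# Reverse map so a requested OID can be resolved to a known syntax name.
OID_TO_NAME = {oid: name for name, oid in RECORD_SYNTAX_OIDS.items()}


def record_syntax_name(oid):
    """Return the syntax name for a requested OID, or None if unknown."""
    return OID_TO_NAME.get(oid.strip())
```

An OID that is not in the table (xml-b, 1.2.840.10003.5.112, for example)
simply comes back as None in this sketch.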

> only XML we know about - ie, the record we ingested.  This works reasonably

I, however, don't just deliver the record as ingested (unless, of course, the
record ingested was already XML or SGML). Where possible, I create on the fly
an XML "representation" (or one in whatever Record Syntax was requested) of
the record content as "stored", which in turn depends on how it was
"ingested". The XML Schemas/DTDs are, of course, only those we have wired in.
For conversions to other Schemas/DTDs, or to other formats or variations
beyond the Record Syntaxes we handle, we have a design that allows external
methods/programs/functions to be specified. This has gotten a lot of use for
HTML within workflow solutions, but I don't think anyone has bothered to use
it to get a record based on a different XML Schema/DTD from the one normally
returned. My wired-in XML for eMail is different from MarkLogic's, but in
typical use I don't think it matters to anyone other than us (and them). I
don't quite see the business case for search-time arbitrary schema conversion
requests by users. Nobody has ever asked for it, and we've had some pretty
wild (and ill-thought-out) requests over the years :-)
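
To make the idea concrete, here is a rough sketch (Python, and emphatically
not our implementation) of wired-in conversions with externally registered
ones taking precedence. The record layout, the function names, and the use
of one of our private OIDs as the key for an external converter are all
assumptions made up for the example.

```python
from xml.sax.saxutils import escape

XML_OID = "1.2.840.10003.5.109.10"  # from the table earlier in this message


def builtin_xml(record):
    """Wired-in conversion: hypothetical flat record (dict) to simple XML."""
    fields = "".join(
        "<{0}>{1}</{0}>".format(name, escape(value))
        for name, value in record.items()
    )
    return "<record>" + fields + "</record>"


BUILTIN = {XML_OID: builtin_xml}  # conversions compiled into the server
EXTERNAL = {}                     # conversions configured from outside


def register_external(oid, converter):
    """Attach an external method/program/function to a record syntax OID."""
    EXTERNAL[oid] = converter


def present(record, requested_oid):
    """Render a stored record in the requested record syntax."""
    converter = EXTERNAL.get(requested_oid) or BUILTIN.get(requested_oid)
    if converter is None:
        raise LookupError("unsupported record syntax: " + requested_oid)
    return converter(record)
```

Usage would look something like
register_external("1.2.840.10003.5.1000.34.3", my_converter), followed by
present(record, "1.2.840.10003.5.1000.34.3"). A request for a syntax with
neither a wired-in nor an external converter is refused.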

The element set name in our case is the element set (or path to a fragment)
and NOT the schema identifier. In projects where we have needed multiple
schemas, we have deployed private OIDs like 1.2.840.10003.5.1000.34.3,
1.2.840.10003.5.1000.34.4.1, 1.2.840.10003.5.1000.34.4.2,
1.2.840.10003.5.1000.34.4.3 ...

Paths/fragments etc. are important to us since the unit of recall is not
always the record: it can also be a path, either specified by the user
(directions or paths) or determined heuristically at search time.
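
A small sketch of that division of labour (Python, illustrative only): the
record syntax OID picks the schema, and the element set name picks the whole
record or a fragment. The schema labels, the function names, and the
ElementTree path syntax are assumptions for the example; "F" is the usual
"full" element set name.

```python
import xml.etree.ElementTree as ET

# Private OIDs as mentioned above; the schema labels are made up.
SCHEMA_BY_OID = {
    "1.2.840.10003.5.1000.34.3": "schema-a",
    "1.2.840.10003.5.1000.34.4.1": "schema-b",
}


def schema_for(oid):
    """The record syntax OID, not the element set name, selects the schema."""
    return SCHEMA_BY_OID.get(oid)


def select(record_xml, element_set_name):
    """Return the whole record for "F", else the fragments matching a path."""
    root = ET.fromstring(record_xml)
    if element_set_name == "F":
        return [ET.tostring(root, encoding="unicode")]
    # Anything else is treated as a path relative to the record root.
    return [ET.tostring(element, encoding="unicode")
            for element in root.findall(element_set_name)]
```

So a request carrying the element set name "body/p" would get back just the
matching fragments rather than the record as a whole.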

> well since the schema is usually either known or agreed to a priori 
> or included in the XML anyway, in which case the returned XML is 
> self-documenting and it's up to the requesting client what to do 
> with it.
As was the original intent :-)


--

Edward C. Zimmermann, NONMONOTONIC LAB
Basis Systeme netzwerk, Munich Ges. des buergerl. Rechts
Office Leo (R&D):
   Leopoldstrasse 53-55, D-80802 Munich,
   Federal Republic of Germany
http://www.nonmonotonic.net
Umsatz-St-ID: DE130492967
Received on Tuesday, 27 January 2009 14:41:41 GMT
