RE: Distributed searches in Z39.50?

Hi

>
>> My curiosity was a little different - I was wondering how to expose
>> a Z39.50 interface instead of a web interface. That is, develop a
>> server that allowed Z39.50 clients to access it where the server
>> forwarded requests on to all the remote servers, did the query
>> translations, record normalization, etc.
>
Just for information - some projects here in the UK (HE/FE) that are 
probably relevant to your discussion:

1) The EDINA GetRef service
http://edina.ac.uk/getref/
(was previously the Xgrain project - 
http://edina.ed.ac.uk/projects/joinup/xgrain/)
This is a broker service providing access to remote Z39.50 and other services.

2) The Subject Portals Project
http://www.portal.ac.uk/spp/

I am a developer on 2). We have developed an xsearch portlet for 2), the 
functionality of which will ultimately be abstracted out, so in essence it 
will be a cross-searching client capable of concurrently searching Z39.50 
(and other types of) targets.  In developing this xsearch portlet we have 
also targeted 1) as a means of including multiple additional targets via a 
broker. There have been lots of issues along the way; these will be 
documented as project outputs, and I believe the source code will be made 
freely available (not sure which license yet!) at the project's end.  The 
project (2) will conclude in August of this year.

Nikki





> Yes, I see your point. That task is a little bit different - read on, as
> I have some interesting information for you about this kind of problem.
>
>> The problem was that I was not sure how to use Z39.50 and get progressive
>> results returned to a client - the client does not want to wait for
>> all responses as that would be too slow. The answer is that there
>> is a way to do it in Z39.50 using resource reports and concurrent
>> operations (but no client exists that uses the capability).
>
> My question is: is it really worth extending and tweaking Z39.50 into some
> kind of metasearch protocol? It is already overcomplicated even with the
> simple 'search' & 'present' operations; additional complexity would not do
> any good.
>
> I would suggest looking at other approaches that are under development
> and will soon be available for you to use. I hope that the NISO MetaSearch
> Initiative will soon produce good specs and guidelines to follow
> (http://www.niso.org/committees/MetaSearch-info.html).
>
>> Is there a benefit over just having the clients do the distributed
>> search directly (like what you have done)? It is not clear to me
>> that there is a benefit. If it was easy to do within the protocol,
>> then client writers might implement something. If it's tricky, I think
>> client writers would never bother (or if they did bother, they would
>> go to the effort you have and do the distribution in the client).
>
> I don't agree with you here. Look at the history of the Internet: are
> HTTP/FTP/SMTP/POP3/HTML/XML and hundreds of other protocols complicated?
> No, they are not. They are as simple as they could ever be - that's why
> they are so widespread and have flourished for many years. Sophistication
> belongs in the applications built on top of the protocols, just as it
> should, like WWW/WebServices/E-Mail clients and so on...
>
>> I was wondering if a simpler protocol approach that does not introduce
>> concurrent operations into the mix (for clients - I don't care about
>> servers) existed. I think a simpler approach may be necessary to
>> result in a simple ZOOM API. As soon as you require multiple threads,
>> async operations, etc, I think one of the goals of ZOOM (simple API
>> for programmers to use) will start to disappear. But maybe there is
>> a way to use concurrent operations etc under a ZOOM API without exposing
>> that complexity to the programmer using the API.
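
(For illustration only - one way that complexity could be hidden, sketched
in Python with a hypothetical blocking search_one(target, query) function
standing in for whatever a real ZOOM implementation would provide; nothing
below is taken from an actual ZOOM API.)

import concurrent.futures

def progressive_search(targets, query, search_one):
    # Fan the query out to all targets on worker threads, but hand results
    # back through a plain, synchronous iterator so the calling code never
    # sees threads or async operations.
    with concurrent.futures.ThreadPoolExecutor(max_workers=len(targets)) as pool:
        futures = {pool.submit(search_one, t, query): t for t in targets}
        for fut in concurrent.futures.as_completed(futures):
            yield futures[fut], fut.result()   # (target, result set) as each finishes

# The caller just writes an ordinary loop and sees results as they arrive:
#
#   for target, hits in progressive_search(targets, "dinosaur", my_search):
#       display(target, hits)
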
>
> Have a look at the test metasearch implementation I've done to try out an
> idea of simple metasearching: http://www.sigla.ru/sru-test.jsp.
> It allows you to query Sigla via an extended SRU protocol and search many
> catalogs at once. The results available so far are returned immediately,
> with an x-finished element at the end. If it is not set to 'true', you
> should query Sigla again to get an updated view of the distributed search
> (don't worry - connection pooling makes sure that old connections are
> reused for that purpose). Then you can take one of the non-empty catalog
> results, type its x-collection value into the form, set the
> 'searchRetrieve' operation, and click 'search'; you will get records in
> MarcXML.
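
(For illustration only - a rough Python sketch of the polling loop described
above. The base URL and the x-finished / x-collection names come from the
message; the remaining parameter names, element paths and the placeholder
catalog id are guesses, since the test interface isn't documented here.)

import time
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "http://www.sigla.ru/sru-test.jsp"   # test endpoint from the message

def fetch(params):
    # GET the endpoint with the given parameters and parse the XML reply.
    url = BASE_URL + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=30) as resp:
        return ET.fromstring(resp.read())

def poll_metasearch(cql_query, interval=2.0):
    # Re-issue the same request until the reply reports x-finished = 'true'.
    # Server-side connection pooling (as described above) means repeated
    # requests simply return an updated view of the same distributed search.
    while True:
        doc = fetch({"operation": "searchRetrieve", "version": "1.1",
                     "query": cql_query})
        yield doc                                  # partial (or final) snapshot
        if doc.findtext(".//x-finished") == "true":
            return
        time.sleep(interval)

# Show progressive snapshots, then drill into one non-empty catalog
# ("SOME-CATALOG-ID" is a placeholder for a real x-collection value).
for snapshot in poll_metasearch('dc.title = "dinosaur"'):
    print(len(list(snapshot.iter())), "elements in this snapshot so far")

marcxml = fetch({"operation": "searchRetrieve", "version": "1.1",
                 "query": 'dc.title = "dinosaur"',
                 "x-collection": "SOME-CATALOG-ID"})
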
>
> As far as I understood from your letter - this is exactly the thing you
> are looking for, but it's not standard - just a test of an idea. Maybe it
> will somehow be standardized in the future.
>
>> But maybe its always going to be tricky - clients have to progressively
>> display results and let users view them. The protocol aspect is only
>> one part of the complete problem that the client writer has to address.
>
> Well, sooner or later you'll have to deal with the nature of the
> distributed search. :) But with the solution I described in the previous
> paragraph you can just reformat the current XML result into an HTML file
> and not bother about progressive display. If you want to update the
> status of the search, just click refresh (of course it's not very
> efficient, but it's simple and it will work).
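
(Again for illustration only - a minimal sketch of turning one XML snapshot
into a static HTML page, assuming 'record' and 'title' element names that
may well differ from the real response schema. The page is simply rewritten,
and the browser reloaded, each time an updated snapshot is fetched.)

import html
import xml.etree.ElementTree as ET

def snapshot_to_html(doc, path="results.html"):
    # 'record' and 'title' are guessed element names; adjust to the schema.
    items = "".join(
        "<li>%s</li>" % html.escape(rec.findtext("title") or "(untitled)")
        for rec in doc.iter("record")
    )
    status = "finished" if doc.findtext(".//x-finished") == "true" else "still searching"
    page = "<html><body><p>Status: %s</p><ul>%s</ul></body></html>" % (status, items)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(page)
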
>
>> Thanks to everyone else who replied too. Interesting stuff.
>
> I'm also very interested in any protocol developments concerning
> metasearching across many heterogeneous sources. It was very interesting
> to read your replies - thanks, everyone!
>
> BR, Alex Khokhlov.
>
>



----------------------
NJ Rogers, Technical Researcher
(Semantic Web Applications Developer)
Institute for Learning and Research Technology (ILRT)
Email: nikki.rogers@bristol.ac.uk
Tel: +44(0)117 9287096 (Direct)
Tel: +44(0)117 9287193 (Office)

Received on Monday, 19 April 2004 08:07:03 UTC