- From: Alan Kent <ajk@mds.rmit.edu.au>
- Date: Fri, 2 Apr 2004 09:44:16 +1000
- To: ZIG <www-zig@w3.org>
Hi all,

<warning>Blue sky idea following.</warning>

One of the problems in implementing a Z39.50 distributed search server is that the search response has to return the exact number of hits. The exact number of records in the final result set has to be returned in the search response because there is no other way to tell the client about a revised figure later. This means a server cannot respond until all of the distributed searches it sends out have returned. The current solution is to put more complexity into the client software by letting it send the query to multiple servers and manage the responses as they come back incrementally.

One solution is to extend Z39.50 with an extra option bit to negotiate the capability for results to be made progressively available. The idea is that if a server responded, for example, with a search status of failure and a result set status of in-progress (a new status not currently defined), the server is continuing the search in the background. A new mechanism would also be made available to retrieve details about existing result sets, allowing at least the set status and size to be retrieved. This could be done, for example, as a new Explain category where you can query for all sets or for a set with a specific name. Or a completely new request/response type could be introduced. This would also allow a client to ask what sets it currently has on a server.

An alternative is to use an Extended Service to submit such search requests. This gives control over canceling the search operation and interrogating its current status. When created, the extended search request would contain a SearchRequest. When fetched, the task package would contain at least a SearchResponse. (Reusing the current ASN.1 constructs would hopefully simplify changing search clients, which already have to construct and pull apart these structures.) The client can determine that the search has finished when the Task Status is completed.
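To make the in-progress idea above a little more concrete, here is a rough Python sketch of the behaviour only. It is not Z39.50 code and uses no real toolkit: the names `Gateway`, `SetStatus`, `backend_replied` and `set_info` are invented for illustration, and the "in-progress" status is the hypothetical new value proposed above, not something the standard defines.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class SetStatus(Enum):
    IN_PROGRESS = "in-progress"  # hypothetical new status from the proposal
    DONE = "done"


@dataclass
class ResultSet:
    name: str
    pending: int   # backend searches that have not replied yet
    hits: int = 0  # hits reported so far; may still grow

    @property
    def status(self) -> SetStatus:
        return SetStatus.IN_PROGRESS if self.pending else SetStatus.DONE


class Gateway:
    """Toy distributed search server (assumption: not a real Z39.50 server)."""

    def __init__(self, backend_count: int) -> None:
        self.backend_count = backend_count
        self.sets: dict[str, ResultSet] = {}

    def search(self, set_name: str) -> None:
        # Register the set and return at once instead of waiting
        # for every backend to answer.
        self.sets[set_name] = ResultSet(set_name, pending=self.backend_count)

    def backend_replied(self, set_name: str, hits: int) -> None:
        # Called as each forwarded search completes in the background.
        rs = self.sets[set_name]
        rs.hits += hits
        rs.pending -= 1

    def set_info(self, set_name: str | None = None) -> list[tuple[str, str, int]]:
        # The "details about existing result sets" lookup:
        # all sets, or one set with a specific name.
        chosen = self.sets.values() if set_name is None else [self.sets[set_name]]
        return [(rs.name, rs.status.value, rs.hits) for rs in chosen]


gw = Gateway(backend_count=2)
gw.search("default")
print(gw.set_info("default"))   # [('default', 'in-progress', 0)]
gw.backend_replied("default", 40)
gw.backend_replied("default", 2)
print(gw.set_info("default"))   # [('default', 'done', 42)]
```

A client would keep asking for the set information (or be told some other way) until the status is no longer in-progress, showing the growing hit count in the meantime.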
Yet another solution is to use, say, OtherInformation in a present request, allowing a 'revised result set size' to be returned when you fetch records from an existing result set.

There are lots of unresolved issues that one can think about. However, to me the first question is: is such a facility going to be of any use in practice? Maybe getting the client software to do the distribution is the best solution. It works today with no protocol change. If a new facility were made available, would any of the clients support it? It moves effort from the client writers to the server writers (which I think is a good principle), but in practice are people still doing active Z39.50 client software development?

My personal belief is that if Z39.50 were being designed again, it would be a logical thing to have included (somehow). But the reality today is that some clients can do this now, so there is no real benefit in trying to do it in Z39.50.

Anyone have a different opinion? Anyone think there is anything more to this than a purely intellectual exercise?

I actually started from the Explain database (we support Explain) and how it may be useful for applications to see what result sets they currently have on the server. In some of our applications we have multiple independent libraries of code firing off queries down a shared connection. Finding out all the sets that exist would be a useful facility for debugging or monitoring what is going on.

Implementing a distributed search Z39.50 gateway also sounded fun to try, but I believe it would work badly at present because all remote servers need to respond before the gateway can respond, and that would be too slow to be acceptable to users. Users want results shown progressively on their screens, as current clients can do.

Also note, there is no reason why the searches being forwarded by the server need to be Z39.50 requests. I am focussing here on the protocol between a client and a distributed search server.
The same issues arise when trying to put a Z39.50 to Google protocol converter in place. The Google web interface changes its mind about how many records exist in response to a search after you start fetching records.

Thanks
Alan
Received on Thursday, 1 April 2004 18:44:21 UTC