Internet Client Search Freeware

Eliot Christian (echristi@usgs.gov)
Fri, 13 Oct 1995 05:21:16 -0400


Message-Id: <9510131016.AA24162@mocha.bunyip.com>
Date: Fri, 13 Oct 1995 05:21:16 -0400
To: gils@cni.org
From: Eliot Christian <echristi@usgs.gov>
Subject: Internet Client Search Freeware

We are announcing an alpha release of freeware for searching from a World
Wide Web client. This software uses the standard search protocol
known in the U.S. as ANSI Z39.50 and internationally as ISO 10163. (Support
for searching via the WAIS protocol, sometimes referred to as Z39.50 version
1, should be available soon.)

This software is intended to address a common problem when people go looking
for information--the sources all have different ways of searching. Some are
set up for novices, but these can be frustrating because they're not very
precise. Others are more precise, but you need information specialist skills
to use them. And, if you don't happen to be skilled in English, there are
few sources accessible in your native language.

These difficulties in searching for information are not well addressed even
if you limit your searching to the current Internet-accessible servers. If
all you have is a Web browser that understands telnet, FTP, gopher, and
HTTP, you have very limited control of your searching. 

There is a great need for a common search protocol that servers support
ubiquitously, and that intelligent clients can use. Then the client can do a
much better job of gathering information in ways that fit the particular
needs of specific searchers.

One solution is to use the Z39.50 client/server search protocol. This is the
protocol already supported by hundreds of library databases offering access
to billions of dollars' worth of bibliographic catalogs. It is also the
protocol required for use by all U.S. Federal government agencies under
44 U.S.C. 3511, which established the Government Information Locator
Service (see http://www.usgs.gov/gils). Other massive applications using
Z39.50 are also in progress.
   
This initial freeware implementation is constructed as an add-on to the
Netscape WWW browser in Windows (Windows 3.11, Windows 95, and Windows NT).
We anticipate that other developers will extend this client approach to
other WWW browsers, make similar clients for the Macintosh and Unix
platforms, and take advantage of CORBA and Java technologies. 

Using Microsoft's OLE (object linking and embedding), this software is
invoked within the WWW browser in response to a URL that refers to a
supported search protocol (the software registers the protocols "wais",
"z3950r", "z3950s", and "search"). As pointed out by Jim Perkins at the
Library of Congress, using OLE opens up the possibility of embedding
Internet search functions into all sorts of applications (e-mail,
spreadsheets, databases, travel planners, ...). After all, people don't
search for information as an end in itself; they are usually trying to
accomplish some other goal.
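
As a rough illustration of how a URL could be routed to the search
software, here is a minimal Python sketch. Only the four scheme names come
from this announcement; the function names, the example URLs, and the idea
of returning a handler label are assumptions made for illustration, and the
actual alpha software does this through OLE registration rather than code
like this.

```python
from urllib.parse import urlsplit

# Scheme names are taken from the announcement; everything else here is
# a hypothetical illustration, not the alpha software's own code.
SEARCH_SCHEMES = {"wais", "z3950r", "z3950s", "search"}


def is_search_url(url: str) -> bool:
    """Return True when the URL's scheme is one the search client registers."""
    # urlsplit() lowercases the scheme, so "WAIS://..." also matches.
    return urlsplit(url).scheme in SEARCH_SCHEMES


def route(url: str) -> str:
    """Name the component that should handle the URL (hypothetical labels)."""
    return "search-client" if is_search_url(url) else "browser"


if __name__ == "__main__":
    # Example URLs are made up; the path layout is an assumption.
    for example in ("z3950s://example.org:210/books",
                    "http://www.usgs.gov/gils"):
        print(example, "->", route(example))
```

Any application that can hand a URL to such a dispatcher could offer
searching, which is the point Jim Perkins makes about embedding.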

Because the software uses the computer-to-computer Z39.50 protocol to
actually conduct the search, the client can be completely customized to the
specific needs of the searcher. This is especially attractive when you
consider the alternatives--either searching many servers with different HTML
forms, or using a Web crawler designed to be "one size fits all". 

The software produces a user interface by constructing HTML forms on the
local disk drive. Since this is done locally on the client, you could
customize the interface for novices or experts, and in whatever language you
want. Also, for cases where the human user isn't immediately present, the
software could gather information on its own and construct a new
database--perhaps to create a personalized newspaper. 
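
To make the idea of locally generated forms concrete, here is a small
Python sketch. It is not the alpha software's code: the function name, the
field names, and the "search://local/" action URL are all hypothetical. It
only shows how the same search fields can be labeled differently for
different users or languages.

```python
import html


def build_search_form(action_url, title, field_labels, submit_label):
    """Return an HTML search form as a string.

    field_labels maps a form field name to the label shown to the user,
    so the same fields can be presented in any language. All names here
    are illustrative assumptions.
    """
    rows = []
    for name, label in field_labels.items():
        rows.append(
            '<p><label>{} <input type="text" name="{}"></label></p>'.format(
                html.escape(label), html.escape(name)
            )
        )
    return (
        "<html><head><title>{title}</title></head><body>\n"
        '<form method="get" action="{action}">\n'
        "{rows}\n"
        '<p><input type="submit" value="{submit}"></p>\n'
        "</form>\n</body></html>\n"
    ).format(
        title=html.escape(title),
        action=html.escape(action_url),
        rows="\n".join(rows),
        submit=html.escape(submit_label),
    )


if __name__ == "__main__":
    # A one-field form for a French-speaking novice (hypothetical example).
    print(build_search_form("search://local/", "Recherche",
                            {"any": "Mots-clés"}, "Chercher"))
```

An expert form would simply pass more fields; writing the returned string
to a file on the local disk gives the browser something to display.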

Although simple text search and retrieval is available on all Z39.50
servers, servers differ quite a bit when you start to use more precise
searching techniques. For example, Z39.50 servers may have different
elements that are searchable and may support features like stemming of
search terms or phonetic spellings. This alpha software supports having
multiple configurations that specify these differences. An obvious
enhancement would be for the client software to obtain these configuration
specifics dynamically through the Z39.50 Explain facility on compliant servers.
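
One way to picture these per-server configurations is as a small record
per server that the client consults before sending a query. The Python
sketch below is an assumption about shape only: the class, field names, and
example element names are hypothetical and do not reflect the alpha
software's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerProfile:
    """What one server is configured to support (illustrative fields)."""
    name: str
    searchable_elements: frozenset
    stemming: bool = False
    phonetic: bool = False


def check_query(profile, element, use_stemming=False, use_phonetic=False):
    """Return the reasons a server cannot honor a query (empty list if OK)."""
    problems = []
    if element not in profile.searchable_elements:
        problems.append(f"element '{element}' is not searchable")
    if use_stemming and not profile.stemming:
        problems.append("stemming is not supported")
    if use_phonetic and not profile.phonetic:
        problems.append("phonetic matching is not supported")
    return problems


if __name__ == "__main__":
    # Two made-up servers with different capabilities.
    basic = ServerProfile("basic", frozenset({"any"}))
    rich = ServerProfile("rich", frozenset({"any", "title", "author"}),
                         stemming=True)
    print(check_query(basic, "title", use_stemming=True))
    print(check_query(rich, "title", use_stemming=True))
```

With Explain, records like these would be filled in from the server's own
answers instead of being maintained by hand.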

Another obvious enhancement would be to take advantage of Z39.50 to conduct
distributed search among multiple servers. This alpha software just searches
one database at a time. 
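
For anyone considering that enhancement, the fan-out itself is simple to
sketch. The Python below is only an outline under stated assumptions: the
search_one callable stands in for a real Z39.50 search (none is implemented
here), and the function name and result shape are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def distributed_search(servers, query, search_one):
    """Run search_one(server, query) against every server in parallel.

    Returns a dict mapping each server to its result, or to the exception
    it raised, so one failing server does not discard the other results.
    search_one is a caller-supplied stand-in for a real protocol search.
    """
    def safe(server):
        try:
            return server, search_one(server, query)
        except Exception as exc:  # keep going when one server fails
            return server, exc

    if not servers:
        return {}
    with ThreadPoolExecutor(max_workers=min(8, len(servers))) as pool:
        return dict(pool.map(safe, servers))


if __name__ == "__main__":
    def fake_search(server, query):
        # Pretend search used only for this demonstration.
        if server == "down.example.org":
            raise RuntimeError("unreachable")
        return [f"{server}: record matching '{query}'"]

    results = distributed_search(
        ["a.example.org", "down.example.org"], "water quality", fake_search)
    for server, outcome in results.items():
        print(server, "->", outcome)
```

Merging and ranking the results from different servers is the harder part
and is not addressed by this sketch.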

In anticipation of servers now becoming available for geospatial searching,
the alpha software helps the user to designate spatial coordinates (decimal
latitude/longitude values) as search criteria. A geospatial server conforming
to the GEO Profile of Z39.50 is under construction by the Clearinghouse for
Networked Information Discovery and Retrieval. 
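
As an illustration of decimal coordinates used as search criteria, here is
a short Python sketch of a validated bounding box. This is an assumption
about how a client might check user input; it does not show how the GEO
Profile encodes coordinates, and all names are hypothetical.

```python
from typing import NamedTuple


class BoundingBox(NamedTuple):
    """Search area in decimal degrees (west/east longitude, south/north latitude)."""
    west: float
    south: float
    east: float
    north: float


def make_bounding_box(west, south, east, north):
    """Validate decimal-degree limits and return a BoundingBox."""
    if not -90 <= south <= north <= 90:
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    for lon in (west, east):
        if not -180 <= lon <= 180:
            raise ValueError("longitudes must lie between -180 and 180")
    return BoundingBox(float(west), float(south), float(east), float(north))


def contains(box, lon, lat):
    """Return True when the point (lon, lat) falls inside the box."""
    in_lat = box.south <= lat <= box.north
    if box.west <= box.east:
        in_lon = box.west <= lon <= box.east
    else:
        # West greater than east means the box crosses the 180-degree meridian.
        in_lon = lon >= box.west or lon <= box.east
    return in_lat and in_lon


if __name__ == "__main__":
    # A one-degree box in northern Virginia (coordinates are approximate).
    box = make_bounding_box(-78.0, 38.0, -77.0, 39.0)
    print(contains(box, -77.36, 38.95))
```

Checking the values on the client before the search is sent keeps obvious
input errors away from the server.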

It is not hard to envision more general kinds of pattern-matching building
on this idea of a standardized client-server approach to information search,
e.g., matching gene sequences, fingerprints, faces, or images of the Earth.

Although this software is not yet a fully developed product, you are welcome
to check out what's been built so far--especially if you are interested in
helping to elaborate this approach. The primary developer is Jeff Gelbard,
subcontracted through the University of Massachusetts, Center for
Intelligent Information Retrieval (CIIR), which has a development contract
with the U.S. Defense Technical Information Center (DTIC). 

The Z39.50 protocol stack code is the property of Ameritech but is being
made available in this product for individual use at no charge. (If you wish
to license the Z39.50 protocol stack code for use in a commercial product,
please send me e-mail and I'll put you in touch with Ameritech.) Except for
the Z39.50 protocol software, all source code is being placed in the public
domain. 

You can fetch the executable software, README.TXT, and source files by
anonymous FTP to host www.usgs.gov, in the directory /gils/ciir/dtic_a02.

If there is a lot of interest, we will set up a mailing list for discussing
this and related client software. If you would be interested in joining such
a list, or have any other comments about the client software, please send
e-mail directly to me for now.




Eliot Christian, US Geological Survey, 802 National Center, Reston VA 22092
echristi@usgs.gov  Office 703-648-7245  FAX 703-648-7069  Home 703-476-6134