- From: Eliot Christian <echristi@usgs.gov>
- Date: Tue, 06 Feb 1996 16:11:19 -0500
- To: gils@cni.org
In the context of encouraging information providers to support Z39.50 in addition to HTTP, I am often asked to address the question: How does Z39.50 fit into and improve what's on the World Wide Web today? I am especially interested in making clear the advantages of Z39.50 support from the perspective of commercial information services.

I'd appreciate any thoughts you might have on the following rough draft. Also, please do feel free to pass this on to other people who you think may have some thoughts on this matter. Since this may have been forwarded to you, please send your response directly to me <echristi@usgs.gov>. Thanks!

-----------------------------------------------------------

What is the proposal for fitting Z39.50 into the Web?

Most activity in the World Wide Web today is centered on Web browsers gaining access to information resources on servers through the Hypertext Transfer Protocol (HTTP). Just as the same resource is often made available at the server through multiple protocols such as HTTP, gopher and FTP, this proposal is to make the resource searchable at the server end by adding support for the Z39.50 protocol. (More ubiquitous Z39.50 client software for agents and end users, as through Java or other mechanisms, is addressed separately.)

In essence, Z39.50 provides a common computer-to-computer search protocol between diverse information resources and diverse information access mechanisms. A range of software to implement Z39.50 in this way is available, from freeware to various commercial offerings worldwide.

Because Z39.50 does not dictate the way information is managed at the server end, providers can support various data and information management approaches yet make all the information commonly searchable. Because Z39.50 does not dictate how information is presented at the client end, intelligent software agents are enabled and user interfaces can be customized (in hardware, software, language, sophistication, graphical design, etc.)
for each particular market.

In developing a new collection of information for a particular market, a provider can search the contents of other resources via Z39.50 and create pointers to just those portions most relevant to their specific market. If the provider also adds Z39.50 support to the new collection, the resource gains exposure to seekers of information outside of the targeted audience.

How does Z39.50 improve the Web?

1. Different players have a common problem

Content Seekers sometimes want to include many disparate sources of information in their searching--not just Web pages, not just the resources of one provider, not just things in the English language, and not just snippets of ASCII text. Better search mechanisms are desperately needed due to the sheer size and diversity of information that people would like to take into account. The Internet has huge amounts of content itself and increasingly acts as a pointer mechanism to the vast information stores of off-line media. However, just like libraries centuries ago, the Internet has incredible diversity of content but lacks basic agreements on how to tag information objects so they can be found.

Content Owners want their products to be found by all potentially interested seekers. Today, the only recourse is to somehow acquire advertising space from all of the intermediaries (e.g., "I'll pay you to point to my page from your page").

Intermediaries must support non-exclusive distribution arrangements and are finding new roles as brokers connecting particular groups of seekers to the best sources for their needs.

Research and development efforts in advanced information discovery need a common protocol for interoperability to deploy next generation solutions.

2. The client-server model is crucial for progress

Server-based searching is inherently limited.
If searching is done at the server, the server designer must package the search for the particular target audience (e.g., what information is included, what language(s) does the user know, is the search simplistic or robust). Particular servers can only be comprehensive for their narrowly defined target audience, because they only provide a "packaged view" of the content. So, to reach seekers outside of the narrow-cast, the content must be exposed to unanticipated searching.

Intelligent software agents will become increasingly important, acting as gatherers of information tailored to very specific interests. Designers of software agents, such as Web crawlers, are frustrated by presentation protocols because the agent has no human driver to interpret the wide variations among packaged information. Consequently, Web crawlers can only deal with bits and pieces of Internet content that happen to be in text form. And Web crawlers cannot handle content behind interface programs (e.g., CGI scripts, Java applets, database access or search forms, etc.). Lacking distributed search mechanisms, the crawler is also constrained to find only those pages that happen to have an unbroken trail of links back to the starting points.

Support of a search protocol with client software allows for next generation software agents. These intelligent agents will characterize the content of information sources and perform distributed searches for those who need periodic updating of volatile information.

3. Z39.50 is the strategic choice for client-server search

Z39.50 is already adopted widely to provide access to important classes of information, including:

  existing bibliographic catalogs for libraries, museums, and archives worldwide;
  government information at the national level in several countries and increasingly at the state and other government levels;
  environmental information at all levels in the U.S. and internationally;
  all kinds of geo-referenced (map) data and information.
Hundreds of resources representing information valued in the tens of billions of dollars are already freely accessible through Z39.50--more is available on a fee basis. There are also hundreds of Z39.50 WAIS databases available, and thousands more WAIS databases are maintained behind HTTP servers. (Unfortunately, most Web browsers are not enabled to handle the WAIS Z39.50 protocol directly as search clients.)

Increasingly important for addressing global markets, Z39.50 incorporates the agreed international standards for multi-language support. Z39.50 can also be expected to provide a path toward the handling of information search at the semantic level, to finally fulfill the goal of finding data and information based on what its content actually means rather than just the text in which it is represented.

The Z39.50 protocol has also demonstrated extensibility to support search based on generalized pattern-matching techniques. These techniques will be increasingly important for finding abstract information such as chemical configurations, gene sequences, fingerprints, faces, video imagery, and numeric trend data.

The Z39.50 protocol is implemented on OSI networks as well as TCP/IP, and its implementation is defined through Abstract Syntax Notation One (ASN.1) to enhance interoperability. As a binary protocol exchanging data structures rather than merely passing commands, Z39.50 is relatively more secure than other Internet protocols.

In addition to free software for Z39.50 servers, there are freeware and commercial implementations of gateways to resources such as X.500 and SQL databases, as well as to HTTP.

The Z39.50 standard is extensive in specifying how optional features can be implemented, though it also allows for quite simplistic implementations. By requiring a subset of features in specific implementation contexts, the Z39.50 Profiles greatly improve interoperability and simplify server implementation.
Clients can be optimized for access to Z39.50 servers supporting a specific profile yet still enjoy basic search capability on all other Z39.50 servers.

Though already quite sophisticated, the base Z39.50 standard and focused profiles are evolving ever greater power through an effective international standards process with full involvement of dozens of major corporate implementors, tied to ISO and IETF, and connected with very active research at dozens of major universities and programs of national governments worldwide.

-----------------------------------------------------------
Eliot Christian, US Geological Survey, 802 National Center, Reston VA 22092
echristi@usgs.gov Office 703-648-7245 FAX 703-648-7069 Home 703-476-6134
Received on Tuesday, 6 February 1996 16:18:07 UTC