Message-Id: <9112051816.AA29899@pixel.convex.com>
To: wais-talk@think.com
Cc: tcl@allspice.berkeley.edu, www-talk@nxoc01.cern.ch
Subject: documents, files, types, and access methods
Date: Thu, 05 Dec 91 12:16:11 CST
From: connolly@pixel.convex.com

Someone mentioned that WAIS should obviate the need for FTP. I
disagree. I think that the WAIS protocol is good for finding
documents, but not necessarily for transferring or displaying them.

There are two scenarios that WAIS is good for:

A. The database is built for WAIS. For example, DowQuest. That
   database is stored so that it can be efficiently accessed and
   delivered through WAIS. In this case, it makes sense to transfer
   the contents of the documents through WAIS and to use the nifty
   chunking ideas.

B. The database is built for system X, and somebody sicced waisindex
   on it. This is currently by far the most common case. Look at all
   the USENET archives, biology databases, library catalogs, etc.
   that weren't designed for use with WAIS, but work pretty well
   with it. In this case, it makes more sense to me to transfer
   and/or present the documents using the clients that the database
   was designed for. The WAIS server should send enough information
   to retrieve and/or display the document using the other client.

Example: the archie database. As a user, I want to query the archie
database using WAIS's fulltext and relevance feedback queries, but I
want to retrieve the documents with FTP, and I may want to "present"
them with uncompress and tar, or lpr, or ghostscript, etc.

Example: USENET news. I want to query using WAIS, but read it with my
news reader.

Example: my mail box. Query with WAIS, display with Xmh, Elm, mh,
emacs, etc.

Retrieving the whole document with WAIS and saving it to a file is no
good in this day and age of client-server computing. The WAIS client
may be on a machine with no disk space to spare. And I may want to
use the file on a different host.
So we see that the WAIS client needs to hand off documents to other
clients. This raises the question: what information should the WAIS
search client pass to the retrieval/display slave clients, and how?

The CNI-ARCH folks are discussing a standard for document
identifiers. I think this is definitely one of the things that WAIS
should pass, but it's not the only thing. I'm beginning to look at
documents sort of like records in a relational database. The WAIS
client should negotiate with the slave client what fields they have
or are interested in. An obvious representation for these records is
the RFC-822 mail message format.

Example: the archie database. I use my xwais client to query
archie.src on "vgrind." My xwais client gets a list of docids from
the WAIS server. These docids contain at least the score and the
CNI-ARCH style docid, which in this case would be enough info to
construct a prospero file handle [I'm not sure there is such a thing
as a prospero file handle, but play along anyway...]. I play
gui-games with xwais until I get the list of documents that I like.
Then, using some mechanism like the X selection mechanism or
drag-and-drop (combined with SMTP, perhaps), I select a document and
give it to my xftp application. The xwais client and the xftp client
agreed earlier that they would send messages like:

	From: xwais@x.server.host
	To: xftp@x.server.host
	CNI-ARCH-ID: <12345@prospero:quiche.cs.mcgill.ca>
	SIZE-IN-BYTES: 120034
	FTP-HOST: export.lcs.mit.edu
	FTP-USER: anonymous
	FTP-CD: pub/util
	FTP-GET: vgrind.tar.Z

	blah blah blah about vgrind, perhaps explaining what query
	found this file, or perhaps some stuff from the README in
	vgrind.tar.Z.

I have already played gui-games with xftp to tell it where to put the
files it retrieves. When it gets this message, it does the HOST,
USER, CD, and GET commands, and presto! I've got my document.

I think if we had a suite of these gui tools talking SMTP to each
other, they could get a lot of work done.
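To make the xwais-to-xftp hand-off concrete, here is a minimal sketch
of what the receiving side might do with such a message. It assumes
Python and its standard RFC-822 parser; the function name ftp_plan
and the mapping of the FTP-* fields onto open/user/cd/get steps are
illustrative assumptions, not part of any agreed format.

```python
from email import message_from_string

# Sample hand-off message, using the field names from the example above.
HANDOFF = """\
From: xwais@x.server.host
To: xftp@x.server.host
CNI-ARCH-ID: <12345@prospero:quiche.cs.mcgill.ca>
SIZE-IN-BYTES: 120034
FTP-HOST: export.lcs.mit.edu
FTP-USER: anonymous
FTP-CD: pub/util
FTP-GET: vgrind.tar.Z

blah blah blah about vgrind
"""

# Hypothetical mapping from message fields to the steps xftp would run.
FIELD_TO_STEP = (
    ("FTP-HOST", "open"),
    ("FTP-USER", "user"),
    ("FTP-CD", "cd"),
    ("FTP-GET", "get"),
)


def ftp_plan(text):
    """Parse a hand-off message and return the ordered (step, argument) list."""
    msg = message_from_string(text)
    steps = []
    for field, step in FIELD_TO_STEP:
        value = msg[field]  # header lookup is case-insensitive
        if value is None:
            raise ValueError("hand-off message lacks " + field)
        steps.append((step, value.strip()))
    return steps


if __name__ == "__main__":
    for step, arg in ftp_plan(HANDOFF):
        print(step, arg)
```

The sketch only plans the transfer; a real xftp would carry out the
steps and could use the remaining fields (the docid, the size, the
free-text body) for display or sanity checks.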
More examples:

	To: xtar@x.server.host
	fopen: /home/connolly/vgrind.tar

or perhaps

	popen: zcat /home/connolly/vgrind.tar.Z

xtar has a gui for selecting a place to extract the archive.

	To: xlpr@x.server.host
	fopen: /home/connolly/vgrind-2.1/manual.ps

or

	popen: zcat /home/connolly/vgrind-2.1/manual.ps.Z |

xlpr selects destination printer, copies, etc.

Most tools fit in naturally. The $PAGER and $EDITOR, and perhaps
$SHELL tools could be MUCH more powerful if they could interoperate
this way. [Has anybody used mx and tx from John Ousterhout? Those and
the Tk toolkit allow X applications to send commands back and forth.]

For example, the World-Wide-Web browser would fit the role of $PAGER
in this environment. It would receive messages to display WWW nodes,
containing their HTTP address (or NNTP, FTP, etc.). It would then
display the node and allow the user to scroll around and choose
anchors etc. It could handle most anchors by itself, but it might
want to let the user select a region of text and send it to the WAIS
client.

I don't think there's an $EDITOR that fits very well, though emacs is
always a contender, and you have to have vi. [I think the mouse
support in emacs needs a LOT of work, but I probably haven't seen the
latest and greatest stuff.]

I'm not sure how $SHELL fits into all this but, for example, folks
send shell commands in mail messages to each other all the time. You
could just select the shell command in your mail $PAGER, and drag it
to your $SHELL x-client for invocation.

I hope I get time to try to implement a couple of these ideas. Then
we can all see whether they're worth pursuing.

Dan