Re: URC proposal for Davenport Group

Terry Allen (terry@ora.com)
Sun, 22 Jan 1995 17:09:16 PST


Message-Id: <199501230109.RAA07091@rock>
From: Terry Allen <terry@ora.com>
Date: Sun, 22 Jan 1995 17:09:16 PST
In-Reply-To: "Daniel W. Connolly" <connolly@hal.com>
To: "Daniel W. Connolly" <connolly@hal.com>, Terry Allen <terry@ora.com>
Subject: Re: URC proposal for Davenport Group
Cc: davenport@ora.com, uri@bunyip.com, hackers@ora.com

Thanks for a quick response, Dan.

| Before you get too far into the details of data formats,
| components, and system design, could you give a brief
| description of the features you need that are not
| available through currently deployed technology?

Hadn't thought I'd given any details except for the URC DTD.

A feature I need that is not presently available in applications
is the ability to resolve abstract names to addresses automatically
by means of very constrained queries against a bibliographic database
*and* initiate appropriate browser activity without human intervention.
Where is that deployed now?  I also want an open architecture, so
I have implementation choices for each piece, rather than having to
buy the whole thing as a package from a single vendor.
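
The resolution step described above can be sketched in a few lines of
Python. This is an illustration under stated assumptions, not a design:
the resolver functions, the URN strings, and the URLs are all hypothetical.
The point is only that each resolver is a replaceable piece and that no
step asks the user to choose.

```python
from typing import Callable, Dict, List, Optional

# A resolver takes an abstract name and returns candidate URLs (possibly none).
Resolver = Callable[[str], List[str]]


def table_resolver(table: Dict[str, List[str]]) -> Resolver:
    """Build a resolver backed by a lookup table (a stand-in for a real database)."""
    def resolve(name: str) -> List[str]:
        return list(table.get(name, []))
    return resolve


def resolve_name(name: str, resolvers: List[Resolver]) -> Optional[str]:
    """Ask each resolver in order; return the first URL offered, or None."""
    for resolver in resolvers:
        urls = resolver(name)
        if urls:
            return urls[0]
    return None


# Hypothetical data: a local resolution service and a publisher's service.
local = table_resolver({"urn:example:win31-ug": ["file:/docs/win31/ug.sgm"]})
publisher = table_resolver({
    "urn:example:win31-ug": ["http://publisher.example/win31/ug.sgm"],
    "urn:example:other-book": ["http://publisher.example/other.sgm"],
})

# The local service is consulted first, the publisher's only if need be.
chain = [local, publisher]
print(resolve_name("urn:example:win31-ug", chain))
print(resolve_name("urn:example:other-book", chain))
print(resolve_name("urn:example:unknown", chain))
```

Swapping in a different resolver (a different vendor's, say) means
replacing one entry in the list; nothing else changes.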

| For example:
| 
| In message <199501222041.MAA02978@rock>, Terry Allen writes:
| >
| >Then it has to be wrapped up and made to work so that I can write a link
| >in my Docbook document (Docbook has a Ulink element to hold
| >URLs) like this, for a URN:
| >   ... blarty foo <ulink url="the.urn.goes.here">Windows 3.1
| >	User's Guide</ulink> blarts more blarts
| >
| >or this, for a URC title query:
| >
| >   ... blarty foo <ulink url="the.urc.for.title.goes.here">Windows 3.1
| >	User's Guide</ulink> blarts more blarts
| >
| >and when the user clicks on the hot spot, the intended document
| >is fetched and displayed from the local installation or from the
| >Internet, assuming the user is connected to it.  It may be
| >desirable to extend <ulink> with attributes additional to 
| >the present URL attribute.
| >
| >The browser has to transmit the URN or URC to the local URC 
| >resolution service, then if need be to the publisher's URC
| >resolution service site, and upon receipt of the response,
| >to invoke the "some machinery" to pick a URL and fetch it.
| 
| I see no features in the above scenario that are not
| currently available. Just write:
| 
| 	<ulink uri="http://www.microsoft.com/windows3.1/userguide.html">
| 	Windows 3.1 User's Guide</ulink>
| and run a caching HTTP server. The client will consult the cache, and

Then caching has to be reinterpreted as "knowing about all the SGML
entities for which files are installed on the local system," not just 
"knowing about recently fetched and still valid entities."

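The difference between the two readings of "caching" can be shown with
a small Python sketch. The class name, the entity names, and the file
paths are assumptions made for illustration. Installed entities are
registered once and never expire; fetched entities are held only until
their expiry time.

```python
import time
from typing import Dict, Optional, Tuple


class EntityStore:
    """Answers "do we already hold this entity?" from two sources."""

    def __init__(self) -> None:
        # Installed entities: registered at install time, never expire.
        self.installed: Dict[str, str] = {}
        # Fetched entities: name -> (local path, expiry time in seconds).
        self.fetched: Dict[str, Tuple[str, float]] = {}

    def register_installed(self, name: str, path: str) -> None:
        self.installed[name] = path

    def remember_fetched(self, name: str, path: str, ttl: float,
                         now: Optional[float] = None) -> None:
        now = time.time() if now is None else now
        self.fetched[name] = (path, now + ttl)

    def lookup(self, name: str, now: Optional[float] = None) -> Optional[str]:
        """Prefer installed files; fall back to unexpired fetched copies."""
        now = time.time() if now is None else now
        if name in self.installed:
            return self.installed[name]
        entry = self.fetched.get(name)
        if entry is not None and entry[1] > now:
            return entry[0]
        return None


store = EntityStore()
store.register_installed("-//Example//DOCUMENT User Guide//EN",
                         "/usr/local/sgml/ug.sgm")
store.remember_fetched("urn:example:other-book", "/var/cache/other.sgm",
                       ttl=60.0, now=1000.0)
```

A conventional cache corresponds to the `fetched` table alone; the
reinterpretation adds the `installed` table and consults it first.
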
| if it's not available, the caching server will go to microsoft.com and
| get userguide.html, which is a "cover page" (URC, surrogate record,
| header, card catalog, abstract, call it what you like) encoded in HTML
| for universal access. The "cover page" has the Title, author, perhaps

I certainly don't want to encode URCs in HTML to begin with, and I don't
see the value of presenting the info in HTML except when human
disambiguation is necessary.  I'm trying to constrain this project
to situations that won't require human disambiguation, so as to throw
more light on the other issues, and because the domain of application
may allow it (as is not the case in general).

| an abstract, 

(just to keep saying it, abstracts and SOAPs don't belong in metadata.
Back to the discussion.)

| and links to the document in, say, postscript, PDF, and
| RTF format, along with ordering information for hardcopy (or a link to
| such ordering info). The user chooses which URL to follow manually --
| I haven't seen any successful techniques for automating this step.

Automation is exactly what I want to achieve.  I think the required
technique is to query an appropriately constructed bibliographic
database.  For the purpose at hand, I assume all the docs are in 
SGML, not some choice of formats.
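
A "very constrained" query of this kind might look like the following
Python sketch, which uses an in-memory SQLite table as a stand-in for the
bibliographic database. The table layout, field names, and records are
hypothetical and are not taken from the URC DTD. The query is exact-match
on every field, and anything other than exactly one hit is treated as a
failure rather than handed to a human.

```python
import sqlite3
from typing import Optional


def open_catalog() -> sqlite3.Connection:
    """Create a tiny in-memory catalog with made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE urc (title TEXT, version TEXT, publisher TEXT, url TEXT)"
    )
    conn.executemany(
        "INSERT INTO urc VALUES (?, ?, ?, ?)",
        [
            ("User's Guide", "3.1", "Example Press", "http://docs.example/ug-3.1.sgm"),
            ("User's Guide", "3.0", "Example Press", "http://docs.example/ug-3.0.sgm"),
            ("Reference Manual", "3.1", "Example Press", "http://docs.example/ref-3.1.sgm"),
        ],
    )
    return conn


def resolve_exact(conn: sqlite3.Connection, title: str, version: str,
                  publisher: str) -> Optional[str]:
    """Return the URL only if exactly one record matches all three fields."""
    rows = conn.execute(
        "SELECT url FROM urc WHERE title = ? AND version = ? AND publisher = ?",
        (title, version, publisher),
    ).fetchall()
    if len(rows) == 1:
        return rows[0][0]
    return None  # no match, or ambiguous: no guessing, no prompt


conn = open_catalog()
print(resolve_exact(conn, "User's Guide", "3.1", "Example Press"))
print(resolve_exact(conn, "User's Guide", "9.9", "Example Press"))
```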

| As an excellent example of this sort of thing in action, see:
| 
| 	http://www.research.digital.com/SRC/publications/src-rr.html
| 	(a bunch of citations leading to abstract pages which
| 	lead to postscript reprints)

which is a good use of one or several of the technologies required
here, but not at all what I'm aiming at on the application level.  

[ ... ]

| Some folks _are_ going beyond HTML, HTTP and forms:
| 
| 	http://rd.cs.colorado.edu/harvest/
| 	Harvest. They propose a new protocol for bulk distribution
| 	of "SOIF" records -- structured object information format;
| 	There's lots of neat technology here. Have a look if you
| 	want to do resource discovery on a large scale.

I like the Harvest work, but what I want to do here is link *any*
such setup to SGML docs so that computer documentation can make
cross references across docsets and even publisher libraries as
seamlessly as it can make a cross reference from one paragraph to
the immediately succeeding one.  

In sum, I don't know of any SGML application (seeing the process 
from the point of view of the Ulink) that can resolve bibliographic
references over the Internet in a nonproprietary way.  And I haven't 
yet heard of a browser that will make a choice among a list of URLs 
returned from a query.
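
One simple way a browser could make that choice is a fixed preference
order over URL schemes, as in the Python sketch below. The order shown
(local file first, then HTTP, then FTP) and the example URLs are
assumptions for illustration; a real policy might weigh other things.

```python
from typing import List, Optional
from urllib.parse import urlsplit

# Hypothetical policy: prefer a locally installed copy, then HTTP, then FTP.
SCHEME_PREFERENCE = ["file", "http", "ftp"]


def pick_url(urls: List[str]) -> Optional[str]:
    """Pick the URL whose scheme ranks best; ties keep the list's own order."""
    def rank(url: str) -> int:
        scheme = urlsplit(url).scheme
        if scheme in SCHEME_PREFERENCE:
            return SCHEME_PREFERENCE.index(scheme)
        return len(SCHEME_PREFERENCE)  # unknown schemes rank last

    return min(urls, key=rank) if urls else None


candidates = [
    "ftp://ftp.example/pub/ug.sgm",
    "http://www.example/docs/ug.sgm",
]
print(pick_url(candidates))
print(pick_url(["file:/docs/ug.sgm"] + candidates))
```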

I agree that the technology exists to achieve what I want; this 
project is meant to explore what technology would work well and
what behavior is desired in the system overall.  It may be that the
URC format is the only piece that needs a formal specification, and
that prose will do for the rest.

-- 
Terry Allen  (terry@ora.com)   O'Reilly & Associates, Inc.
Editor, Digital Media Group    101 Morris St.
			       Sebastopol, Calif., 95472
A Davenport Group sponsor.  For information on the Davenport 
  Group see ftp://ftp.ora.com/pub/davenport/README.html
	or  http://www.ora.com/davenport/README.html