Uniform Resource Agents -- a draft proposal from Leslie Daigle on 1995-03-21 (uri@w3.org from March 1995)

From: Leslie Daigle <leslie@beethoven.bunyip.com>
Date: Tue, 21 Mar 95 14:43:17 -0500
To: uri@bunyip.com
Message-Id: <9503211943.AA18545@beethoven.bunyip.com>
Hello,

Here's the document I promised last week.  I've submitted it as an
Internet Draft; don't know if that will actually be able to happen in
time for the April IETF.

Cheers!
Leslie.

=====================

Submitted as:

IETF URI Working Group                                     Leslie Daigle
Internet-Draft                                             Peter Deutsch
Proposed file: draft-ietf-uri-ura-00.txt                     Bill Heelan
Expires September 26, 1995                                 Chris Alpaugh
                                                         Mary Maclachlan
                                         Bunyip Information Systems, Inc
                                                           21 March 1995



			Uniform Resource Agents (URA's)


----------------------------------------
Abstract
----------------------------------------

This paper proposes  Uniform Resource Agents (URA's) as a means of 
specifying composite net-access tasks.  Tasks are described as "composite" 
if they require the construction and instantiation of one or more Uniform 
Resource Locators (URL's) or Uniform Resource Names (URN's),   and/or if 
they require transformation of information returned from instantiating 
URL's/URN's.

The paper presents the underlying concepts of URA's, proposes an 
architecture, and introduces a prototype application that has been built 
following the general principles of these URA's.


----------------------------------------
Status of this Memo
----------------------------------------

	This document is an Internet-Draft.  Internet-Drafts are working 
	documents of the Internet Engineering Task Force (IETF), its areas, 
	and its working groups.  Note that other groups may also distribute 
	working documents as Internet-Drafts.

	Internet-Drafts are draft documents valid for a maximum of six 
	months.  Internet-Drafts may be updated, replaced, or obsoleted by 
	other documents at any time.  It is not appropriate to use 
	Internet-Drafts as reference material or to cite them other than 
	as a "working draft" or "work in progress."

	To learn the current status of any Internet-Draft, please check the 
	1id-abstracts.txt listing contained in the Internet-Drafts Shadow 
	Directories on ds.internic.net, nic.nordu.net, venera.isi.edu, or 
	munnari.oz.au.



----------------------------------------
Acknowledgements
----------------------------------------

Several people have shared thoughts and viewpoints that have helped shape
the thinking behind this work over the past few years.  We'd like to thank,
in particular, Chris Weider, Patrik Faltstrom, Michael Mealling, Alan 
Emtage, and the participants in the IETF URI Working Group for  many
thought-provoking discussions.



----------------------------------------
Overview of This Document
----------------------------------------

The rest of this paper has been divided into the following sections:

        Introduction: An overview of the concepts leading up to the proposal 
                of URA's
        Design Considerations
        Scenarios for Suggested Usage: Concrete applications for URA's
        Overview of Perceived Functionality
        Overview of Proposed Architecture
        Transmission of URA's
        Above, Beyond, and Over the Top:  Future possibilities for URA's
        Appendix -- Using URA's Today:  A prototype application designed on
                URA principles
        Appendix -- A Sample Search URA


----------------------------------------
Introduction
----------------------------------------

Interaction with Internet resources can be categorized in four levels of
increasing complexity:

                . resource access
                . resource location
                . resource discovery
                . resource management

Resource access is supported by Internet standard protocols (e.g., http,
ftp, nntp, etc).  The resource location problem is addressed by work in
URL and URN technologies: URL's describe specific instances of a resource 
on the network while URN's identify an individual  resource without 
distinguishing between specific instances of it.    Resource discovery is 
largely supported by a collection of indexing services (e.g., Lycos, 
Archie(tm), Veronica, etc) and a large component of serendipity.  While 
these services may aid in the discovery of resources, they do little to 
support the user in the re-discovery of resources at a later date -- 
resource management.

From a user's perspective, Internet resource management includes the 
ability to specify and repeat an activity composed of one or more "simple" 
Internet actions.  At present, this requires users to know the protocol 
and particulars of instantiation for URL's.  However, typical resource 
needs span several different URL types -- how can a heterogeneous 
collection be brought together (i.e., documents/resources on a common 
thread of content, not identical protocols)?

Enter the URA -- Uniform Resource Agent.  It allows the encapsulation of
network resource particulars, so that users (and user programs) can
specify requests at a high level.  These are specified as objects, with
a script component,  to provide a powerful yet manipulable construct.

The proposed objects are called "agents" because they are intended to 
perform some action on behalf of an invoker.  Nonetheless, there is a
fixed underlying model to the types of activity that can be undertaken
by a URA.  Namely, they combine content-specific data from the user with a set
of known Internet resources to carry out a high level activity.  The URA's 
perform the invoker's task by communicating with net resources through 
known protocols and with appropriate verification and authentication.

The simplicity of this model belies the sophistication of agents that
can be built to conform to it.  URA's can contain elaborate scripts to
orchestrate the instantiation of the component URL's and then to filter
the URL results into the material to be returned by the URA.  URA's
can be used as building blocks to create larger information systems.  

It is worth noting that the model calls for URA execution to occur 
in the invoker's work space (i.e., workstation, or server, etc).  This
is a distinction from "intelligent agents", or "knowbots" that are
designed to roam the net and activate themselves in remote data spaces.
URA's are activated locally and communicate with other machines through
standard protocols.



----------------------------------------
Design Considerations
----------------------------------------

The above paragraphs describe the basic underlying model of the URA's
activity.  Design questions remain: how can we define URA objects in a 
general enough way that they can actually support the wide variety of 
possible activities (even when constrained by the use of URL/URN accesses)?
And, given that URA's are objects, _what_ causes them to become active?

The proposed answer to the former question is to have _typed_ URA's.  While
all URA objects should conform to the overall object architecture described
below, it seems necessary to allow the ability to refine the definition
of an object for specific tasks -- i.e., typed URA's.  This will be 
discussed in greater detail below.

The underlying purpose of the URA object is to capture the specification 
of an Internet activity.  As such, URA's have a formal structure.    
However, URA's are not free-running scripts, and thus need an environment 
in which to be activated.  The proposal is to define a URAgency to 
schedule and  control the execution of actions specified in the URA object.  
This URAgency would reside either on the invoker's machine, or remotely 
(with attendant security/authentication issues).  With the net-accesses 
centralized to this agency, some issues of security can be addressed.  


----------------------------------------
Scenarios for Suggested Usage
----------------------------------------

Searching:

Resource discovery covers the broad spectrum of tools available for 
carrying out searches on information stored across the Internet.  Most 
users think of their information needs in terms of the resource sought, 
not the services and protocols available  to access material on the net.  
Thus, a search URA can be constructed to embody the information necessary 
to access a set of Internet resource indexes.  For example, an FAQ search 
URA might be tailored to access FAQ indexes, web sites known to index 
FAQ's, and do an archie anonFTP search to find all files ending in .FAQ.  
The results returned from these searches can then be formatted/filtered 
for viewing by the user.


Publishing:

Assuming a net-wide publishing mechanism, a URA could be constructed
that encapsulates the necessary information to contact a publishing site 
and register a net document.  Thus, all the user would have to provide 
is the document and personal details necessary for identification -- the 
mechanics of the interaction can be handled through a URA.



----------------------------------------
Overview of Perceived Functionality
----------------------------------------

There are two main types of URA users that are referred to in this
document:

         1. CREATORS - These are people who build (create) URA's and
possibly distribute them across the Internet.  Creators are fairly
experienced Internet users who will have some knowledge of assorted
Internet protocols (ftp, Telnet, gopher, etc) and some understanding
of how these protocols are used (URL's).

         2. INVOKERS - These are people (or programs) that use existing 
URA's to perform actions across the Internet.  Invokers do not have to 
know anything about protocols of any kind and need little experience on 
the Internet.  

For invokers, the URA types provide an "API to the Internet".  Each URA
type specifies what type of information it requires of the invoker, and 
returns results in a predefined format.  The basic functionality an invoker 
can expect from a URA system is as follows:

	. to be able to pick a URA to be activated (i.e., identify an
	  activity to be carried out)
	. to specify whatever run-time information is needed by the URA
	. to cause that URA to be activated
	. to find out what results were obtained by activating that URA
	  (e.g., error reports, results in a predefined format, etc).

Thus, the invoker may require some or all of the following from the URA system:

	. list of URA types handled by a URAgency
	. list of available URA's (of a given type)
	. identifying (creator/author)  information about a URA object
	. list of content information required by the URA (e.g., what
	  to search for)
	. list of URL/URN's to be accessed by the URA (this tweaking ability
	  is necessary for some run-time control of the URA)
	. the URA object itself (to save it locally, etc)
	. the results of activating a URA (results are defined by URA type)
	. URA activation status/error reports
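
The invoker-facing operations listed above can be sketched as a small
interface.  This is an illustration only: the name URAgency comes from this
proposal, but every method name, every field, and the in-memory registry
below are assumptions made for the example.  The "action" callable merely
stands in for a URA's script.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class URA:
    """Hypothetical stand-in for a URA object (all fields are assumptions)."""
    name: str
    ura_type: str
    creator: str
    required_data: List[str]   # content information the invoker must supply
    urls: List[str]            # URL/URN's the URA will access
    action: Callable[[Dict[str, str]], List[str]]  # stands in for the script


@dataclass
class Activation:
    """Outcome of one activation: results plus a status/error report."""
    status: str
    results: List[str] = field(default_factory=list)
    error: str = ""


class URAgency:
    """Toy in-memory agency exposing the invoker operations listed above."""

    def __init__(self) -> None:
        self._uras: Dict[str, URA] = {}

    def register(self, ura: URA) -> None:
        self._uras[ura.name] = ura

    def list_types(self) -> List[str]:
        return sorted({u.ura_type for u in self._uras.values()})

    def list_uras(self, ura_type: str) -> List[str]:
        return sorted(n for n, u in self._uras.items()
                      if u.ura_type == ura_type)

    def get(self, name: str) -> URA:
        # Returns the object itself, e.g. so the invoker can save it locally.
        return self._uras[name]

    def activate(self, name: str, data: Dict[str, str]) -> Activation:
        ura = self._uras[name]
        missing = [k for k in ura.required_data if not data.get(k)]
        if missing:
            return Activation("error", error="missing: " + ", ".join(missing))
        try:
            return Activation("ok", results=ura.action(data))
        except Exception as exc:  # report failures instead of propagating
            return Activation("error", error=str(exc))


agency = URAgency()
agency.register(URA(
    name="PersonSearch", ura_type="search", creator="example creator",
    required_data=["PERS_NAME"], urls=["whois++://bunyip.com/$PERS_NAME"],
    action=lambda d: ["headline for " + d["PERS_NAME"]],
))
ok = agency.activate("PersonSearch", {"PERS_NAME": "Daigle"})
bad = agency.activate("PersonSearch", {})
print(agency.list_types(), ok.status, ok.results, bad.status, bad.error)
```

Returning a status object rather than raising keeps error reports and
results on the same path, which matches the last two items in the list.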






----------------------------------------
Overview of Proposed Architecture
----------------------------------------


The proposed object structure to support this functionality is based
on 5 parts:

ID:
	Identification of the URA object, including a URA name, type and 
	abstract, creator name, resources required by the URA, etc.

WHAT:
	Specification of the data elements required to carry out the URA
	activity.  For example, in the case of an Internet search for
	"people", this could include specification of fields for person 
	name, organization, e-mail address, etc.

WHERE:
	Specification of the URL/URN's to be accessed to carry out the
	activity.  Note that, until URN's are in common use, the ability
	to tweak URL's will be necessary.  A key issue for URA's is the
	ability to transport them and activate them far from the creator's
	originating site.  This may have implications in terms of 
	accessibility of resource sites.  For example, a software search 
	created in Canada will likely access a Canadian Archie server, and 
	North American ftp sites.  However,  an invoker in Australia should 
	not be obliged to edit the URA object in order to render it 
	relevant in Australia.  The creator, then, can use this section to 
	specify the expected type of service, with variables for the parts 
	that can be modified in context (e.g., the host name for an Archie 
	server, or a mirror ftp site).

	At the very least, an invoker may wish to evaluate what URL/URN's
	are to be instantiated as part of the decision process for launching
	a URA.

HOW:
	If URA's were strictly data objects, specifying required data and
	URL/URN's would suffice to capture the essence of the composite
	net interaction.  However,  the variability of Internet resource
	accesses and the scope of what URA's could accomplish in the net
	environment seem to suggest the need to give the creator some 
	means of organizing the instantiation of the component URL/URN's.
	Thus, the body of the URA should contain a scripting mechanism
	that minimally allows conditional instantiation of individual
	URL/URN's.  These conditions could be based on which (content)
	data elements the user provided, or accessibility of one URL/URN,
	etc.  It also provides a mechanism for suggesting scheduling of
	URL/URN instantiation.


WHAT-THEN:
	Once the URL/URN's have been instantiated, there will be a need for 
	some post-processing in order to provide URA results in the
	format required by the URA type.  For example, a URA built to
	deal with list-server mailing list subscriptions might belong to
	a class of net-services URA's that return simple success/failure
	messages.

	The needs of post-processing are likely to be many and varied. The
	current proposal is to let creators specify a full-fledged scripting
	language (e.g., Perl) to give them the maximum flexibility in 
	creating tools to extract necessary information from the results
	of URL instantiations.


A sample URA description based on this architecture is presented in the
second appendix of this document.
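
As a rough illustration of the five-part structure, the parts might be held
in a single record.  The part names (ID, WHAT, WHERE, HOW, WHAT-THEN) are
from this proposal; the Python types, the field names, and the choice to
carry HOW and WHAT-THEN as uninterpreted script text are assumptions made
for this sketch only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Component:
    """One data element or URL variable (this layout is an assumption)."""
    name: str
    prompt: str
    required: bool
    default: str = ""
    value: str = ""


@dataclass
class UrlSpec:
    """One WHERE entry: a URL constructor plus its tweakable variables."""
    url_id: str
    constructor: str
    components: List[Component] = field(default_factory=list)


@dataclass
class URAObject:
    ident: Dict[str, str]    # ID: name, type, abstract, creator, ...
    what: List[Component]    # WHAT: data required from the invoker
    where: List[UrlSpec]     # WHERE: URL/URN's to be accessed
    how: str                 # HOW: script ordering the instantiations
    what_then: str           # WHAT-THEN: post-processing script

    def missing_required(self) -> List[str]:
        """Names of required WHAT components with no value and no default."""
        return [c.name for c in self.what
                if c.required and not (c.value or c.default)]


ura = URAObject(
    ident={"URAname": "PersonSearch", "ScriptLang": "tcl"},
    what=[Component("PERS_NAME", "Last name of person", True),
          Component("PERS_FIRST", "First name of person", False)],
    where=[UrlSpec("a-whois", "$WHO-PROTOCOL://$WHO-HOST/$PERS_NAME")],
    how="pers_res = run(a-whois);",
    what_then="pers_res = last(pers_res, 10);",
)
before = ura.missing_required()
ura.what[0].value = "Daigle"   # the invoker fills in the last name
after = ura.missing_required()
print(before, after)
```

A check such as missing_required() is one way an agency could refuse to
activate a URA until the invoker has supplied the required data.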

All URA types will conform to the same  basic object structure.  The
primary difference will be in result types.  This is nonetheless an 
important distinction from the standpoint of allowing URA-invoking programs 
to prepare for expected return values (e.g., HTML, or ASCII streams, etc).
It does not fall within the boundaries of the responsibility of the URAgency
to normalize these results -- it is the creator of the individual URA
that is expected to know what format of information individual URL/URN's  
will return, and thus create the mechanisms for transforming those results 
into something appropriate for the URA type.

A URAgency may nonetheless perform some post-processing on the data returned 
by the URA.  Once the results have been filtered and prepared in a URA type
format, the URAgency may fold them into a format required by whatever 
software it is geared towards -- HTML page, MIME document, ascii, or a format 
selected by the user.
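
This final folding step might look like the sketch below.  It assumes, for
illustration only, that the URA type delivers its results as (description,
URL) pairs; the function name and the two supported format labels are
likewise invented for the example.

```python
from html import escape
from typing import List, Tuple

Result = Tuple[str, str]   # (description, URL); this layout is an assumption


def fold(results: List[Result], fmt: str = "ascii") -> str:
    """Fold typed URA results into the format the calling software wants."""
    if fmt == "html":
        items = "".join(
            '<li><a href="%s">%s</a></li>' % (escape(url, quote=True),
                                              escape(text))
            for text, url in results)
        return "<ul>" + items + "</ul>"
    if fmt == "ascii":
        return "\n".join("%s <%s>" % (text, url) for text, url in results)
    raise ValueError("unsupported format: " + fmt)


sample = [("A & B", "http://example.org/x")]
as_html = fold(sample, "html")
as_text = fold(sample, "ascii")
print(as_html)
print(as_text)
```

The URA itself is untouched by this step; only the presentation of
already-typed results changes.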

Also, the URAgency may provide a means for users/controlling programs to
enquire after the status of the URA instantiation -- curtailing the search,
requesting interim results, etc.


----------------------------------------
Transmission of URA's
----------------------------------------


In all the envisioned usages of URA's, the ability to share URA objects
(through publishing, e-mail, or other transmission mechanisms) figures
prominently.  This brings up issues about _what_ should be transmittable.
Is the user limited to passing the URA itself? Or can they send an instance
(a URA with some data filled in already)?  How will the recipient distinguish
between the URA and an instance (if an instance is just another URA)?

This suggests the notion of _encapsulated_ URA's.  The object that gets
transmitted is something that contains a URA object specification as
well as instance information (optional) and results from activating the
URA (optional).  A probable implementation mechanism for this would be a
multipart MIME document.


        Encapsulated URA:

        +------------------------+
        | URA Spec: Includes     |
        |     the 5 std parts:   |
        |     URA ID             |
        |     WHAT               |
        |     WHERE              |
        |     HOW                |
        |     WHAT-THEN          |
        +------------------------+
        +------------------------+
        | URA Instance info      |
        | (Values the user has   |
        | already specified)     |
        |                        |
        +------------------------+
        +------------------------+
        | Results                |
        |                        |
        |                        |
        |                        |
        +------------------------+


Users might not want to always send all of this -- where "[]" means optional,
valid sub-parts to send would be as follows:

        [URA Spec [instance info]] [results]

Thus, one could send just results, or just URA Spec, but not instance 
information without the URA.
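
A minimal sketch of such an encapsulation, using the multipart MIME classes
in Python's standard email package, is given below.  The X-URA-Part header
is an invented label (not a registered header) so that a recipient can tell
the parts apart, and the plain-text bodies are placeholders for whatever
serialization a real URA would use.  Treating a completely empty capsule as
an error is a choice made for the example.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import List, Optional


def encapsulate(spec: Optional[str] = None,
                instance: Optional[str] = None,
                results: Optional[str] = None) -> MIMEMultipart:
    """Build a capsule following [URA Spec [instance info]] [results]."""
    if instance is not None and spec is None:
        raise ValueError("instance information requires the URA spec")
    if spec is None and results is None:
        raise ValueError("nothing to send")
    msg = MIMEMultipart("mixed")
    parts = (("spec", spec), ("instance", instance), ("results", results))
    for label, body in parts:
        if body is not None:
            part = MIMEText(body, "plain")
            part["X-URA-Part"] = label   # hypothetical part label
            msg.attach(part)
    return msg


def part_labels(raw: str) -> List[str]:
    """Parse a transmitted capsule and list the labelled parts it carries."""
    parsed = message_from_string(raw)
    return [p["X-URA-Part"] for p in parsed.get_payload()]


wire = encapsulate(spec="URAname: PersonSearch",
                   instance="PERS_NAME=Daigle").as_string()
labels = part_labels(wire)
print(labels)
```

Because each part is labelled, a recipient can distinguish a bare URA from
an instance by checking whether an "instance" part is present.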




----------------------------------------
Above, Beyond, and Over the Top
----------------------------------------

Most of the discussion above implicitly assumes that a human user will
provide URA activation data ("what", and optionally "where").  However,
there is nothing in the model that precludes the writing of programs that
will interact with a URAgency to select URA's and provide the necessary 
data for activation.  This widens the scope of possible applications of URA's
considerably (and even suggests the possibility of recursive activation of
URA's).    As more applications necessarily assume access to the Internet
for functioning, URA activation may happen from functions buried deep within
the program code -- an application might make use of a URA to get periodic
updates of documentation, etc.

In an earlier section, it was pointed out that the URA architecture was
designed to address a particular net-activity-specification model, and not
to encompass all conceptions of net-based agents.  Nonetheless, the 
underlying goal in designing the proposed architecture was to leave enough 
flexibility that _any_ task conforming to the "what-where-how" model could 
be supported by a URA type.  Thus, user models, information filtering 
preferences, etc, could be incorporated into URA usage.    For example, 
context-sensitive or natural-language-based search front-ends could convert 
user requests into formats accepted by a set of base URA's, etc.

Thus, the basic premise of the proposed URA scheme is that capturing the 
basics of composite net-activities leaves open the possibility of using 
URA's as the building blocks of more sophisticated systems.



----------------------------------------
Appendix -- Using URA's Today
----------------------------------------

In a project supported by the Canadian Government's CANARIE TD2 program,
Bunyip Information Systems has developed a prototype application based on
these URA principles (a beta version is due to become available in April, 
1995).

This particular application is a desktop client to give Internet users the
ability to select searches in terms of what they are looking for, not what
protocols they need to use.  By putting searches on the desktops of people
that have a vested interest in the information returned, new possibilities
for search customization and creation are opened up.  These searches still make
use of existing Internet resources (indexes, etc), but management of the
search occurs local to the user's work environment (as opposed to, say,
on a particular WWW search server).  This also means that the behaviour of
the search does not change unless the user has re-configured the search.

The basic architecture of the system, from the bottom up, is as follows:

	 Base:  storage and management of search URA's

		This includes maintaining a list of URA's accessible to 
		the user, retrieving all or part of a URA description, 
		accepting activation data and activating individual URA's.  
		The base component is the only part that needs direct 
		access to the Internet.  The results of the activation of 
		a URA are returned in the form of _headlines_:  text 
		description of the resource and the URL to instantiate the 
		resource.

	Desktop Tool:  interface to a library of search URA's

		This provides the user with an interface to the selection,
		instantiation, and activation of search URA's, as well as
		management of results returned by the URA.

	Browsing:  instantiating individual URL's

		The application works with existing popular WWW browsers to
		render the results of instantiating individual URL's returned
		by a search URA.


Key features of the application are that it:

	. focuses on searches at the level of information sought, not
	  invoked protocols
	. allows the user to maintain a collection of URA's (individual 
	  URA's can be swapped between users and stored on a local file 
	  system)
	. allows users to invoke particular URA's and fill in information
	. allows the user to save and reuse partially-instantiated URA's


The main design challenge, now, is to determine what constitutes a _good_
search URA...



----------------------------------------
Appendix -- A Sample Search URA
----------------------------------------


This appendix contains a text-description of the contents of a sample search
URA.  This is an architecture-level description, for discussion purposes --
this is not meant as an implementation proposal.

This example shows a person-search type URA containing two searching
URL's; a "whois" and a "netfind" search.  Thus this URA will attempt
two different kinds of searches for the invoker.  In this case, the
URA supplies the same person's name to both URL's.  

Although this example is not particularly exotic, it does demonstrate the
URA's ability to specify a net-based activity: finding information about 
people.  The user of this URA need not be troubled by specifics of URL 
construction (defaults are provided for all URL constructors), and can 
simply provide details about the information need (person name, etc) -- the 
URA will take care of the protocol-specific details.

ID:

    URAname:	PersonSearch
    Abstract:	This URA  uses whois and netfind services to find
		   information about people with names matching your input.
    ScriptLang:	tcl


WHAT:

   Component	Value	Default	Prompt			Required?
   ---------	-----	-------	------			---------
   PERS_NAME	<b>	<b>	Last name of person	Y
   PERS_FIRST	<b>	<b>	First name of person	N
   ORG		<b>	<b>	Organization person 	Y
				  works for


WHERE:

   URL-ID:  a-whois
   Abstract: Generic whois search
   Constructor:  $WHO-PROTOCOL://$WHO-HOST/$PERS_NAME

   Component	Value	Default		Prompt			Required?
   ---------	-----	-------		------			---------
   WHO-PROTOCOL		whois++		Protocol for WHOIS++	Y
   WHO-HOST		bunyip.com	Host running WHOIS++	Y
   
   
   URL-ID: a-netfind
   Abstract:  Generic NETFIND search
   Constructor:  $NET-PROTOCOL $NET-HOST $NET-SELECTOR

   Component	Value	Default			Prompt			Req'ed?
   ---------	-----	-------			------			-------
   NET-PROTOCOL 	telnet			NETFIND protocol	Y
   NET-HOST 		netfind.ee.mcgill.ca	NETFIND server host	Y
   NET-SELECTOR		netfind;;2/Srch; 	NETFIND selector string Y
			  $PERS_NAME $PERS_FIRST
			  $ORG


HOW:
		pers_res = run(a-whois);
		if empty(pers_res) then
		   pers_res = run(a-netfind);

WHAT-THEN:
		pers_res = last(pers_res, 10);			  
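
To make the flow of this sample concrete, the sketch below expands the two
URL constructors and follows the HOW and WHAT-THEN steps.  It is not the
prototype's code.  The defaults are copied from the tables above; the
invoker values ("Daigle", "Bunyip"), the expand() helper, and the fake_run()
stand-in for real network access are all assumptions.  It also assumes that
last(pers_res, 10) means "keep the final ten results".

```python
import re
from typing import Callable, Dict, List

# Variable names as used in the sample: letters, digits, "_" and "-".
VAR = re.compile(r"\$([A-Za-z][A-Za-z0-9_-]*)")


def expand(text: str, values: Dict[str, str]) -> str:
    """Replace each $NAME with its value; values may themselves hold $NAMEs.

    Unset optional components expand to the empty string.
    """
    return VAR.sub(lambda m: expand(values.get(m.group(1), ""), values), text)


# Defaults from the WHERE tables; PERS_NAME and ORG come from the invoker.
values = {
    "WHO-PROTOCOL": "whois++",
    "WHO-HOST": "bunyip.com",
    "NET-PROTOCOL": "telnet",
    "NET-HOST": "netfind.ee.mcgill.ca",
    "NET-SELECTOR": "netfind;;2/Srch; $PERS_NAME $PERS_FIRST $ORG",
    "PERS_NAME": "Daigle",
    "ORG": "Bunyip",
}
constructors = {
    "a-whois": "$WHO-PROTOCOL://$WHO-HOST/$PERS_NAME",
    "a-netfind": "$NET-PROTOCOL $NET-HOST $NET-SELECTOR",
}


def activate(run: Callable[[str], List[str]]) -> List[str]:
    """HOW: try whois, fall back to netfind.  WHAT-THEN: keep the last 10."""
    pers_res = run(expand(constructors["a-whois"], values))
    if not pers_res:
        pers_res = run(expand(constructors["a-netfind"], values))
    return pers_res[-10:]


def fake_run(url: str) -> List[str]:
    """Stand-in for network access: whois finds nothing, netfind finds 12."""
    if url.startswith("whois++"):
        return []
    return ["record %d via %s" % (i, url.split()[0]) for i in range(12)]


whois_url = expand(constructors["a-whois"], values)
results = activate(fake_run)
print(whois_url)
print(len(results), results[0])
```

Passing the fetcher in as a parameter keeps the script logic separate from
protocol handling, in the spirit of leaving net access to the URAgency.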







----------------------------------------
Authors' Address
----------------------------------------

Leslie Daigle
Peter Deutsch
Bill Heelan
Chris Alpaugh
Mary Maclachlan
Bunyip Information Systems, Inc.
310 St. Catherine St. W.
Suite 202
Montreal, Quebec, Canada
H2X 2A1

Phone: (514) 875-8611

EMail: leslie@bunyip.com




------------------------------------------------------------------------------

"Freedom without responsibility                        Leslie Daigle
           is anarchy"                                 leslie@bunyip.com
                  -- ThinkingCat                       Montreal, Canada
 
------------------------------------------------------------------------------
Received on Tuesday, 21 March 1995 14:43:29 UTC