URC spec 2/6

Ronald E. Daniel (rdaniel@acl.lanl.gov)
Fri, 9 Jun 1995 06:58:16 -0600


From: "Ronald E. Daniel" <rdaniel@acl.lanl.gov>
Date: Fri, 9 Jun 1995 06:58:16 -0600
Message-Id: <199506091258.GAA20121@idaknow.acl.lanl.gov>
To: uri@bunyip.com
Subject: URC spec 2/6


INTERNET-DRAFT          An SGML-based URC Service         June 7, 1995

1 Introduction


Experience with the WWW  has exposed the  problems inherent in  basing
the  system  on  resource  locations  instead  of  resource  identity.
Uniform Resource Locators (URLs) typically identify a  particular path
on a particular host.  This  leads to a bevy of problems  with network
hotspots, fault-tolerance,  and  resource  management.    To  overcome
those problems, the Uniform Resource Identifiers Working Group  of the
Internet Engineering Task  Force has  been developing an  architecture
that uses Uniform Resource  Names (URNs) for resource  identification.
A name resolution service would handle the problem of mapping names
to locations for the purpose of  retrieval.  The data  structures that
contain the information  necessary for  this resolution  are known  as
Uniform Resource Characteristics (URCs), and the resolution service is
known as the URC service.

Several scenarios  of  how  this  service  would  be  used,   and  the
requirements they  place  on  the  service,  were set  forth  in  [1].
The primary  purpose  of  the  URC  service  is  to  resolve  URNs  to
URLs.  However,  the URC makes  too good a  place to store  additional
information about the  resource to  pass up  the opportunity.   It  is
easy to  imagine  storing  basic bibliographic  information,  such  as
author, title, and subject, in  order to provide the foundation  for a
"card catalog" service for Internet-accessible documents.   Of course,
there is  no reason  to stop  with documents.    Scientific  datasets,
product databases, computer-generated music, etc.  are  all reasonable
candidates for publication over the WWW. The more one looks at the URC
service, the more one realizes  just how great a range  of information
it could reasonably provide.  This leads us to look at the URC
service as a general service  for presenting metadata - or  data about
data.  Because of the wide variety of data that can  be made available
over the Internet,  and because of  the diversity  of the metadata  we
might want to use to describe it, no single set of attributes (such as
author, title, subject) is universally applicable.  This argues for a
very general means of  specifying attribute sets.   At the same  time,
recall that the primary purpose of the URC service is URN to URL
resolution.  This argues for a single, easily parsed attribute set.
Other apparently conflicting requirements were set forth in [1].

This proposed specification  attempts to  reconcile these  conflicting
demands.  The need for a  formal method is met by using  SGML Document
Type Definitions  (DTDs) to  specify the  structure  of new  attribute
sets.  This is  described in section 3.   Simple changes to  attribute
sets can be accommodated through a single-inheritance mechanism that
is also  described  in  section 3.     The need  for  fast,  heuristic
parsing is met by  providing a particular DTD  that is believed to  be
widely, though not universally, applicable.  The resolution process is
reviewed in section 2, while the default attribute set is described in
section 4.   The specification allows for  user agents to request  URC
information in different  transfer syntaxes in  order to ease  parsing
or provide particular capabilities, such  as digital signatures.   The
ability to use multiple syntaxes is described in section 5, which
also describes particular transfer syntaxes that are to be regarded as
``well known'' and  must be  supported by  all URC servers.    Another
important part  of the  service  is the  means  that it  provides  for
queries.  The specification allows for multiple query languages.  This
part of  the spec  is described  in section  6,  which also  describes
the trivial query  language that all  URC servers must  support.   How
the specification meets  the requirements  established in  [1] is  the
subject of section 7, while section 8 discusses issues that are still
unresolved at this time.

This is the first draft  of the specification,  and it is known to  be
incomplete.  It makes no  attempt to discuss how URC  information will
be stored at  a server,  and does  not address  issues of  maintaining
URCs, distributing the database for fault-tolerance, etc.



2 URN Resolution Overview


A variety of URN syntaxes and resolution procedures are  being studied
by the URI-WG.  This spec  assumes a syntax  and resolution  procedure
roughly like  that in  [2].   Briefly,  such  a URN  contains a  Fully
Qualified Domain Name (FQDN), which  identifies a set of servers  that
are authorized  by  the publisher  to  resolve the  publisher's  URNs.
(These are known as default URNs.)  The client sends an HTTP GET
request for the complete URN to one of those servers.  The request
may use HTTP's Accept: header to indicate preferences for the results to
be returned in  particular syntaxes.   The result  is returned to  the
browser.  Depending on  the transfer syntax and browser  capabilities,
the browser may choose one of several URLs itself, it may hand the URC
off to an  external application that  can make  the selection, or  the
browser may display  the URC  to the  user so  the user  can make  the
selection.
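
As an illustration only, the client side of the exchange described
above might be sketched as follows.  The URN shown, the host name, the
media types, and the choice to send the percent-encoded URN as the
request path are all assumptions made for this example; they are not
taken from [2] or from this specification.  The caller is assumed to
have already extracted the FQDN from the URN.

```python
from urllib.parse import quote


def build_resolution_request(fqdn, urn, accepted_syntaxes):
    """Return the text of an HTTP/1.0 GET request asking a resolver for a URN.

    fqdn              -- resolver host name taken from the URN by the caller
    urn               -- the complete URN, sent as the request path (assumed)
    accepted_syntaxes -- list of (media_type, q) pairs for the Accept: header
    """
    # Percent-encode the URN so it is safe to place in the request line.
    path = "/" + quote(urn, safe="")
    # A q value of 1.0 is the default, so it is left off the header.
    accept = ", ".join(
        media_type if q == 1.0 else "%s; q=%.1f" % (media_type, q)
        for media_type, q in accepted_syntaxes
    )
    lines = [
        "GET %s HTTP/1.0" % path,
        "Host: %s" % fqdn,
        "Accept: %s" % accept,
        "",  # blank line terminates the header block
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    request = build_resolution_request(
        "resolver.example.com",            # hypothetical resolver host
        "urn:example:report-42",           # hypothetical URN, not the syntax of [2]
        [("text/sgml", 1.0), ("text/html", 0.5)],
    )
    print(request)
```

The q values let the client state that it prefers one transfer syntax
but will accept another, which is the preference mechanism the
Accept: header provides.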

This specification uses HTTP as the resolution protocol and makes use
of HTTP's format negotiation capabilities.  Because of S-HTTP, SSL, and
similar security efforts, using HTTP should also ease the transition to
more secure resolvers, which is a requirement.  Furthermore, a wide
variety of browsers, servers,  tools, and expertise already exist  for
HTTP and can quickly be brought to bear on the URC service.
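
To make the format negotiation step concrete, the following is a
minimal sketch of how a resolver might pick a transfer syntax from the
client's Accept: header.  It is an assumption-laden simplification,
not part of this specification: the function names are hypothetical,
wildcards such as */* are not handled, and the media types in the
usage example are placeholders.

```python
def parse_accept(header_value):
    """Parse an Accept: header value into (media_type, q) pairs, best first."""
    prefs = []
    for item in header_value.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_syntax(header_value, available):
    """Return the most preferred syntax the server can produce, or None."""
    for media_type, q in parse_accept(header_value):
        if q > 0 and media_type in available:
            return media_type
    return None


if __name__ == "__main__":
    # Hypothetical server that can only emit text/html.
    print(choose_syntax("text/html; q=0.5, text/sgml", {"text/html"}))
```

A server that finds no acceptable syntax (a None result here) would
still need some fallback, for example one of the well-known transfer
syntaxes that all URC servers must support.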