URI-protocol mapping (was Re: How to add new "protocols" ?)

Daniel LaLiberte (liberte@ncsa.uiuc.edu)
Thu, 20 Feb 1997 10:31:33 -0600 (CST)


From: Daniel LaLiberte <liberte@ncsa.uiuc.edu>
Date: Thu, 20 Feb 1997 10:31:33 -0600 (CST)
Message-Id: <199702201631.KAA22433@void.ncsa.uiuc.edu>
To: touch@isi.edu
Cc: luigi@labinfo.iet.unipi.it, ses@tipper.oit.unc.edu,
Subject: URI-protocol mapping (was Re: How to add new "protocols" ?)
In-Reply-To: <199702191854.AA00320@ash-s.isi.edu>

Simon (and others) made a point that was almost lost in the divergence
into efficiency considerations.  The point is that there is a choice
that *can* be made in how to resolve any URI (not just http URLs).
Each client can make this choice and local or remote servers can help
it make that choice.  The choice might be made on the basis of
efficiency, security, reliability, or any number of other factors.

To be blunt, the strong binding of "http" URLs to the HTTP protocol
is an illusion.  It is merely an association.  It can only be that.
What makes it *appear* to be a strong binding is that for the most
part HTTP over TCP is, in fact, used. 

touch@isi.edu writes:
 > > From http-wg-request@cuckoo.hpl.hp.com Wed Feb 19 10:26:06 1997
 > > Resent-Date: Wed, 19 Feb 1997 13:22:24 -0500 (EST)
 > > From: Luigi Rizzo <luigi@labinfo.iet.unipi.it>
 > > > 
 > > > What is the advantage to a transparent selection of transport
 > > > protocol, given these constraints??
 > > 
 > > Flexibility in switching transport protocols without having to rewrite
 > > the Web material.

 > > 	Luigi
 > 
 > This is a great argument for URNs.
 > 
 > Not necessarily for overloading the semantics of URLs.

Well, the question is, how troubling is the overloading?  Is there
a known process for discovering what semantics is intended, or is
everyone left to guess on their own?

 > The problem is that you have others referencing you, but you
 > don't know what protocol to use.
 > 
 > That requires a protocol in itself, to discover the transport
 > mechanism.

You are quite right.  This point is often missed.

The protocol to be used for any URI scheme (URL or URN) can
be hard-wired into the client, or it can be discovered via some
other protocol.

For example, there have been a few suggestions in the past to extend the
normal resolution of http URLs to ask, via DNS, for the current
location of a server.
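
To make the "hard-wired or discovered" distinction concrete, here is
a minimal sketch.  The names HARD_WIRED, choose_protocol and the
discover callback are hypothetical, not part of any existing client;
the callback stands in for whatever discovery protocol (DNS or
otherwise) a client might consult before falling back to its built-in
association.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit

# Hard-wired association between URI scheme and protocol.
# The entries are illustrative assumptions, not a standard table.
HARD_WIRED = {"http": "HTTP over TCP", "ftp": "FTP over TCP"}


def choose_protocol(
    uri: str,
    discover: Optional[Callable[[str], Optional[str]]] = None,
) -> str:
    """Return the protocol to use for uri.

    If a discovery hook is supplied and it has an answer, that answer
    wins; otherwise fall back to the hard-wired scheme association.
    """
    if discover is not None:
        found = discover(uri)
        if found is not None:
            return found
    scheme = urlsplit(uri).scheme
    if scheme not in HARD_WIRED:
        raise LookupError("no protocol known for scheme: " + scheme)
    return HARD_WIRED[scheme]
```

With no hook, choose_protocol("http://www.example/") falls back to the
built-in "HTTP over TCP" entry, which is exactly the case that looks
like a strong binding but is only a default.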

Here is a more generalized list of possible ways to discover the
semantics of a URI, starting local to the client and moving out
to remote services.

1. Look up the URI in an in-memory cache.  
   This appears to avoid any transport protocol, but consider that
   the client might be a distributed process itself, and its in-memory
   cache might live in another process with no shared memory.

2. Look up the URI in local or remote cache services.  These are
   mostly HTTP services so far, but note that the protocol for talking
   to an HTTP cache service is slightly different from how you talk
   to other HTTP servers.  Also, the cache service itself might have
   other protocols that it uses to talk with other cache services.

   Note also that the URI is not necessarily treated opaquely by cache
   services.  The known structure of the URIs may be used to structure
   the lookup process.

3. Look up the URI in a local table that maps a prefix of the URI to
   some protocol or process.  The table might say "use the file
   system" for local URIs, or "use afs" for a known ftp branch, or
   "use the NAPTR mechanism" for those old http://purl URLs.  A
   trivial case is "use HTTP [perhaps via a proxy] to the named
   server:port" for other http URLs.  This is what is typically done,
   and it is what only appears to be a strong binding.

4. If the URI has a DNS component, look up the domain name (in local
   name servers first) for service mappings.  This is similar to the
   local table, but may be distributed via DNS - all of DNS becomes
   a service lookup table.
   
5. Ask the named service (discovered by any of the above) to resolve a
   URI and learn that some other service or protocol should be used.
   This is not the same as getting a redirection to another URI,
   described next.

6. Get a redirection to another URI or a collection of alternative URIs.
   Make a choice or follow the redirection, thus repeating the above
   process.  This redirection/indirection is just another way to map
   one URI to another, one protocol to another.
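
A few of the steps above can be sketched together: the in-memory
cache (1), the local prefix table (3), and redirection (6).  This is
only an illustration under stated assumptions.  The handler
functions, the PREFIX_TABLE contents and the example host names are
all made up, and the handlers return strings rather than doing any
real I/O.

```python
from typing import Callable, Dict, Optional, Tuple

# A handler returns ("data", body) or ("redirect", new_uri).
Result = Tuple[str, str]
Handler = Callable[[str], Result]


def file_handler(uri: str) -> Result:
    return ("data", "file-system read of " + uri)


def http_handler(uri: str) -> Result:
    return ("data", "HTTP over TCP fetch of " + uri)


def purl_handler(uri: str) -> Result:
    # Stands in for a resolver that answers with another URI (step 6).
    target = uri.replace("http://purl.example/", "http://www.example/", 1)
    return ("redirect", target)


# Step 3: local table mapping a URI prefix to a protocol or process.
PREFIX_TABLE: Dict[str, Handler] = {
    "http://localhost/": file_handler,
    "http://purl.example/": purl_handler,
    "http://": http_handler,  # the "trivial case"
}


def pick_handler(uri: str) -> Optional[Handler]:
    """Choose the handler registered under the longest matching prefix."""
    matches = [p for p in PREFIX_TABLE if uri.startswith(p)]
    if not matches:
        return None
    return PREFIX_TABLE[max(matches, key=len)]


def resolve(uri: str, cache: Dict[str, str], max_hops: int = 5) -> str:
    """Resolve uri, following at most max_hops redirections."""
    original = uri
    for _ in range(max_hops):
        if uri in cache:  # step 1: in-memory cache
            return cache[uri]
        handler = pick_handler(uri)
        if handler is None:
            raise LookupError("no mapping for " + uri)
        kind, value = handler(uri)
        if kind == "redirect":  # step 6: repeat the process
            uri = value
            continue
        cache[original] = value
        return value
    raise LookupError("too many redirections for " + original)
```

Note that the caller of resolve() never names a protocol.  Which one
is used depends entirely on the table, and the last entry is the only
place where "http" happens to mean HTTP.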

--
Daniel LaLiberte (liberte@ncsa.uiuc.edu)
National Center for Supercomputing Applications
http://union.ncsa.uiuc.edu/~liberte/