Re: The Path URN Specification

Michael Shapiro (mshapiro@ncsa.uiuc.edu)
Wed, 22 Mar 1995 22:25:54 -0600 (CST)


From: mshapiro@ncsa.uiuc.edu (Michael Shapiro)
Message-Id: <9503230426.AA24312@void.ncsa.uiuc.edu>
Subject: Re: The Path URN Specification
To: Michael.Mealling@oit.gatech.edu (Michael Mealling)
Date: Wed, 22 Mar 1995 22:25:54 -0600 (CST)
In-Reply-To: <199503220347.WAA26944@oit.gatech.edu> from "Michael Mealling" at Mar 21, 95 10:46:43 pm

Michael Mealling wrote:
|
|Daniel LaLiberte said this:
|>     o If the TXT record is missing, then the URN does not resolve
|>       into a server and the URN is assumed to be invalid. 
|
|Some of my concerns with this approach are the same ones I have with Mitra's
|approach. That concern is that by tying this so closely with DNS we are
|excluding a very very large portion of our user base from becoming involved.
|The power behind the web was that anyone could set one up without going 
|through any official channels. I know that if I had needed to ask for
|a TXT entry in our nameserver (which we don't do and probably won't do) in
|order to put up a http server that we would just now be getting one setup.
|
|In order for URNs to be used there must be no barriers to the system being
|setup by anyone with an IP address. I would argue that scalability 
|inherently depends on this or it will fail. I.e. URNs must be as easy to
|setup and manage as URLs or folx won't use them. They often don't mind 
|setting up additional software but DNS mucking is usually right out...

I'm not sure how to answer this. You may be right or you may be
wrong. If users find the features of the path scheme useful/desirable,
the DNS barrier may not seem insurmountable. How do we test/confirm
your hypothesis?

|
|> To clarify the above algorithm, some examples are presented. The
|> examples use the partial document tree specified previously. The DNS
|> entries for this partial tree are: 
|> 
|>                               TXT           A
|>              a.path.urn     -empty-       -none-
|>           b1.a.path.urn    c2, port=n    ip-address
|>        c2.b1.a.path.urn        port=n    ip-address
|>           b2.a.path.urn   d.c, port=n    ip-address
|>       d.c.b2.a.path.urn        port=n    ip-address
|
|The concern I have here is that we are trading one kind of hot spot for
|another. One of the problems we are trying to solve is the recurrence of
|problems like that poor soul that setup the Shomaker-Levy-9 stuff. In this
|case his machine may still not be that busy but the machine serving
|those URNs is going to be slammed to the wall. We need to be able to
|say that b1 resolves to multiple machines and that the resolution can
|happen on multiple non-cooperating resolution servers. I realize this
|can be done with caching of the DNS records but this function is not 
|ubiquitously good enough among all the bind implementations out there.
|I.e. I don't think Microsoft's bind knock off will do it.
|
|The last problem is almost anecdotal. I maintain my campus' nameserver.
|It can serve the campus nameserver needs fairly well but if anything else
|is loaded on it I fear for our nameserver. This isn't just us either....;-)

The example here lists only one A record per name, but that is not
prescribed. The entry for b1 could have multiple A records. I don't know
(yet) whether the client will see all the A records or whether DNS will
return only one. The DNS resolution won't use gethostbyname(), so the
client may have complete control over getting all the records. Do I need
to state explicitly that if multiple A records occur they are equivalent,
in the sense that any of them could be used to finish the resolution (and
the results would be identical no matter which one is chosen)?
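
To make the "any A record will do" idea concrete, here is a rough
sketch in Python of how a client might walk the list of addresses and
use the first server that answers. Nothing here is part of the draft:
the function name resolve_with_any and the try_server callback are
made up for illustration, and the callback stands in for whatever
actually finishes the resolution against one address.

```python
def resolve_with_any(addresses, try_server):
    """Try each A-record address in turn.

    Returns (address, result) for the first server that answers.
    try_server is a hypothetical callback that finishes the
    resolution against one address and raises OSError on failure.
    """
    errors = []
    for addr in addresses:
        try:
            return addr, try_server(addr)
        except OSError as exc:
            # This server is unreachable; any other A record is
            # assumed to give an identical answer, so move on.
            errors.append((addr, str(exc)))
    raise OSError(f"no server answered: {errors}")
```

Because the records are assumed equivalent, the order in which the
client tries them should not change the result, only which machine
carries the load.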

I need you to explain the "Shomaker-Levy-9 stuff" and what you mean by
"Microsoft's bind knock off".

|As far as using HTTP is concerned it really doesn't matter what you use
|if all you are doing is URN lookup. What I would argue for is something
|a little bit more powerful that can handle at least limited URCs instead
|of just URNs. The possibilities of resource caching and replication become
|much much more valuable. If you do want to distribute lookup of more than
|just URNs then you need some form of query routing and at least a rudimentary
|query language in your protocol. This could be added to HTTP or we could
|use something like Z39.50 or whois++, I don't really care. Just as long
|as it has 3 things:
|
|1. query routing (preferably based on forward knowledge)
|2. multiple query languages based on users needs
|3. multiple data formats (i.e. MARC, TEI, whois++, HTML, etc).
|
|The last two don't seem apparent until you realize that the library community
|wants to play but they aren't about to scrap all of their MARC systems to 
|play. The same goes for the HyTime folx, HyperG, etc. If the system can
|dynamically map between these data formats and query languages then we
|have something that we can use to  solve some current problems and some 
|future ones with....
|

The use of HTTP doesn't preclude URCs at all. The path scheme doesn't
require that the result be a URL. It could be any of the Content-Types
that HTTP can return. If and when the URC is defined as a valid
Content-Type, another Internet-Draft could specify the Content-Types for
URCs, perhaps using a Full-Request with Accept headers to control the
HTTP behavior when the GET resolves to a URC.
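
As an illustration only, a client asking for a URC in preference to
other types might build a request along these lines. The media type
"application/x-urc" and the request path are placeholders, since no
URC Content-Type has been defined; build_full_request is likewise a
made-up helper name.

```python
def build_full_request(path, accept_types):
    """Build an HTTP/1.0 Full-Request with one Accept header per type.

    The types passed in are whatever the client is willing to take;
    "application/x-urc" below is a hypothetical placeholder.
    """
    lines = [f"GET {path} HTTP/1.0"]
    lines += [f"Accept: {media_type}" for media_type in accept_types]
    # A blank line terminates the request headers.
    return "\r\n".join(lines) + "\r\n\r\n"


# Hypothetical path and media types, for illustration only.
request = build_full_request("/c2/doc", ["application/x-urc", "text/html"])
```

The server would then be free to answer with whichever of the listed
types it can produce for that name.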

Also, I have discussed with Dan the idea of adding the protocol into
the TXT record as well. This would allow path URNs to have different
protocols which the client discovers along with the server IP-address
and port.  This would mean that these URNs would first resolve into
protocol:ip-address:port, which then is used to finish the resolution.
If you think this useful we could add it.
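
A rough sketch of what parsing such a TXT record could look like,
assuming the "name, port=n" layout shown in the example table and a
"protocol=" field added next to "port=". The "protocol=" spelling, the
default of http, and the function names are guesses for illustration;
none of this syntax is settled.

```python
def parse_txt(txt, default_protocol="http"):
    """Split a TXT value such as 'c2, protocol=http, port=8080'.

    Items containing '=' are treated as key=value fields; anything
    else is kept as a name listed in the record. The 'protocol'
    field is an assumed extension, not existing syntax.
    """
    names, fields = [], {}
    for item in (part.strip() for part in txt.split(",")):
        if not item:
            continue
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key.strip()] = value.strip()
        else:
            names.append(item)
    return {
        "names": names,
        "protocol": fields.get("protocol", default_protocol),
        "port": int(fields["port"]) if "port" in fields else None,
    }


def first_stage(txt, ip_address):
    """Return the intermediate 'protocol:ip-address:port' form."""
    info = parse_txt(txt)
    if info["port"] is None:
        raise ValueError("TXT record carries no port")
    return f"{info['protocol']}:{ip_address}:{info['port']}"
```

With this layout an existing record that has no protocol field would
still parse and fall back to the default, so adding the field need not
break records written for the current scheme.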

-- 
Michael Shapiro                   mshapiro@ncsa.uiuc.edu
NCSA                              (217) 244-6642
605 E Springfield Ave. RM 152CAB  fax: (217) 333-5973
Champaign, IL 61820