Re: Belated comments on OCLC's proposal

Mitra (mitra@regis.prod.kaworlds.com)
Wed, 21 Jun 1995 11:18:18 -0700


Date: Wed, 21 Jun 1995 11:18:18 -0700
Message-Id: <ac0dad5203021004fdd7@[204.217.20.30]>
To: tkac@oclc.org (Vincent Tkac)
From: mitra@regis.prod.kaworlds.com (Mitra)
Subject: Re: Belated comments on OCLC's proposal
Cc: uri@bunyip.com, weibel@oclc.org (Stu Weibel)

At 12:55 PM 6/21/95, Vincent Tkac wrote:
>> My objection is that this scheme requires a choice between persistence and
>> name space, i.e. if objects I produce are to have persistence then I need
>> to register in the FLAT name-space. A hierarchical name-space, as used in
>> our scheme (and others) allows us to have both.
>
>Our name space supports both a hierarchical name space (DNS with FQDN or IP)
>and a "flat" registered name space.  The primary focuses of the I-D are
>services, unregistered name authorities and RPs not the 'flatness' of
>the name authority space.
>
>The I-D is being changed to include a hierarchy delimiter in the name
>authority ID.  It will be possible to extend the name space hierarchically
>if necessary.

Vincent - this gets back to the main tricky point, which is how to locate
the resolver fast in a hierarchical name space.
>
>
>> >  > 3.1 "can be resolved by *a* name authority ID resolver"
>> >  > If this is really "a" resolver, then how does the client determine which
>> >  > one to use, if it should be worded "any" then we have to have a fully (not
>> >  > partially) replicated Naming Authority Service.
>> >
>> >We are in agreement with the Handle folks that there needs to be a global
>> >naming authority registry, so that a client can find out that:
>> >
>> >  N2L://OCLC/12345
>> >
>> >can be resolved by a machine named, say, urn.oclc.org:nnnn
>> >
>> >Again, we expect there to be many fewer NAs than DNS hosts, so such
>> >mappings will be widely cached, and the global resolution service will
>> >be invoked primarily to refresh caches.
>>
>> This all depends on your assumptions about the size of the NA space. Please
>> correct the document to make it clear that you are dependent on a single
>> replicated Naming Authority Service. I believe this is enough to make this
>> proposal unscalable.
>
>Note that the NAs in our proposal are of two types: registered NAs and
>unregistered NAs.  The registered NAs are dependent on a single,
>replicated, cachable Naming Authority Registry.  This is the top level
>of the tree.  We are not depending on a single replicated Name Resolver.
>The unregistered NAs are dependent on DNS.

It's the same issue: depending on a single registry, even if it's cached and
replicable, is a problem because the database to be replicated is huge, and
the updates/procedures to keep it up to date will put a significant load on
the system. By going hierarchical - as in your earlier comment - you avoid
this problem, but then you have to describe how (in your scheme) you locate
the registry.

>
>> > eg. 1 You need to show, in a document, that different places resolve
>> >       the same URN differently (a very plausible event in, for example,
>> >       a CERT advisory).  This can only be done by specifying
>> >       different resolution paths for identical URNs.
>>
>> A CERT advisory such as this is a textual document describing a breakdown in
>> a system, since (under normal usage) URNs should be resolved the same way
>> no matter where you start the process, then showing a URN resolving
>> differently is purely an artifact of how you write your document. E.g.
>> "CERT Advisory - the URN resolution cache at xyz.com is misresolving URNs
>> to point to trojan.horses.com instead of their real home". I can't think of
>> a real example where this is required.
>
>We disagree that a URN should be resolved to the same instantiation
>of an object or its metadata.  Different companies and organizations can
>and will resolve the same URN to differing instantiations.  This may
>be a result of political, institutional, etc. requirements to deliver
>a particular type (or screen other types) of information.
>The URN system must, by design, facilitate this ability.  If
>it does not then this ability will be achieved outside of the URN scheme.

I believe retrieval may happen from different places; resolution shouldn't.
>
>If I am a user with a URN of LOC/1234 and OCLC charges $1.00 to resolve it to
>some URC, I want the ability to go to GNU and get a, perhaps less complete,
>URC for free.
>
>Similarly, the URN NYSE/IBM can be resolved by various resolution servers.
>Do you disagree that different companies can resolve this differently or
>that one company may charge for a stock quote while another doesn't?

Not at all - but I disagree that this should happen at the resolution step.
The resolution should return a list of URLs along with attributes (such as
cost), and then your client should pick which of the URLs to use. Resolving
URN:NYSE/IBM tells you who serves this data; it doesn't give you the data.
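
To make the split between resolution and selection concrete, here is a rough
sketch, not a proposed protocol. The Location record, its cost_usd and
complete attributes, the in-memory table, and the example.com/example.org
URLs are all made up for illustration; only the idea of "resolver returns
every location with attributes, client chooses" comes from the paragraph
above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Location:
    """One place that serves the named object (hypothetical record layout)."""
    url: str
    cost_usd: float   # assumed attribute: price charged by this server
    complete: bool    # assumed attribute: whether the data is the full version


def resolve(urn: str) -> List[Location]:
    """Stand-in resolver: returns every known location and makes no choice."""
    table = {
        "URN:NYSE/IBM": [
            Location("http://quotes.example.com/ibm", 1.00, True),
            Location("http://free.example.org/ibm", 0.00, False),
        ],
    }
    return table.get(urn, [])


def pick_cheapest(locations: List[Location]) -> Location:
    """Client-side policy: here, simply the lowest cost."""
    return min(locations, key=lambda loc: loc.cost_usd)


choices = resolve("URN:NYSE/IBM")   # same answer wherever you ask
chosen = pick_cheapest(choices)     # the choice is the client's, not the resolver's
print(chosen.url)
```

A different client could apply a different policy (say, only complete data)
to the same list without the resolution itself changing.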
>
>> But in this example, you are modifying a service not a URN. In this case
>> you have specified a service and expected the client to know what protocols
>> to use to talk to it. If I see the above identifier, I'm not going to know
>> what protocol to talk to this service. If it is a new HTTP service which
>> turns URNs into citations it would be better expressed as a URL for the
>> search service, where the URN was a parameter of the search.
>
>We are modifying what the URN should be resolved to (the info we are
>requesting) not the protocol that the information will be returned in.
>We would still be using the same protocol.  There are many reasons why
>turning it into a URL is a bad idea.  The most obvious is that we then
>impede persistence.

I think you misread my paragraph. I'm suggesting the search SERVICE is a
URL; the object being searched for would still be represented as a URN.
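
As a small illustration of what I mean (the service address and the "urn"
parameter name are invented for the example, not taken from any draft), the
service is addressed by a URL and the URN travels unchanged as a parameter:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def citation_search_url(service_base: str, urn: str) -> str:
    """Build a URL for a (hypothetical) citation search service.

    The URL locates the service; the URN is only a query parameter,
    so the object keeps its persistent name.
    """
    return service_base + "?" + urlencode({"urn": urn})


url = citation_search_url("http://search.example.com/citations", "URN:LOC/1234")
print(url)

# The service can recover the original URN from the query string.
recovered = parse_qs(urlsplit(url).query)["urn"][0]
print(recovered)
```

If the service moves, only the URL changes; the URN passed to it does not.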
>
>
>>  > 6.0 Fulfillment of Requirements
>
>> >Persistence is promoted by the use of Registered URNs.  The usefulness of
>> >having an informal path using the same syntax in no way impedes
>> >persistence, and in fact, may result in a URN space marginally less
>> >cluttered with ephemera that do not need persistent naming.
>>
>> Of course it impedes it, these unregistered URNs are not persistent, and
>> without having them name a substantial portion of documents the URN service
>> won't scale. It's almost the same as saying we have a URN service which will
>> only work if it's only used for a small fraction of the total documents
>> created.
>
>Scalability aside (since it is discussed elsewhere) it does not impede
>persistence.  If one wants a persistent URN it is possible to have one.
>Persistence is a function of the entire system not just the name scheme.

This should go back to the requirements process then. I understood the
requirements as being a union, not an exclusive OR. I don't believe that a
service that provides Scalability OR Persistence for each URN fulfills the
requirements. I believe we require something that supports a scalable number
of persistent URNs.

>> >  > Scalability: NO - see comment on 1.0 IV
>> >
>> >There will be many fewer NAs than domain names... probably by several
>> >orders of magnitude.  The Name Authority Registry can be widely cached,
>> >providing fast, local lookup for resolution services.  We adopt Paul's
>> >suggestion of a reserved character (#) to provide for expansion of the
>> >hierarchy in the future if necessary.
>>
>> I just don't believe this premise, every student at some time produces a
>> document that is worthy of a persistent pointer.
>
>Every student will not become a naming authority.  If it was the case that
>every student that produced a paper had to become a registered naming
>authority and incur the cost of indefinitely providing resolution services
>of various types, the system would not be maintainable and would not be used.
>
>What will realistically happen is that when a student produces a paper that
>is worthy of a persistent name, the name will be assigned by either the
>university, or department, of the particular student.  A publication that
>is considered worthy of persistence (one that has been peer-reviewed,
>re-written, re-submitted, etc.) would probably be deposited in
>a publisher's repository.

So a very significant number of objects won't be representable by persistent
URNs, which gives us the same quality problems we have with current URLs.


- Mitra

=======================================================================
Mitra                                                mitra@kaworlds.com
Worlds Inc                                                (415)281-1308
<http://earth.path.net/mitra>                         fax (415)284-9483