Re: URI: Name or Network Location?

>  >     info:lccn/n78890351
vs
>  >     http://info-uri.info/lccn/n78890351
> 
> But if these are non-dereferencible URIs, how do you stop every RDF
> web-crawler, information gatherer and clueless agent on the planet
> from attempting to HTTP-GET/MGET the billions of URIs in the
> namespace?  
> Unless I'm missing something, as the number of these scales up, so too
> does the amount of resources used in tackling 404'd requests.

I'll be surprised if that turns out to be the big inefficiency of the
Semantic Web.  Retrieving megabytes of data that turns out not to be
what you want -- that's much worse.  Especially if you have to compute
for a long time to know if it's useless stuff.  So I expect metadata
to be very valuable, saying which URIs are useful for what.  Kind of
like the stuff search engines already store about each URI.
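
To make that a bit more concrete, here is a rough sketch of how an agent
might decide whether a URI is worth fetching before it issues any request.
Everything in it is an assumption for illustration: the function name, the
HINTS table, the "bibliographic" label, and the choice to skip URIs that
have no metadata at all.

```python
from urllib.parse import urlsplit

# Schemes this hypothetical agent knows how to fetch.
FETCHABLE_SCHEMES = {"http", "https"}

# Hypothetical metadata: URI prefix -> what the content is said to be useful for.
# In practice this would come from some published description of the namespace.
HINTS = {
    "http://info-uri.info/lccn/": {"bibliographic"},
}


def should_dereference(uri, wanted, hints=HINTS, default=False):
    """Return True if the agent should bother fetching `uri` for purpose `wanted`."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme not in FETCHABLE_SCHEMES:
        # e.g. info: or tag: URIs, which this agent has no way to fetch
        return False
    for prefix, uses in hints.items():
        if uri.startswith(prefix):
            return wanted in uses
    # No metadata known for this URI: fall back to the caller's policy.
    return default


if __name__ == "__main__":
    for uri in ("info:lccn/n78890351", "http://info-uri.info/lccn/n78890351"):
        print(uri, should_dereference(uri, "bibliographic"))
```

The scheme check costs nothing, so an info: URI never generates a 404 in
the first place; the metadata check is what saves the expensive fetches.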

> The only solution I can think of is to invent a dud subdomain (that
> doesn't exist) and let the DNS infrastructure deal with the 'doesn't
> exist' load (which it's much better placed to do).
> 
> But then if you are going to do that, why not just invent a
> non-dereferencible URI scheme... Doh!

Because it lets you change your mind later.   And if each organization
(lccn, not info-uri.info) provides their own domain, it can be changed
on a per-organization basis.

Mostly I think it'll turn out to be useful and cost-effective to make
all these URIs dereferenceable.  When people really understand how
this all works, they'll realize it's often dumb to make a
non-dereferenceable identifier.  If I'm going to go to the trouble to
create and publish an identifier for something, I want to leverage my
owning that identifier by getting into the dereference loop among
folks who choose to dereference.  But we'll see.  I gather Atom folks
are now using my non-dereferenceable tag: URI scheme, despite my
argument against it.

     -- sandro

Received on Saturday, 24 January 2004 15:05:21 UTC