- From: Patrick Stickler <patrick.stickler@nokia.com>
- Date: Mon, 26 Jan 2004 10:26:50 +0200
- To: "ext Phil Dawes" <pdawes@users.sourceforge.net>
- Cc: ext Sandro Hawke <sandro@w3.org>, "ext Hammond, Tony (ELSLON)" <T.Hammond@elsevier.com>, "Thomas B. Passin" <tpassin@comcast.net>, ext Jeremy Carroll <jjc@hplb.hpl.hp.com>, www-rdf-interest@w3.org
On Jan 23, 2004, at 06:28, ext Phil Dawes wrote:

> Hi Patrick,
>
> Patrick Stickler writes:
>>
>> http: based PURLs work just fine. As I've pointed out before, you
>> can accomplish all that you aim to accomplish with the info: URI
>> scheme by simply using http: URIs grounded in your top level
>> domain, delegating control of subtrees of that namespace to the
>> various managing entities per each subscheme (the same is true
>> of urn: URIs). Then each http: URI can be associated with an
>> alias to which it redirects, as well as allow for access to
>> metadata descriptions via solutions such as URIQA[1]. E.g.
>> rather than
>>
>>    info:lccn/n78890351
>>
>> you'd have
>>
>>    http://info-uri.info/lccn/n78890351
>>
>
> But if these are non-dereferencible URIs,

They are not non-dereferenceable. They simply may not resolve to any representations. There's a difference.

> how do you stop every RDF
> web-crawler, information gatherer and clueless agent on the planet
> from attempting to HTTP-GET/MGET the billions of URIs in the
> namespace?

Why would you care to?

And what if key authoritative knowledge *is*, by common good practice, made available via resolution of those URIs? Why would you want to prohibit such an effective means of knowledge interchange?

> Unless I'm missing something, as the number of these scale up, so too
> does the amount of resources used in tackling 404'd requests.

I'm sorry, but it seems to me that you are seeing ghosts and phantoms where there are none.

If a GET or MGET to some URI fails to provide a useful response, then it just does, and crawlers will move on. But if those requests *do* provide a useful response, then those indices are all the more useful.

My "vision" is that there would arise a new breed of crawler that gathers knowledge about resources (rather than merely indexing the textual content of representations), allowing for far more precise and effective searching of web-accessible content.

With URIQA[1] in conjunction with HTTP, one URI can (potentially) provide you with representations and/or a formal (RDF) description. Note the "and/or": for any given resource there may be available only a representation, or only a description (e.g. for vocabulary terms), or both.

Thus, because I consider knowledge about resources (even terms in controlled vocabularies such as those that are a primary focus of the info: URI scheme) to be highly valuable information that should be accessible to software agents in a consistent, efficient manner, I simply can't fathom any real benefit to having a URI which, by definition, cannot be used to access such knowledge.

Here's a simple use case to illustrate: a software agent encounters some URI info:foo:blargh. It has no idea what it means. It's stuck, or it has to rely on proprietary, hard-coded means to discover what that term means.

Alternatively, there is an analogous URI http://info.org/foo/blargh whose owner has made an RDF description of that resource accessible whenever a request

   MGET http://info.org/foo/blargh HTTP/1.0

is issued (the details of the resolution being based on redirection from the root info.org server to the subnamespace owner's server, etc.). Now that software agent has a formal definition of what that URI denotes, and can (possibly/hopefully) do something useful with that information in its subsequent processing.
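Here is a minimal sketch of what such an agent might do, assuming a URIQA-aware server reachable at info.org and using Python's standard http.client; the Accept header and the RDF/XML response type are illustrative assumptions rather than anything URIQA mandates:

    import http.client

    # Ask the (hypothetical) info.org server to describe the resource,
    # using the URIQA MGET method instead of an ordinary GET.
    conn = http.client.HTTPConnection("info.org")
    conn.request("MGET", "/foo/blargh",
                 headers={"Accept": "application/rdf+xml"})
    resp = conn.getresponse()
    if resp.status == 200:
        description = resp.read().decode("utf-8")  # RDF describing the term
        # ... hand the triples to the agent's knowledge base ...
    else:
        pass  # no description available; the agent simply moves on
    conn.close()

And if the same URI also resolves to a representation, a plain GET works exactly as it always has; the description and the representation never get in each other's way.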
Now, maybe most folks who would mint http://info.org/* URIs won't care or bother to provide either representations or descriptions of their resources -- but for those who do, we can all then exploit a globally deployed, consistent, and efficient solution for accessing those representations and descriptions.

It is for this reason that I am against solutions such as the info: URI scheme which deliberately hobble the web rather than leaving it up to the users to decide (especially since such decisions can change, and if your technology precludes changing your mind, then you're just out of luck).

> The only solution I can think of is to invent a dud subdomain (that
> doesn't exist) and let the DNS infrastructure deal with the 'doesn't
> exist' load (which it's much better placed to do).
>
> But then if you are going to do that, why not just invent a
> non-dereferencible URI scheme... Doh!

Again, I just don't see why folks would be opposed to (potentially) dereferenceable URIs.

Patrick

> Cheers,
>
> Phil

--
Patrick Stickler
Nokia, Finland
patrick.stickler@nokia.com
Received on Monday, 26 January 2004 03:35:59 UTC