- From: David R. Karger <karger@theory.lcs.mit.edu>
- Date: Mon, 7 Apr 2003 02:10:59 -0400
- To: ks@micky.hpl.hp.com
- CC: Mark_Butler@hplb.hpl.hp.com, www-rdf-dspace@w3.org
   Date: Fri, 4 Apr 2003 15:22:22 -0800
   From: Kevin Smathers <ks@micky.hpl.hp.com>
   Cc: SIMILE public list <www-rdf-dspace@w3.org>

   Hi David,

   > -----Original Message-----
   > From: David R. Karger [mailto:karger@theory.lcs.mit.edu]
   > Sent: 04 April 2003 06:09
   > To: mick.bass@hp.com; Mark_Butler@hplb.hpl.hp.com
   > Subject: notes on use cases
   [...]
   > 3.2.7
   >
   > Two issues seem scrambled here.  One is "how are things named" and the
   > obvious answer is "URIs".  A separate one is "Should there be
   > canonical URIs that can be deduced from what you are looking for"?  I
   > believe the answer to the second is no.  URIs should be opaque (for
   > example, random to avoid collisions).  The process of "figuring out
   > the right URI for something" is a type of search/retrieval problem.
   > Instead of squeezing this search/retrieval into a specialized "figure
   > out the URL" task, incorporate it in the standard search framework.
   > Any information used to define a canonical URL can instead be used as
   > metadata on the object, and any knowledge of how to construct the URL
   > can then be turned into a specification of its metadata.
   >

   Randomness avoids collision?  I would say rather that federation
   avoids collision, randomness only invites it.

Depends.  If URLs are random 128-bit integers, collisions are
hell-freezes-over unlikely.

   In any case, I wasn't trying to describe a system of URL naming which
   incorporates a query into the name, eg:

   http://google.com/search?q=switch+stoned+chicks&btnI=FeelingLucky
   http://www.apple.com/switch/ads/ellenfeiss.html

   Rather I intended the description of naming to read as a system for
   identifying resources smaller than the atomic document.
This is arguably convenient, in that it permanently binds the smaller
object to its containing object, giving you the semantics that if you
are looking for the smaller object it is a good subgoal to look for
the containing object.  But what if the contained object is inside two
distinct objects?  Which URL is right?  What if someone doesn't know
the object is contained?  They will give it a third URL.

I will put in a plug for my favorite URL when possible, namely an MD5
hash of the object.  This is possible whenever the object is its
bits---eg a document, an email address, a (reified) RDF triple, but not
a person or a dynamic web site.  The nice thing about MD5 URLs is that
they provide a canonical naming rule that reduces the odds of getting
multiple names for the same object, reducing the need for inference
about equivalence.

   The URL remains the same, there just needs to be a way of
   interpreting additional constraints on the content within that URL.
   I think this is analogous to the identification of a ViewPart to
   extract a particular view of an object within Haystack (do I
   remember a SongPreview10Seconds View Part, or something similar?).
   In this case the preview isn't meant to be an aspect of the part
   however, but a name, which should be interpreted by the part to
   extract the relevant information.

Having trouble parsing this.

   > 3.4
   >
   > Distributed resources adds whole different scope.  It opens up a host
   > of nasty problems of course.  We could avoid them by limiting our
   > dealing with distributed metadata to devising a simple
   > block-transfer protocol, getting all the metadata to a single
   > location, and dealing with it there.  The metadata might not be fully
   > up to date, but it avoids a lot of trouble.  Since even in centralized
   > scenario everything is hard, perhaps we defer distributed search?
   >

   To my mind, Semantic Web without the Web is just Semantic Filesystem.

Perhaps, but nobody knows how to build a decent semantic filesystem.
While one might argue that google (analogue of what I said about
centralized scenario above) is "just filesystem", it is in fact a big
step forward because all the information it centralizes is interlinked
in an interesting way.

d
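
The claim above that random 128-bit names essentially never collide can be checked with the standard birthday-bound approximation, p ~ 1 - exp(-n(n-1) / 2^(bits+1)). What follows is only a rough sketch: the function names and the "urn:x-random:" prefix are made up for illustration and are not part of any proposal in this thread.

```python
import math
import secrets

# Hypothetical prefix for illustration only; not a registered URN namespace.
RANDOM_PREFIX = "urn:x-random:"

def random_uri() -> str:
    """Mint an opaque name from 128 random bits (32 hex digits)."""
    return RANDOM_PREFIX + secrets.token_hex(16)

def collision_probability(n_names: int, bits: int = 128) -> float:
    """Birthday-bound approximation of the chance that at least two of
    n_names independently drawn random names coincide."""
    space = 2.0 ** bits
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-n_names * (n_names - 1) / (2.0 * space))

if __name__ == "__main__":
    print(random_uri())
    # Even a trillion names leaves the collision probability negligible.
    print(collision_probability(10**12))
```

Under these assumptions, collisions only become likely once the number of names approaches 2^64, the square root of the name space.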
Received on Monday, 7 April 2003 02:07:05 UTC
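
As a small illustration of the MD5-based naming suggested in the message, here is a minimal sketch of deriving a name from an object's bits. The "urn:md5:" prefix and the function name are assumptions made for this example; the message does not specify a URI syntax.

```python
import hashlib

# Hypothetical prefix for illustration; the message does not fix a syntax.
MD5_PREFIX = "urn:md5:"

def md5_uri(content: bytes) -> str:
    """Derive a name from the object's bits: identical bytes always yield
    the same name, wherever the object happens to be stored."""
    return MD5_PREFIX + hashlib.md5(content).hexdigest()

if __name__ == "__main__":
    # The same bytes found inside two different containers get one name,
    # so no inference is needed to decide that the two copies are equal.
    copy_in_archive_a = b"abc"
    copy_in_archive_b = bytes("abc", "ascii")
    print(md5_uri(copy_in_archive_a))
    print(md5_uri(copy_in_archive_a) == md5_uri(copy_in_archive_b))
```

This only applies to objects that are their bits, as the message says; a person or a dynamic web site has no fixed byte string to hash. MD5 is also no longer collision-resistant against deliberate attack, so a sketch like this could swap in hashlib.sha256 without changing the idea.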