Re: FW: notes on use cases from Kevin Smathers on 2003-04-07 (www-rdf-dspace@w3.org from April 2003)

From: Kevin Smathers <ks@micky.hpl.hp.com>
Date: Mon, 7 Apr 2003 08:47:07 -0700
To: "David R. Karger" <karger@theory.lcs.mit.edu>
Cc: Mark_Butler@hplb.hpl.hp.com, www-rdf-dspace@w3.org
Message-ID: <20030407084707.A31622@micky.hpl.hp.com>
On Mon, Apr 07, 2003 at 02:10:59AM -0400, David R. Karger wrote:
>    Randomness avoids collision?  I would say rather that federation
>    avoids collision, randomness only invites it.  
> 
> Depends.  If URLs are random 128-bit integers, collisions are
> hell-freezes-over unlikely.
> 

Assuming one has a mechanism for enforcing randomness, then I agree.
The problem with MD5 sums is that the content behind the URL becomes
immutably bound to the URL itself.

Invariant documents have lots of nice features: distribution, cache
control, and cache verification become trivial. On the downside,
there is no consistent address for the top of the tree of a document's
history.
If you want to be able to modify your document after publishing its
MD5-sum URL, then you will also have to accept that maintaining 
randomness is cooperative.
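
To make the MD5 point concrete, here is a minimal sketch of a
content-derived identifier. The 'urn:md5:' prefix and the function name
are assumptions made up for illustration, not a scheme anyone on this
list has proposed. It shows that any edit to the document yields a
different address, so there is no stable name for "the latest version".

```python
import hashlib

def content_url(content: bytes, prefix: str = "urn:md5:") -> str:
    """Derive an identifier from the bytes of the document itself.

    The prefix is a hypothetical placeholder; only the digest matters.
    """
    return prefix + hashlib.md5(content).hexdigest()

# Two revisions of the same logical document.
v1 = content_url(b"draft one")
v2 = content_url(b"draft two")

# The same bytes always map to the same identifier (cache verification),
# but a changed document necessarily gets a new identifier.
same_again = content_url(b"draft one")
```

Here v1 and same_again are equal, while v1 and v2 differ, which is
exactly why a mutable document needs some other, cooperatively assigned
name.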

Cooperative randomness reduces the likelihood of collision from 
hell-freezes-over unlikely, to logical-truth.
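
For scale, the "hell-freezes-over" figure for truly random 128-bit
identifiers can be estimated with the standard birthday approximation.
This is only a sketch: it assumes identifiers are drawn uniformly and
independently, which is precisely the property that is hard to enforce,
and the identifier count below is an arbitrary illustration.

```python
import math

def collision_probability(n: int, bits: int = 128) -> float:
    """Approximate chance that at least two of n uniformly random
    identifiers of the given width collide (birthday approximation):
    p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    pairs = n * (n - 1) / 2
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(-pairs / 2**bits)

# Illustrative only: a trillion identifiers in a 128-bit space.
p = collision_probability(10**12)
```

With these assumptions p is far below one in a trillion; the estimate
says nothing once the identifiers stop being independent random draws.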

[...]
> 
>    The URL
>    remains the same, there just needs to be a way of interpreting additional
>    constraints on the content within that URL.  I think this is analogous
>    to the identification of a ViewPart to extract a particular view of
>    an object within Haystack (do I remember a SongPreview10Seconds View
>    Part, or something similar?).  In this case the preview isn't meant
>    to be an aspect of the part however, but a name, which should be 
>    interpreted by the part to extract the relevant information.
> 
> Having trouble parsing this.
> 

Perhaps I can clarify with an example.

Consider a DVD archive that contains the theatrical release of "The Lord 
of the Rings". For the sake of argument, the URL for this sample asset
is 'http://simile.org/the-lord-of-the-rings-theatrical-release.dvd'.

Now suppose I have created a DVD player that will read metadata 
describing any movie and use it to modify the way that movie is played 
back.  For example, my DVD player can read metadata describing scenes 
that depict violence, and remove them during playback of the movie.

Obviously the metadata read by the DVD player will have to include
data that identifies the parts of the overall movie that represent the
selected content.  Using a URL to represent the content is insufficient
-- we can't create new URLs for every possible subregion of a movie, 
and even if we did so, such an approach wouldn't help in finding and 
playing back parts of the movie that do not correspond to that URL.

Naming, as described in section 3.2.7, has nothing to do with
the URL for the asset.  The purpose of naming is to create a linkage
between the metadata and the movie subregion.
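
A rough sketch of that linkage for the DVD example follows. It assumes
a subregion is named by start and end offsets in seconds, which is only
one possible naming scheme; the RegionName type, the playable_segments
function, and the offsets are hypothetical and not taken from the
use-case document.

```python
from dataclasses import dataclass

MOVIE = "http://simile.org/the-lord-of-the-rings-theatrical-release.dvd"

@dataclass(frozen=True)
class RegionName:
    """A name for part of an asset: the asset URL stays unchanged and
    the offsets are interpreted by the player for that asset type."""
    asset_url: str
    start_s: float
    end_s: float

def playable_segments(duration_s, skipped):
    """Return the (start, end) spans left after removing named regions."""
    segments = []
    cursor = 0.0
    for region in sorted(skipped, key=lambda r: r.start_s):
        if region.start_s > cursor:
            segments.append((cursor, region.start_s))
        # Overlapping regions simply extend the skipped span.
        cursor = max(cursor, region.end_s)
    if cursor < duration_s:
        segments.append((cursor, duration_s))
    return segments

# Metadata tagging two overlapping violent scenes (made-up offsets).
violent = [RegionName(MOVIE, 600, 660), RegionName(MOVIE, 630, 700)]
segments = playable_segments(1000, violent)
```

The metadata refers to the one movie URL throughout; only the name
(here a time range) distinguishes the subregion, and a different asset
type would need a different kind of name.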

Stepping out of our example, the purpose of Naming in this document
is to represent other assets in ways that URLs cannot.  Such linkages
are necessarily specific to the type of data being indexed, so they
cannot be generalized to a single technology, but that doesn't mean 
that we can't create a pattern around them.

The rest of that paragraph is a (probably lame) attempt to link this 
pattern to its structural equivalent in Haystack.

> 
>    To my mind, Semantic Web without the Web is just Semantic Filesystem.
> 
> Perhaps, but nobody knows how to build a decent semantic filesystem.
> While one might argue that google (analogue of what I said about
> centralized scenario above) is "just filesystem", it is in fact a big
> step forward because all the information it centralizes is interlinked
> in an interesting way.
> 

I would not argue that Google is 'just filesystem'.

I think that Google's approach of gathering metadata and querying 
locally is a completely valid approach to solving the distribution 
problem for metadata.  There may be other approaches that are 
also useful, especially for specific types of data such as invariant 
data, or strongly partitioned (federated) data.

Cheers,
-kls
-- 
========================================================
   Kevin Smathers                kevin.smathers@hp.com    
   Hewlett-Packard               kevin@ank.com            
   Palo Alto Research Lab                                 
   1501 Page Mill Rd.            650-857-4477 work        
   M/S 1135                      650-852-8186 fax         
   Palo Alto, CA 94304           510-247-1031 home        
========================================================
use "Standard::Disclaimer";
carp("This message was printed on 100% recycled bits.");
Received on Monday, 7 April 2003 11:24:19 UTC