Re: Representing distinct item states from Kevin Smathers on 2003-05-19 (www-rdf-dspace@w3.org from May 2003)

From: Kevin Smathers <kevin.smathers@hp.com>
Date: Mon, 19 May 2003 12:23:25 -0400 (EDT)
To: "Tansley, Robert" <robert.tansley@hp.com>
Cc: "Butler, Mark" <Mark_Butler@hplb.hpl.hp.com>, "'Jason Kinner'" <jason_kinner@dynamicdigitalmedia.com>, www-rdf-dspace <www-rdf-dspace@w3.org>
Message-ID: <3EC905D1.2040005@hp.com>

Tansley, Robert wrote:

>I really don't think, given the purpose and use of the History system, that it needs to be optimised for query.  Unlike the rest of DSpace, it's a write-lots, query-not-so-often subsystem.  The important thing is that it has the right data, and has it securely.  We can always build/use tools that can do the indexing, inferring and optimising and point them at the History data later.
>  
>
I thought that Mick's goal for the integration of DSpace and Haystack 
was in part to allow users to browse along the dimensionality of the 
history system as easily (and thus as quickly) as they browse the 
current fixed hierarchy. 

>Queries like 'Which items in C have S metadata that were not prepared using T' do not in my opinion need to be responsively fulfilled, unlike say a typical end-user's query of descriptive metadata looking for content.  I also very much do not think that the History system is going to be the place that people go to query 'live' or current data to answer typical end-users' queries.
>  
>
At least one vector within the history system is likely to be very 
common -- the list of recent changes.   It isn't too hard, either, to 
imagine navigating to a subregion of the database and then following a 
history vector constrained to the histories that apply to the current 
navigation context.  I think in the end you'll find that queries of 
the history system are more common than you suppose here.  Taken as a 
whole, 'Which items in C have S metadata that were not prepared using 
T' might seem an obscure query, but obscure queries can easily be 
constructed piecemeal by the user selecting an unusual vector for the 
next stage of narrowing their search.
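
To make the piecemeal narrowing concrete, here is a minimal sketch in 
Python.  Everything in it is an assumption for illustration: the 
HistoryEvent record, its field names, and the narrow/exclude/
recent_changes helpers are hypothetical and are not part of DSpace or 
the History system.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class HistoryEvent:
    # Hypothetical record shape; field names are illustrative only.
    item_id: str
    collection: str
    schema: str
    tool: str
    timestamp: datetime


def narrow(events, **wanted):
    """Keep events whose named attributes equal the given values."""
    return [e for e in events
            if all(getattr(e, k) == v for k, v in wanted.items())]


def exclude(events, **unwanted):
    """Drop events whose named attributes equal any of the given values."""
    return [e for e in events
            if not any(getattr(e, k) == v for k, v in unwanted.items())]


def recent_changes(events, limit=10):
    """Newest-first list of events, truncated to `limit`."""
    return sorted(events, key=lambda e: e.timestamp, reverse=True)[:limit]


events = [
    HistoryEvent("item-1", "C", "S", "T", datetime(2003, 5, 1)),
    HistoryEvent("item-2", "C", "S", "U", datetime(2003, 5, 10)),
    HistoryEvent("item-3", "D", "S", "U", datetime(2003, 5, 12)),
    HistoryEvent("item-4", "C", "R", "U", datetime(2003, 5, 15)),
]

# Each step is an ordinary navigation choice; together they form the
# "obscure" query: items in C with S metadata not prepared using T.
in_c = narrow(events, collection="C")
with_s = narrow(in_c, schema="S")
not_t = exclude(with_s, tool="T")
```

No single step above is unusual; recent_changes(in_c) is just the 
recent-changes list for the current navigation context, and the 
obscure query only appears as the composition of three ordinary 
choices.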

>[...]
>
>In terms of the History system, I think a more viable approach for storing states of things would be to store them as serialised AIPs, making use of the METS AIP work that is to be done soon (hopefully; in terms of the DSpace effort I think that should be its no.1 priority at the moment.)  These METS AIPs could be represented as MD5/SHA-based URIs in the History store model, and since all of the metadata to do with individual items is held in a serialised form in these AIPs, the modelling tasks of the History system would be greatly simplified.
>
>  
>
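For reference, the hash-based naming suggested in the quoted text could 
be sketched roughly as follows.  This assumes the serialised AIP is 
available as bytes; the content_uri helper and the "urn:<algorithm>:" 
form are hypothetical choices for illustration, not a scheme defined by 
DSpace or METS.

```python
import hashlib


def content_uri(serialised: bytes, algorithm: str = "sha1") -> str:
    """Derive a URI from the digest of a serialised package.

    The "urn:<algorithm>:<hex digest>" layout is an assumption made
    for this sketch only.
    """
    digest = hashlib.new(algorithm, serialised).hexdigest()
    return f"urn:{algorithm}:{digest}"
```

Because the URI depends only on the bytes, two states of an item get 
distinct names exactly when their serialised forms differ.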
It is interesting that although every other computer language has opted 
to combine pointer/reference syntax with structure syntax, the Web 
chooses instead to divide XML and RDF, thereby requiring RDF advocates 
to shoehorn structures into references, and conversely requiring XML 
advocates to come up with bizarre query languages for representing 
pointers.

-- 
========================================================
   Kevin Smathers                kevin.smathers@hp.com    
   Hewlett-Packard               kevin@ank.com            
   Palo Alto Research Lab                                 
   1501 Page Mill Rd.            650-857-4477 work        
   M/S 1135                      650-852-8186 fax         
   Palo Alto, CA 94304           510-247-1031 home        
========================================================
use "Standard::Disclaimer";
carp("This message was printed on 100% recycled bits.");

Received on Tuesday, 20 May 2003 13:16:35 UTC