RE: Representing distinct item states

> I didn't realise it, but from this thought experiment it's 
> clear the history system is a rather advanced use case for 
> these ontology languages. Most of the examples of ontology 
> languages, like the wine ontology, assume the universe 
> described by the model is unchanging. 

Exactemundo!  This is what I was trying to explain to Mick a while ago; I just wasn't very good at arguing it convincingly.

> There are probably some additional relationships we can 
> define here, for example we might want to relate dcq:replaces 
> and dcq:replacedBy to dcq:modificationOf as they are directed 
> versions of this relation, but we'd omit the 
> owl:SymmetricProperty statement. Really it would be nice to 
> have an "upper-level" ontology that built these kind of primitives. 
> 
> Alternatively I think we could express this with ABC, using 
> abc:precedes and abc:follows as they do in the AMOL Vase Example 
> http://metadata.net/harmony/JODI_Final.pdf
> Of course, in a way this is cheating - we are just leaving it 
> to ABC to provide an ontology description rather than coming 
> up with one by hand as above. I think they only provide a 
> description of ABC using RDFS, so it would be an interesting 
> exercise to map ABC into OWL. Any comments on this Dan?

I think ABC offers a reasonable stab at the mechanics we need, if we can simplify things a bit.  Quoted from Jason's reply...

> > This statement would be required for queries that (by default?) would
> > refer to the then-current state of the resource.  I wonder if this
> > complicates search significantly.  Back to the naming v. resolution
> > debate, this technique would also make each revision of a resource
> > resolvable, if supported by DSpace.

I really don't think, given the purpose and use of the History system, that it needs to be optimised for query.  Unlike the rest of DSpace, it's a write-lots, query-not-so-often subsystem.  The important thing is that it has the right data, and has it securely.  We can always build/use tools that can do the indexing, inferring and optimising and point them at the History data later.

Queries like 'Which items in C have S metadata that were not prepared using T' do not, in my opinion, need to be fulfilled responsively, unlike, say, a typical end-user's query of descriptive metadata looking for content.  I also very much doubt that the History system is going to be the place people go to query 'live' or current data to answer typical end-users' queries.

One idea I think some people hold is that DSpace/SIMILE would have one unified RDF store which just contains all the triples.  I'm not sure if I'd agree with this.  There are some wide architectural decisions to be made.  I personally don't think that one unified RDF store against which all queries are made is a viable direction (certainly not at present, anyway); a far more realistic approach is to have 'authoritative' RDF stores, and then a separate 'live' layer (which may or may not be purely RDF-based) that has the data cached in some way optimised for efficient query.
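
To make the split between an 'authoritative' store and a 'live' layer a bit more concrete, here is a rough sketch in Python.  The class names (AuthoritativeStore, LiveIndex) and the sample triples are purely illustrative assumptions, not anything that exists in DSpace or SIMILE; the point is only that the live layer is derived from the authoritative one and can be thrown away and rebuilt.

```python
from collections import defaultdict


class AuthoritativeStore:
    """Hypothetical append-only store of (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = []

    def add(self, subject, predicate, obj):
        # Writes are cheap: append only, no indexing at write time.
        self._triples.append((subject, predicate, obj))

    def triples(self):
        # Return an immutable snapshot so callers cannot alter the record.
        return tuple(self._triples)


class LiveIndex:
    """Hypothetical query-optimised cache derived from an authoritative store."""

    def __init__(self):
        self._by_predicate_object = defaultdict(set)

    def rebuild(self, store):
        # The index is disposable: rebuild it from scratch whenever needed.
        self._by_predicate_object.clear()
        for subject, predicate, obj in store.triples():
            self._by_predicate_object[(predicate, obj)].add(subject)

    def subjects(self, predicate, obj):
        # Constant-time lookup instead of scanning every triple.
        return set(self._by_predicate_object.get((predicate, obj), ()))


store = AuthoritativeStore()
store.add("item:1", "dc:creator", "Smith")
store.add("item:2", "dc:creator", "Smith")
store.add("item:3", "dc:creator", "Jones")

index = LiveIndex()
index.rebuild(store)
smith_items = index.subjects("dc:creator", "Smith")
```

The live layer here happens to be a dictionary, which fits the 'may or may not be purely RDF-based' caveat: it only has to answer the queries end-users actually make.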

In terms of the History system, I think a more viable approach for storing states of things would be to store them as serialised AIPs, making use of the METS AIP work that is to be done soon (hopefully; in terms of the DSpace effort I think that should be its no.1 priority at the moment.)  These METS AIPs could be represented as MD5/SHA-based URIs in the History store model, and since all of the metadata to do with individual items is held in a serialised form in these AIPs, the modelling tasks of the History system would be greatly simplified.
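
As a minimal sketch of what I mean by MD5/SHA-based URIs, assuming the serialised AIP is available as a byte string: hash the bytes and use the digest as the identifier for that state.  The "urn:md5:" / "urn:sha1:" form and the function names below are assumptions made up for illustration; no particular URI scheme has been agreed.

```python
import hashlib


def aip_state_uri(aip_bytes: bytes, algorithm: str = "sha1") -> str:
    """Derive a content-based URI for one serialised AIP.

    The 'urn:<algorithm>:<hex digest>' layout is an illustrative
    assumption, not an agreed DSpace convention.
    """
    digest = hashlib.new(algorithm, aip_bytes).hexdigest()
    return f"urn:{algorithm}:{digest}"


def record_state(history: dict, item_id: str, aip_bytes: bytes) -> str:
    """Append the URI of a new item state to a (hypothetical) history log."""
    uri = aip_state_uri(aip_bytes)
    history.setdefault(item_id, []).append(uri)
    return uri


history = {}
first = record_state(history, "item:1", b"<mets>version 1</mets>")
second = record_state(history, "item:1", b"<mets>version 2</mets>")
```

Identical AIP bytes always give the same URI and any change to the AIP gives a different one, so the History model only needs to relate these URIs to each other rather than model the item metadata itself.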

 Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624

Received on Monday, 19 May 2003 10:48:11 UTC