RE: DSpace History White Paper - Comment Deadline

Hi Jason,

I have a few comments; some of these are perhaps responses to the learnings rather than comments on the document as such.

Section 2.1 History System Data Integrity:  If I understand this correctly, the history system only stores the previous state of (e.g.) an Item, rather than its current state?  E.g. if I have an item 1271.1/1234 in state X, and then do something (e.g. add a bitstream) which puts it in state Y, only state X will be stored in the history system?  If that is the case, it doesn't seem to be reflected in the sample output figures at the end of the document.  It would be good to show the link from the snapshot stored in the history to the 'live' version of the item (i.e. the link from the existential actuality to the universal actuality).
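To make that concrete, here's a rough sketch of the kind of link I mean — every URI and predicate name here is invented for illustration, not taken from the white paper or the DSpace schema:

```python
# Sketch: linking a history snapshot back to the 'live' item.
# All URIs and the "dspace:snapshotOf" predicate are hypothetical.

LIVE_ITEM = "hdl:1271.1/1234"                     # the 'live' (universal) item
SNAPSHOT = "hdl:1271.1/1234/history/state-X"      # archived state X (existential)

# Triples represented as (subject, predicate, object) tuples:
triples = [
    # The snapshot records state X, captured before the bitstream was added.
    # This first triple is the missing link I'd like to see in the figures:
    (SNAPSHOT, "dspace:snapshotOf", LIVE_ITEM),
    (SNAPSHOT, "dc:date", "2003-11-21T08:17:00Z"),
]

# With that link in place, any snapshot can be navigated to the current version:
live = [o for (s, p, o) in triples if s == SNAPSHOT and p == "dspace:snapshotOf"]
print(live)  # → ['hdl:1271.1/1234']
```
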

Also on the topic of "data integrity": maybe this wasn't in the scope of your work, but there's a question in my mind as to how appropriate the current history system is for meeting auditing or 'non-repudiation' requirements.  Having a database-backed RDF store feels rather volatile and easy to change.  Additionally, RDF doesn't feel to me to be very amenable to checksumming, timestamping, or other secure/robust integrity-checking mechanisms.  I'm not sure whether it's in scope for your white paper, but this is something that warrants discussion.
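By way of illustration, making a record tamper-evident is straightforward once you have a canonical serialisation to hash — the difficulty with RDF is that a bag of triples has no inherent canonical order.  A sketch (the record fields are made up; this is not part of the current history system):

```python
# Sketch: tamper-evident history records via checksumming (hypothetical).
# The trick is a *canonical* serialisation, so the digest is reproducible.
import hashlib
import json

record = {
    "subject": "hdl:1271.1/1234",
    "event": "add-bitstream",
    "previous_state": "X",
}

# json.dumps with sort_keys=True gives a deterministic byte string to hash;
# a raw RDF triple set has no such built-in ordering, which is the problem.
canonical = json.dumps(record, sort_keys=True).encode("utf-8")
digest = hashlib.sha256(canonical).hexdigest()

# Any later modification to the stored record invalidates the digest:
tampered = dict(record, previous_state="Y")
tampered_digest = hashlib.sha256(
    json.dumps(tampered, sort_keys=True).encode("utf-8")
).hexdigest()
print(digest != tampered_digest)  # → True
```
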

I also don't see anything in the document about what might happen when e.g. an Item from Collection X is added to Collection Y.  How did you decide how/whether that is reflected in the history data?

I agree with your conclusion that integration points with the content management piece are very important.  (Though you do seem to contradict this in section 3.4.3, where you say that through using RDF "the resulting architecture is very loosely coupled".)  Also, capturing 'high-level' context (e.g. a large transformation effort that happens over a few sessions/days/weeks, but that is part of a funded project) is very difficult.  This is outside the present scope of the content management component of DSpace, so being able to capture it has wide ramifications.

Although I agree that the "job performed by the subsystem is simple," I don't think the way it does it is simple!  Any comments on whether choosing such a complex model as ABC was a good choice, and whether something simpler could fulfil the requirements of the History system?  You do mention that the "flexibility provided by the model does make it difficult to choose a strategy to extend the model for a specific application."

I'd like to see some discussion of how much information actually needs to be part of the History data.  (Again, this might be out of scope for your white paper.)  At the moment, the history store contains e.g. an Item's Dublin Core fields as separate triples, and also the types of all these objects.  The history data and system could be made a lot simpler by just storing serialised snapshots of the state of objects as single resources in the model (e.g. as METS files).  This would greatly reduce the amount of data in the actual History model, and hence make the model simpler and easier to understand.  I think the model is so complex right now that providing a useful UI to the data that e.g. a librarian can use will be quite a challenge.
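To illustrate the contrast I have in mind — this is purely schematic, and the predicates and file names are invented, not the actual DSpace model:

```python
# Sketch contrasting the two approaches to history data (hypothetical).

# (a) Current approach: each Dublin Core field becomes its own triple,
#     plus typing triples for every object in the model:
fine_grained = [
    ("hdl:1271.1/1234", "dc:title",   "Some title"),
    ("hdl:1271.1/1234", "dc:creator", "A. Author"),
    ("hdl:1271.1/1234", "dc:date",    "2003-11-21"),
    ("hdl:1271.1/1234", "rdf:type",   "dspace:Item"),
    # ...and so on for every field of every related object
]

# (b) Suggested alternative: the whole state as one serialised snapshot
#     (e.g. a METS file), referenced as a single resource in the model:
snapshot = [
    ("hdl:1271.1/1234", "history:hasSnapshot", "snapshots/1234-state-X.mets"),
]

# The model itself shrinks to one triple per event; the detail lives
# in the serialised snapshot, outside the triple store.
print(len(fine_grained), len(snapshot))
```
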

 Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624 

-----Original Message-----
From: www-rdf-dspace-request@w3.org [mailto:www-rdf-dspace-request@w3.org] On Behalf Of Jason Kinner
Sent: 21 November 2003 08:17
To: www-rdf-dspace@w3.org
Subject: DSpace History White Paper - Comment Deadline


All -

If you have any comments on the DSpace history white paper, could you please get them to me by noon EST on Monday?

Thanks,

-Jason

Received on Friday, 21 November 2003 15:25:13 UTC