RE: Representing distinct item states

Our discussions of the History System have focused heavily on metadata
modeling and on tracking changes in metadata.  I have heard the History
System referred to as a "metadata repository," as a "provenance record,"
and even as "The RDF Repository" for DSpace.  What the History System
does, however, is generate metadata based on events that occur within
DSpace.  History is not intended to capture metadata changes, but it is
required to generate metadata that will record various changes to the
items, bundles, and bitstreams in DSpace.  In this context, it may need
to track the changes in its own metadata representation of these objects
(which may include user metadata).  Also, from this perspective one does
not "store metadata" in History, but rather History stores generated
metadata in X (X being the file system in the current implementation,
and Jena in the next).

MacKenzie made an excellent point about policy and its importance to the
History System in the last PI call.  If we consider History as a
metadata generator, then these (missing) policies can define some of the
behavior of History and can answer some of the more recent questions
posed on this list.  In other words, it is a policy decision whether
adding a bitstream to a bundle changes the containing item.  In most use
cases that I can think of, maximum utility is achieved by not
propagating changes out of the local object, but I can only be confident
enough about this decision to make it the default behavior.  Another
policy decision may be whether to record the replacement of a bitstream
with a new bitstream in the same bundle that has the same content.
Depending on the application, someone may or may not need this
information.
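
To make the policy idea concrete, here is a minimal sketch in Python
(chosen only for brevity).  It is not DSpace code: HistoryPolicy,
HistoryRecorder, the event tuples, and the checksum comparison are all
hypothetical names and assumptions made for illustration.  It shows the
two policy decisions above as switches, with non-propagation as the
default.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical event shape: (action, object type, object id).
Event = Tuple[str, str, str]


@dataclass(frozen=True)
class HistoryPolicy:
    """Illustrative policy switches; not part of any real DSpace API."""
    # Does a change to a bundle also count as a change to its item?
    propagate_to_item: bool = False
    # Record a replacement even when the new content is identical?
    record_identical_replacement: bool = False


@dataclass
class HistoryRecorder:
    """Generates history events from repository events, guided by policy."""
    policy: HistoryPolicy
    events: List[Event] = field(default_factory=list)

    def _bundle_changed(self, item_id: str, bundle_id: str) -> None:
        # The local object is always recorded as modified.
        self.events.append(("modified", "bundle", bundle_id))
        # The containing item is recorded only if policy says so.
        if self.policy.propagate_to_item:
            self.events.append(("modified", "item", item_id))

    def bitstream_added(self, item_id: str, bundle_id: str) -> None:
        self._bundle_changed(item_id, bundle_id)

    def bitstream_replaced(self, item_id: str, bundle_id: str,
                           old_checksum: str, new_checksum: str) -> None:
        # Assumption: equal checksums stand in for "same content".
        same_content = old_checksum == new_checksum
        if same_content and not self.policy.record_identical_replacement:
            return  # policy says this replacement is not worth recording
        self._bundle_changed(item_id, bundle_id)


# Default policy: changes stay local to the bundle.
local = HistoryRecorder(HistoryPolicy())
local.bitstream_added("item-1", "bundle-1")
local.bitstream_replaced("item-1", "bundle-1", "abc", "abc")

# Alternative policy: propagate to the item and record every replacement.
strict = HistoryRecorder(HistoryPolicy(propagate_to_item=True,
                                       record_identical_replacement=True))
strict.bitstream_added("item-1", "bundle-1")
strict.bitstream_replaced("item-1", "bundle-1", "abc", "abc")
```

With the default policy the recorder holds a single bundle event; with
the alternative policy the same two calls yield bundle and item events
for both the addition and the replacement.  The point is only that the
behavior lives in the policy object and not in the event handlers.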

Now, all this is outside the scope of my present efforts in "modernizing"
the History System, but I believe it is all instructive for the modeling
requirements of generated metadata.  It is also encouraging to
me that Harmony ABC and the other root schemas that I have been modeling
against could support multiple policy decisions.

Jason Kinner
Dynamic Digital Media, LLC
856.296.5711 (mobile)
215.243.7377 (phone)
http://www.dynamicdigitalmedia.com

-----Original Message-----
From: www-rdf-dspace-request@w3.org
[mailto:www-rdf-dspace-request@w3.org] On Behalf Of MacKenzie Smith
Sent: Monday, May 26, 2003 10:11 PM
To: www-rdf-dspace@w3.org
Subject: RE: Representing distinct item states


OK, I finally caught up on this thread (only a week behind -- not so
bad) and here are a couple of gentle reality checks for you all to
consider. I also realize this is coming too late for the current
descriptive note, but that's ok.

I think we have to remember what it is we're using this here
History system for. It's the content, not the metadata (and don't
tell me that metadata is content -- you know what I mean).

-- No one who actually manages archives expects to track
changes to the metadata over time. In traditional library/information
management systems we keep logs around to track metadata
changes temporarily, but it's just not considered important to the
core mission of managing the *content* over time... as you've all
noted, schemas change, contexts change, resources get
described in myriad ways (all at the same time), people make
mistakes, fix them, we add stuff, we remove stuff, and so what.

I also think it's very unlikely that systems like Haystack would
really want access to History data. It's really hard to imagine a
case where that would be interesting, except to a curator, who
will presumably have other tools to query the data.

-- So I think Rob's right -- we need some use cases for our use
case. I'm going to start poking around the archivists to see what
kinds of information they track over time about the analog resources
they're managing now. I suspect it's quite minimal -- you need to
know if a thing (manifestation, edition, pick your favorite ontology)
has been copied, changed, reformatted, etc., when, how, and by
whom.

The only reason to store metadata in the History system at all
is to know what you're looking at when you're examining the
provenance of a piece of content. It would be nice if that metadata
were current and correct, but it doesn't seem that critical.

And what will you do if there are three or four metadata records
(i.e. items) describing the same bitstream(s)? Something we know
is going to happen over time, as content gets reused and
repurposed.

To recap: it's not the metadata we're stewarding and preserving
here, it's the content, which happens to have some useful
metadata associated with it to allow users to discover it, and
managers to manage it.

Apologies again for the late reply,

MacKenzie


MacKenzie Smith
Associate Director for Technology
MIT Libraries
Building 14S-208
77 Massachusetts Avenue
Cambridge, MA  02139
(617)253-8184
kenzie@mit.edu

Received on Tuesday, 27 May 2003 00:13:57 UTC