RE: Representing distinct item states

This is an excellent point, originally brought up during the first telcon to
review the descriptive note.  My recommendation would be to annotate each
revision (state) node URI with a state identifier (version ID).

For example:

hdl:1271.1/5678;1 ----- dc:title -----> My Super Researck Paper

then:

hdl:1271.1/5678;2 ----- dc:title -----> My Super Research Paper

then:

hdl:1271.1/5678 ----- harmony:hasRealization ----> hdl:1271.1/5678;1
                ----- harmony:hasRealization ----> hdl:1271.1/5678;2
hdl:1271.1/5678;1 ----- dcq:isReplacedBy -----> hdl:1271.1/5678;2
hdl:1271.1/5678;2 ----- dcq:replaces -----> hdl:1271.1/5678;1

This is implicit in the current construction of state information for
Harmony structures in the history system.  Like the bitstream IDs that I
reference so often, states are represented by numeric identifiers that do
not provide a navigable (in RDF, anyway) chain of revisions for the metadata
of a DSpace resource.  This maps to Rob's suggestion, but with
hdl:1271.1/5678;1 for XXXXX and hdl:1271.1/5678;2 for YYYYY.  There would
also need to be a statement such as:

hdl:1271.1/5678 ----- owl:sameIndividualAs -----> hdl:1271.1/5678;2 (the
current revision)

This statement would be required for queries that (by default?) refer
to the then-current state of the resource.  I wonder whether this complicates
search significantly.  Returning to the naming vs. resolution debate, this
technique would also make each revision of a resource resolvable, if
DSpace supported it.
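
To make the query concern concrete, here is a minimal sketch of how the
statements above might be queried once collected into one store.  It assumes a
plain in-memory set of (subject, predicate, object) tuples rather than any
real RDF toolkit, and the helper names (objects, current_revision, history,
title_of) are hypothetical, not part of DSpace or the history system.

```python
# Hypothetical in-memory store: a set of (subject, predicate, object) tuples.
ITEM = "hdl:1271.1/5678"
REV1 = ITEM + ";1"
REV2 = ITEM + ";2"

TRIPLES = {
    (REV1, "dc:title", "My Super Researck Paper"),  # original title, typo intended
    (REV2, "dc:title", "My Super Research Paper"),
    (ITEM, "harmony:hasRealization", REV1),
    (ITEM, "harmony:hasRealization", REV2),
    (REV1, "dcq:isReplacedBy", REV2),
    (REV2, "dcq:replaces", REV1),
    (ITEM, "owl:sameIndividualAs", REV2),  # marks the current revision
}


def objects(triples, subject, predicate):
    """Return all objects for a subject/predicate pair, in sorted order."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def current_revision(triples, item):
    """Follow owl:sameIndividualAs from the bare handle, if such a statement exists."""
    found = objects(triples, item, "owl:sameIndividualAs")
    return found[0] if found else None


def history(triples, item):
    """Walk dcq:replaces from the current revision back to the oldest one."""
    chain, seen = [], set()
    node = current_revision(triples, item)
    while node is not None and node not in seen:  # 'seen' guards against cycles
        chain.append(node)
        seen.add(node)
        older = objects(triples, node, "dcq:replaces")
        node = older[0] if older else None
    return chain


def title_of(triples, node):
    """Title of a specific revision, or of the current revision for a bare handle."""
    titles = objects(triples, node, "dc:title")
    if not titles:
        current = current_revision(triples, node)
        titles = objects(triples, current, "dc:title") if current else []
    return titles[0] if titles else None
```

With this layout a default query costs one extra hop (bare handle to current
revision) before the title lookup, which is the complication I was wondering
about.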

Regarding the scope of changes, this is typically addressed by an explicit
operation that the user invokes to annotate a new state.  For example, in a
document management system, I can check out/check in different versions of
an individual document, but the current version of the folder (or compound
document) remains the same.  When I "release" or "publish" a compound
document, a snapshot of state is taken that can later be referenced
explicitly.  This is similar to labeling a revision in source code control.

I would suggest that confining each change to the nearest object in the
graph would be a good approach for any object.  I believe that adding or
removing an Item to/from a Collection would need to be modeled in History,
but it is an open question whether the revision of the Collection would be
incremented in this case.
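
As a rough illustration of containing change to the nearest object, the
sketch below gives each object its own revision counter and event log.  The
Versioned class, its methods, and the bump_collection flag are all
hypothetical; the flag exists only to leave the open question above open.

```python
class Versioned:
    """Hypothetical versioned object: a revision counter plus an event log."""

    def __init__(self, handle):
        self.handle = handle
        self.revision = 1
        self.events = []  # list of (revision, description) pairs

    def record(self, description, bump):
        """Log an event; increment the revision first only when bump is true."""
        if bump:
            self.revision += 1
        self.events.append((self.revision, description))


def edit_title(item, title):
    """A metadata edit changes the Item itself, so the Item's revision increments."""
    item.record(f"title set to {title!r}", bump=True)


def add_item(collection, item, bump_collection=False):
    """Membership is always recorded in the Collection's history.

    Whether the Collection's revision increments is an assumption left to
    the caller, since that question is unresolved.
    """
    collection.record(f"added {item.handle}", bump=bump_collection)
```

In this sketch an edit to an Item never touches the Collection that holds it;
only the explicit add_item call writes to the Collection's log.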

There is a larger question here about referencing any item in DSpace and
whether references always or sometimes point to the current revision or to a
specific revision.  Most commercial systems distinguish these two kinds of
references.
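
A small sketch of the two kinds of references, assuming the ";N" suffix
convention from the examples above.  The function names and the
latest_versions lookup table are hypothetical stand-ins for whatever DSpace
would actually provide.

```python
def parse_reference(ref):
    """Split 'handle;N' into (handle, N); a bare handle yields (handle, None)."""
    item, sep, version = ref.partition(";")
    return item, (int(version) if sep else None)


def resolve(ref, latest_versions):
    """Return a fully qualified 'handle;N' reference.

    A reference with an explicit version is pinned and returned unchanged.
    A bare handle is floating and resolves to whatever latest_versions
    (a hypothetical handle -> version number mapping) says is current.
    """
    item, version = parse_reference(ref)
    if version is None:
        version = latest_versions[item]
    return f"{item};{version}"
```

For example, with latest_versions = {"hdl:1271.1/5678": 2}, the bare handle
resolves to revision 2 while "hdl:1271.1/5678;1" stays pinned to revision 1.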

-Jason

-----Original Message-----
From: www-rdf-dspace-request@w3.org
[mailto:www-rdf-dspace-request@w3.org]On Behalf Of Tansley, Robert
Sent: Friday, May 16, 2003 4:37 PM
To: (www-rdf-dspace@w3.org)
Subject: Representing distinct item states



I think there's a big missing piece from the history discussion and model.
It's to do with representing different states of Items (and other objects.)

To illustrate it, consider a super-simple example of a DSpace Item,
hdl:1271.1/5678, which has its dc:title fixed by an administrator.  Now, the
history data should NOT contain (as I believe it currently does) this
triple:

hdl:1271.1/5678   ----- dc:title ----->  My Super Researck Paper

because, later, when the administrator corrects the title, the following
triple will be deposited:

hdl:1271.1/5678   ----- dc:title ----->  My Super Research Paper

This wouldn't do, because when the data is collected into a unified store
for query, there would be no way of telling with which state each dc:title
is associated.  (It's by virtue of the statements being in separate files
that one can tell right now.)

We need a way of representing the item in a particular state (i.e. at a
particular moment in time.)  I'm not sure which part of the Harmony ABC
model would best be used to do this; the examples in
http://www.metadata.net/harmony/JODI_Final.pdf don't quite fit.  One
possibility, in the context of the Harmony ABC model, is to let a DSpace
Item be a 'Work', with 'Manifestations' being individual states of that
DSpace Item.  Then, part of the data in the history system when the
administrator corrects the DSpace Item would look like this:

hdl:1271.1/5678  ---- harmony:hasRealization ---->  XXXXX
XXXXX  ----- dc:title ----->  My Super Researck Paper

hdl:1271.1/5678  ---- harmony:hasRealization ---->  YYYYY
YYYYY  ----- dc:title ----->  My Super Research Paper

Now the relevant ABC bits and bobs can relate XXXXX and YYYYY, explaining
how we started with XXXXX and arrived at YYYYY.  (Maybe there's a more
appropriate part of the model than Work/Manifestation but this demonstrates
what is missing from the current model.)

Other gnarly questions are:

* Do individual Bitstreams and Bundles have separate states, or are they
married to Item states?  For example, if I change what is in a Bundle, does
that Bundle get its own states and transitions modelled, or do we just deal
with those events at the Item level?

* If a new Item is added to a Collection, does that mean the Collection has
changed and is now in a different state?  Ditto Community/Collection, ditto
Community/Collection/Item.  Gnarliest of gnarly things, if I change a
Bitstream that's in a Bundle that's in an Item that's in a Collection that's
in a Community, does that mean the Item's changed?  Does that Item changing
mean the Collection's changed?  Does that Collection changing then mean the
Community's changed?

I think we really need some example walkthroughs of typical events, with
graphs along the lines of Appendices C-F of
http://www.metadata.net/harmony/JODI_Final.pdf.  The way the work is
currently being discussed feels to me like we're examining individual trees
but have no idea what the forest looks like.

 Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624

Received on Saturday, 17 May 2003 12:05:56 UTC