
RE: Representing distinct item states

From: Jason Kinner <jason_kinner@dynamicdigitalmedia.com>
Date: Sat, 17 May 2003 12:10:59 -0400
To: "www-rdf-dspace" <www-rdf-dspace@w3.org>
Message-ID: <DJEBJNMLACLGDFEJMPDLOECPCDAA.jason_kinner@dynamicdigitalmedia.com>

This is an excellent point, originally brought up during the first telcon to
review the descriptive note.  My recommendation would be to annotate each
revision (state) node URI with a state identifier (version ID).

For example:

hdl:1271.1/5678;1 ----- dc:title -----> My Super Researck Paper

then:

hdl:1271.1/5678;2 ----- dc:title -----> My Super Research Paper

then:

hdl:1271.1/5678 ----- harmony:hasRealization ----> hdl:1271.1/5678;1
                ----- harmony:hasRealization ----> hdl:1271.1/5678;2
hdl:1271.1/5678;1 ----- dcq:isReplacedBy -----> hdl:1271.1/5678;2
hdl:1271.1/5678;2 ----- dcq:replaces -----> hdl:1271.1/5678;1

This is implicit in the current construction of state information for
Harmony structures in the history system.  Like the bitstream IDs that I
reference so often, states are represented by numeric identifiers that do
not provide a navigable (in RDF, anyway) chain of revisions for the metadata
of a DSpace resource.  This maps to Rob's suggestion, but with
hdl:1271.1/5678;1 for XXXXX and hdl:1271.1/5678;2 for YYYYY.  There would
also need to be a statement such as:

hdl:1271.1/5678 ----- owl:sameIndividualAs -----> hdl:1271.1/5678;2

(where hdl:1271.1/5678;2 is the current revision)

This statement would be required for queries that (by default?) would refer
to the then-current state of the resource.  I wonder if this complicates
search significantly.  Returning to the naming vs. resolution debate, this
technique would also make each revision of a resource resolvable, if
supported by DSpace.
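
To make the query concern concrete, here is a minimal sketch of how a
default query might resolve the unversioned handle to its current state and
how the revision chain could be walked.  The in-memory set of tuples and the
helper names (objects, current_state, revision_chain, current_title) are
purely illustrative assumptions, not part of DSpace or the history system.

```python
# Hypothetical in-memory triple store: (subject, predicate, object) tuples.
ITEM = "hdl:1271.1/5678"

triples = {
    (ITEM + ";1", "dc:title", "My Super Researck Paper"),
    (ITEM + ";2", "dc:title", "My Super Research Paper"),
    (ITEM, "harmony:hasRealization", ITEM + ";1"),
    (ITEM, "harmony:hasRealization", ITEM + ";2"),
    (ITEM + ";1", "dcq:isReplacedBy", ITEM + ";2"),
    (ITEM + ";2", "dcq:replaces", ITEM + ";1"),
    (ITEM, "owl:sameIndividualAs", ITEM + ";2"),
}


def objects(subject, predicate):
    """Return all objects for a subject/predicate pair, sorted for stability."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def current_state(item):
    """Follow owl:sameIndividualAs from the unversioned handle, if present."""
    targets = objects(item, "owl:sameIndividualAs")
    return targets[0] if targets else None


def revision_chain(first_state):
    """Walk dcq:isReplacedBy links forward from a given state."""
    chain = [first_state]
    while True:
        successors = objects(chain[-1], "dcq:isReplacedBy")
        if not successors or successors[0] in chain:  # end of chain, or a cycle
            break
        chain.append(successors[0])
    return chain


def current_title(item):
    """Answer a 'default' query against the then-current state of the item."""
    state = current_state(item)
    titles = objects(state, "dc:title") if state else []
    return titles[0] if titles else None
```

Every default query pays for one extra hop (handle to current state), which
is the kind of added search cost I was wondering about above.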

Regarding the scope of changes, this is typically addressed by an explicit
operation that the user invokes to annotate a new state.  For example, in a
document management system, I can check out/check in different versions of
an individual document, but the current version of the folder (or compound
document) remains the same.  When I "release" or "publish" a compound
document, a snapshot of state is taken that can later be referenced
explicitly.  This is similar to labeling a revision in source code control.
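
As a rough sketch of that check-in/release distinction (the class and method
names here are my own invention for illustration, not taken from any
particular document management product):

```python
from dataclasses import dataclass, field


@dataclass
class CompoundDocument:
    # Current version number of each member document, keyed by name.
    members: dict = field(default_factory=dict)
    # Explicit snapshots: label -> frozen copy of the member versions.
    releases: dict = field(default_factory=dict)

    def check_in(self, name):
        """Bump one member's version; the compound document is not re-versioned."""
        self.members[name] = self.members.get(name, 0) + 1
        return self.members[name]

    def release(self, label):
        """Take a snapshot of the current member versions under a label."""
        if label in self.releases:
            raise ValueError(f"release {label!r} already exists")
        self.releases[label] = dict(self.members)
        return self.releases[label]
```

Checking in a member after a release changes the working state only; the
labeled snapshot keeps pointing at the versions that were current when it
was taken.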

I would suggest containing each change to the nearest object in the graph,
whatever the type of object.  I believe that adding an Item to, or removing
one from, a Collection would need to be modeled in History, but it is an open
question whether the revision of the Collection would be incremented in this
case.
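
One way to picture this containment rule, with the open question left as an
explicit switch.  This is only a sketch: Node, modify, add_member and the
bump_container flag are hypothetical names, and the history list stands in
for whatever the History system actually records.

```python
# Hypothetical event log standing in for the History system.
history = []


class Node:
    """Any DSpace-like object (Community, Collection, Item, Bundle, ...)."""

    def __init__(self, name):
        self.name = name
        self.revision = 1
        self.members = []


def modify(node, event):
    """A change to an object bumps only that object's own revision."""
    node.revision += 1
    history.append((node.name, node.revision, event))


def add_member(container, member, bump_container=False):
    """Membership changes are always logged; re-versioning is a policy choice."""
    container.members.append(member)
    if bump_container:  # the open question: does the Collection get a new revision?
        container.revision += 1
    history.append((container.name, container.revision, f"added {member.name}"))
```

With bump_container left off, adding an Item shows up in the history but the
Collection stays at the same revision, and a later change to the Item never
touches the Collection at all.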

There is a larger question here about referencing any item in DSpace and
whether references always or sometimes point to the current revision or to a
specific revision.  Most commercial systems distinguish these two kinds of
references.

-Jason

-----Original Message-----
From: www-rdf-dspace-request@w3.org
[mailto:www-rdf-dspace-request@w3.org]On Behalf Of Tansley, Robert
Sent: Friday, May 16, 2003 4:37 PM
To: (www-rdf-dspace@w3.org)
Subject: Representing distinct item states



I think there's a big missing piece from the history discussion and model.
It has to do with representing different states of Items (and other objects).

To illustrate it, consider a super-simple example of a DSpace Item,
hdl:1271.1/5678, which has its dc:title fixed by an administrator.  Now, the
history data should NOT contain (as I believe it currently does) this
triple:

hdl:1271.1/5678   ----- dc:title ----->  My Super Researck Paper

because, later, when the administrator corrects the title, the following
triple will be deposited:

hdl:1271.1/5678   ----- dc:title ----->  My Super Research Paper

This wouldn't do, because when the data is collected into a unified store
for query, there would be no way of telling with which state each dc:title
is associated.  (It's by virtue of the statements being in separate files
that one can tell right now.)
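
A small sketch of the problem, assuming the two history files are simply
loaded as sets of (subject, predicate, object) tuples and unioned (the
variable names are illustrative only):

```python
ITEM = "hdl:1271.1/5678"

# Each history file, taken alone, describes the item in exactly one state.
file_before_fix = {(ITEM, "dc:title", "My Super Researck Paper")}
file_after_fix = {(ITEM, "dc:title", "My Super Research Paper")}

# Collecting both into a unified store discards the file boundary.
unified = file_before_fix | file_after_fix

# The item now appears to have two titles, with nothing to order them.
titles = sorted(o for s, p, o in unified if s == ITEM and p == "dc:title")
```

After the union, titles holds both strings, and nothing in the store says
which one belongs to the earlier state and which to the later one.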

We need a way of representing the item in a particular state (i.e. at a
particular moment in time.)  I'm not sure which part of the Harmony ABC
model would best be used to do this; the examples in
http://www.metadata.net/harmony/JODI_Final.pdf don't quite fit.  One
possible way to consider is, in the context of the Harmony ABC model, to let
a DSpace Item be a 'Work'.  'Manifestations' are individual states of that
DSpace Item.  Then, part of the data in the history system when the
administrator corrects the DSpace Item would look like this:

hdl:1271.1/5678  ---- harmony:hasRealization ---->  XXXXX
XXXXX  ----- dc:title ----->  My Super Researck Paper

hdl:1271.1/5678  ---- harmony:hasRealization ---->  YYYYY
YYYYY  ----- dc:title ----->  My Super Research Paper

Now the relevant ABC bits and bobs can relate XXXXX and YYYYY, explaining
how we started with XXXXX and arrived at YYYYY.  (Maybe there's a more
appropriate part of the model than Work/Manifestation but this demonstrates
what is missing from the current model.)

Other gnarly questions are:

* Do individual Bitstreams and Bundles have separate states, or are they
married to Item states?  For example, if I change what is in a Bundle, does
that Bundle get its own states and transitions modelled, or do we just deal
with those events at the Item level?

* If a new Item is added to a Collection, does that mean the Collection has
changed and is now in a different state?  Ditto Community/Collection, ditto
Community/Collection/Item.  Gnarliest of gnarly things, if I change a
Bitstream that's in a Bundle that's in an Item that's in a Collection that's
in a Community, does that mean the Item's changed?  Does that Item changing
mean the Collection's changed?  Does that Collection changing then mean the
Community's changed?

I think we really need some example walkthroughs of typical events, with
graphs along the lines of Appendices C-F of
http://www.metadata.net/harmony/JODI_Final.pdf.  The way the work is
currently being discussed feels to me like we're examining individual trees
but have no idea what the forest looks like.

 Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624
Received on Saturday, 17 May 2003 12:05:56 EDT
