Re: Original History Implementation Considerations from John S. Erickson on 2003-05-23 (www-rdf-dspace@w3.org from May 2003)

From: John S. Erickson <john.erickson@hp.com>
Date: Fri, 23 May 2003 11:30:39 -0400
To: <www-rdf-dspace@w3.org>
Message-ID: <006e01c32140$45a6d1d0$7394190f@johnse3>

Mick passed along the History System Requirements document...

Mick, thanks for sending this out --- hopefully, it will help scope the
current effort.

An observation: the search for storage and operational efficiency for the
History System might give us the long-term answer to the "what should be
named" question, esp. if one of the long-term applications of the History
System is to facilitate state recovery.

Imagine the following operational policy:

* Every state of an item will be assigned a unique, persistent name

The implication of this is that an "item" is potentially a hierarchy (more
accurately, a network of relationships); while the higher-level name of the
item remains constant, each version of the item will be named and related to
the higher-level item.

This could be implemented in several ways; one is for the "high-level" item to
encapsulate all versions (i.e. aggregating a bunch of "version" properties
only), or alternatively also carrying the properties of the current version
(good ontology design would not allow a "versions" aggregator, akin to
collections in OOP; this is where object design and ontology design depart).

Note that this would accommodate fairly complex transitions unambiguously. For
example, when the first new version is created, a couple things happen:

* a new handle is created; the contents of the old handle record (properties
of the old version) are copied to the new handle record (this could be done by
default to ensure robustness of the edit operation). Handles to child objects
are preserved: the old state is immutable

* the old handle record is modified to include new handles to the new versions
of changed child objects

* a property pointing to the newly created old version is appended
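
The three steps above can be sketched roughly as follows. This is only an
illustration under stated assumptions: the in-memory Registry, the
HandleRecord type, mint_handle and the "hdl:example" prefix are hypothetical
names, not part of DSpace or of any real handle service.

```python
import itertools
from dataclasses import dataclass, field

# Hypothetical stand-in for a persistent naming service.
_counter = itertools.count(1)


def mint_handle(prefix="hdl:example"):
    """Return a new unique name (assumed format, for illustration only)."""
    return f"{prefix}/{next(_counter)}"


@dataclass
class HandleRecord:
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # handles of child objects
    versions: list = field(default_factory=list)  # handles of earlier states
    frozen: bool = False                          # True for immutable old states


class Registry:
    def __init__(self):
        self.records = {}

    def create(self, properties, children=()):
        handle = mint_handle()
        self.records[handle] = HandleRecord(dict(properties), list(children))
        return handle

    def new_version(self, item, new_properties, replaced_children):
        """Apply a change to `item`; return the handle naming its old state."""
        current = self.records[item]

        # Step 1: mint a new handle and copy the old record into it.
        # Child handles are carried over unchanged; the copy is frozen.
        old = mint_handle()
        self.records[old] = HandleRecord(
            dict(current.properties), list(current.children), frozen=True
        )

        # Step 2: modify the record behind the constant higher-level name,
        # swapping in the handles of the new versions of changed children.
        current.properties.update(new_properties)
        current.children = [replaced_children.get(c, c) for c in current.children]

        # Step 3: append a property pointing at the newly created old version.
        current.versions.append(old)
        return old
```

The higher-level name never changes here; only the record behind it does,
and each earlier state stays reachable through the "versions" list.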

Where am I going with this: in preserving state and a record of state change,
the goal should be to leverage the assumed archival efficiency of the storage
layer. The History System should only worry about referencing uniquely named
states in its change records, and about recording those changes explicitly.

The History System would then unambiguously keep track of state changes by
recording, at the lowest level (like a change manifest), what I just described.
And of course, if there are higher-level semantics associated with these state
changes, they will decorate the record (for example, free-text explanation of
reason for change, etc, etc). Reference also my previous notes about richer
object models for curatorial activities.
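
One possible shape for such a change-manifest entry is sketched below. It
assumes that the states before and after a change each have a name; the
ChangeRecord and History classes and all field names are hypothetical, chosen
only to show a low-level record decorated with optional higher-level
semantics.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ChangeRecord:
    item: str                  # constant higher-level name of the item
    old_state: str             # name of the state before the change
    new_state: str             # name of the state after the change
    child_changes: tuple = ()  # (old_child_name, new_child_name) pairs
    reason: str = ""           # optional free-text explanation
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


class History:
    """Append-only log that only references named states; it stores no content."""

    def __init__(self):
        self._records = []

    def record(self, item, old_state, new_state, child_changes=(), reason=""):
        rec = ChangeRecord(item, old_state, new_state, tuple(child_changes), reason)
        self._records.append(rec)
        return rec

    def states_of(self, item):
        """Names of all states recorded for `item`, in order of appearance."""
        names = []
        for rec in self._records:
            if rec.item == item:
                for name in (rec.old_state, rec.new_state):
                    if name not in names:
                        names.append(name)
        return names
```

Because each entry holds only names, recovering a state is a matter of
resolving a name against the storage layer rather than replaying content
held in the history itself.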

Diffs between objects would be within an application layer sitting above all
of this.
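
As a rough illustration of that layering, an application could compute a
diff purely from the properties of two named states, with no help from the
History System. The function name and the flat-dictionary representation of
a state's properties are assumptions made for this sketch.

```python
def diff_properties(old, new):
    """Compare the property dictionaries of two states of an item."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if old[k] != new[k]
    }
    return {"added": added, "removed": removed, "changed": changed}
```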

Mick mentioned hashes: hashes could be collected within the history record, but
I don't think they are a reasonable alternative to unique identifiers (except
in the case that they *are* the local UID mechanism).
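
For the exceptional case where hashes *are* the local UID mechanism, a
minimal sketch might derive a state's name from its content. The canonical
JSON form, the choice of SHA-256 and the "sha256:" prefix are all
assumptions made for illustration.

```python
import hashlib
import json


def content_uid(properties, children=()):
    """Derive a name for a state from its properties and child names."""
    canonical = json.dumps(
        {"properties": properties, "children": sorted(children)},
        sort_keys=True,
        separators=(",", ":"),
    )
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under this scheme two states with identical content get the same name, which
is one reason a hash alone may not serve as a general-purpose identifier.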

John

Received on Friday, 23 May 2003 11:32:27 UTC