Comments on DSpace History System: RDF Schema Design

Hi,

Here are my initial comments on the RDF schema design doc.  I should be able to make the call tomorrow a.m. after all.  Sorry if the notes below are rambly and confusing, but this stuff is so massively complicated that it might take me a long time to get together a more polished version of the comments.  Hopefully I can make things clear in the call.


* Has anyone checked whether the standard DSpace DC type registry includes element/qualifiers that aren't represented in http://dublincore.org/documents/dcmi-terms/?  Given that the DC type registry is configurable, can we rely on/require that everything in the type registry does come from the list at that URL?


* EPerson Groups:  I've thought about the EPerson Group thing.  There are two approaches to modelling EPerson groups.  One is to have an EPersonGroup class that is related to EPerson objects via a 'hasPart' property.  This is more consistent with the way the rest of the system is modelled.  The other is to just represent EPersonGroups as rdf:Bag containers (ordering etc. is not important).  Maybe the first would be better for consistency.
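
To make the two EPerson group options concrete, here is a rough sketch in Python, with triples as plain (subject, predicate, object) tuples rather than a real RDF library.  The 'ex:' identifiers and the helper function names are made up for illustration; only 'hasPart', EPersonGroup and rdf:Bag (with its rdf:_1, rdf:_2, ... membership properties) come from the discussion above.

```python
# Triples are plain (subject, predicate, object) tuples.
# All "ex:" identifiers and function names are hypothetical.
RDF_TYPE = "rdf:type"


def group_as_class(group, members):
    """Approach 1: an EPersonGroup resource linked to each EPerson via hasPart."""
    triples = [(group, RDF_TYPE, "EPersonGroup")]
    triples += [(group, "hasPart", member) for member in members]
    return triples


def group_as_bag(group, members):
    """Approach 2: the group is an rdf:Bag using rdf:_1, rdf:_2, ... properties."""
    triples = [(group, RDF_TYPE, "rdf:Bag")]
    triples += [(group, f"rdf:_{i}", member)
                for i, member in enumerate(members, start=1)]
    return triples


def members_of(triples, group):
    """Recover the members from either form as a set (order is not significant)."""
    return {o for s, p, o in triples
            if s == group and (p == "hasPart" or p.startswith("rdf:_"))}
```

Both forms carry the same membership information; the difference is that the first uses the same class/property style as the rest of the model, while the second relies on the RDF container vocabulary.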


* Figure 2: DSpace Object Model

Collection class:  approver, reviewer and (missing) editors may or may not be present.  The range of each of these is EPersonGroup.  Perhaps more importantly, the names 'approver', 'reviewer' and 'editor' might not be ideal.  Internally in the system, the three workflow steps are just called workflow step 1, workflow step 2 etc.  The fact that workflow step 1 = 'reviewer' is really just a UI thing.  As the workflow system evolves, the number of steps may increase and the names may change.  Maybe it would be better to use 'workflowStep1' as a property name, or perhaps 'workflowStep' with some literal property indicating the step number (not sure how you'd do this in RDF?)

On the other hand, as long as the use of approver, reviewer and editor is well defined, given that the schema will have to be revised to accommodate future workflow schemes, perhaps schema versioning means those names will be OK to use.
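
One possible way to do the 'workflowStep plus step number' idea, sketched in Python with triples as plain tuples:  the Collection points at an intermediate node for each step, and that node carries the step number as a literal plus the EPersonGroup that performs the step.  This is only an assumption about how it could be modelled, not something from the design doc; 'stepNumber', 'performedBy', the 'ex:' identifiers and the function names are all invented for the example.

```python
# Triples are plain (subject, predicate, object) tuples.
# 'stepNumber', 'performedBy' and all "ex:" identifiers are hypothetical.

def workflow_step_triples(collection, step_number, group):
    """Describe one workflow step via an intermediate node.

    The node label is just an opaque string standing in for a blank node;
    it is built from the collection and step number to keep it unique.
    """
    node = f"_:{collection}/step{step_number}"
    return [
        (collection, "workflowStep", node),
        (node, "stepNumber", step_number),   # literal step number
        (node, "performedBy", group),        # an EPersonGroup
    ]


def group_for_step(triples, collection, step_number):
    """Return the group attached to the given step, or None if there is none."""
    step_nodes = {o for s, p, o in triples
                  if s == collection and p == "workflowStep"}
    for node in step_nodes:
        if (node, "stepNumber", step_number) in triples:
            for s, p, o in triples:
                if s == node and p == "performedBy":
                    return o
    return None
```

With this shape, adding a fourth step or renaming 'reviewer' in the UI would not need a new property in the schema, at the cost of an extra node per step.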

Item class:  'accessioned' is part of the Dublin Core metadata; is it in the Item class as an example or because it is somehow different from the other DC metadata?

Why does the Item class have a property 'reviewer'?  Aren't reviewers Agents that are associated with an event, rather than a property of the Item?  Items in the DSpace database do have 'submitter' properties, but not 'reviewer'.

I don't think it's really appropriate to have WorkflowItems and WorkspaceItems with 'isPartof' properties indicating the Collection.  They have a target collection, but they are not 'in' that collection yet since they haven't been approved or accessioned.  I don't know if an existing ABC or DC property would cover this.

Bitstream: missing 'size' and 'user format description' properties (see DSpace RDBMS schema)


* 3.3.3 History System Elements:

accessioned, author:  These are Dublin Core fields; shouldn't they be in that namespace?

copyright:  Copyright is a property of Collection, not Item. It's just some text that's displayed on the Collection home page in the UI.

hasApprover, hasReviewer: as explained above, isn't the participation of epeople in workflow expressed in the ABC agent/action/event model?  Why do we need these properties?

generated/hasGenerator:  I don't understand why this is here.

introduction:  I assume this corresponds to 'introductory_text' from the DSpace DB schema; why aren't the same names being used?

license:  This is a property of Collection, which is the text of the license that submitters to a collection must grant in order to complete the submission process.  Licenses relating to individual items are held in Bitstreams within those Items.

firstname/lastname are used for epeople names, 'name' is a property of Community and Collection only.

provenance:  In the case of an Item, provenance is held within the Dublin Core metadata.  'provenance' is a property with range literal for Communities and Collections.


* Appendix A: Example

The examples are rather simple; I think we need some more complex ones.  E.g. what happens if a Bundle is removed?  I think this will raise some issues not addressed anywhere in the current document.


 Robert Tansley / Hewlett-Packard Laboratories / (+1) 617 551 7624

Received on Tuesday, 3 June 2003 17:50:58 UTC