A look again at VRA Core

I took an action to look again at VRA core [0] based on our phone
conversation [1] and discussions around Mark's original [2].  This is an
incomplete pass over the VRA spec as I didn't want to dive into detail
before we get past the general principles.  In looking at it, I am taking
the point of view of modelling and reuse.  I am not undertaking a
domain-based analysis of the VRA core.  That requires a domain expert.

See also FRBR [3].

	Andy


Methodology
===========

I read the initial description and each definition of [0].  I ignored the
Dublin Core mappings for the moment.


Namespaces and URIs
===================

1/ The principle in allocating URLs is that the host/path name implies an
authority for a namespace.  We don't control http://www.vraweb.org/ so
either we ask them for a namespace or use our own.  I suggest the latter for
now.  http://web.mit.edu/simile/ would seem to be ours so:

Namespace: http://web.mit.edu/simile/2003/10/vraCore3#

which allows for versioning (we are learning - the vocabulary won't be right
first time).  This is the namespace for the first translation of VRA core
version 3 to RDF.


2/ Controlled terms should have URIs.  Ideally, all controlled terms would
be URIs because they are globally unique names.  The Getty AAT term
"portrait" has a well defined meaning and is different from the orientation
of a sheet of paper.

It might be http://web.mit.edu/simile/2003/10/Getty/AAT#portraits, i.e. in a
place we control rather than presuming on Getty.  We can then annotate the
definition (where it comes from, display name, comments).
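
As a rough sketch of such an annotated term (the choice of rdfs:label,
dc:source and rdfs:comment for the annotations is my assumption, not
something we have agreed):

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix aat:  <http://web.mit.edu/simile/2003/10/Getty/AAT#> .

# A controlled term named by a URI in a namespace we control.
aat:portraits
    rdfs:label   "portraits" ;                            # display name
    dc:source    "Getty Art & Architecture Thesaurus" ;   # where it comes from
    rdfs:comment "The art genre, not the orientation of a sheet of paper." .
```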

Reuse
=====

In thinking about reuse, it seems to be better to pull out all the concepts
mentioned so that they can be referenced later from other corpuses.  

In designing for reuse, there is an infinite regression problem - we can
keep on splitting concepts into smaller and smaller pieces based on ever
more unlikely use cases.  We have to strike a balance.  The principles
I have used are what might be in the mind of the vocabulary designer and
what might be reused by a vocabulary that is not about visual images.  This
is more about Semantic Web issues than the specific demo, but we do aim for
reuse, so restricting the model just because the demo does not need
something is contrary to the goals of SIMILE.


Classes
=======

The concepts-as-classes I came up with are:

@prefix vra: <http://web.mit.edu/simile/2003/10/vraCore3#> .
@prefix person: <http://web.mit.edu/simile/2003/10/person#> .


vra:Record rdf:type rdfs:Class .

seems to have two meanings: a superclass/union of vra:Work and vra:Image, and
a specific metadata record that is about a particular vra:Image or
vra:Work.  If we wish to consider provenance or tie to the History System we
may need both concepts as two separate classes.
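
A minimal sketch of the two-class split, assuming we keep vra:Record for the
superclass meaning.  The names vra:MetadataRecord and vra:describes are
hypothetical, invented here only to show the shape:

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix vra:  <http://web.mit.edu/simile/2003/10/vraCore3#> .

# Meaning 1: the common superclass of works and images.
vra:Record rdf:type rdfs:Class .
vra:Work   rdfs:subClassOf vra:Record .
vra:Image  rdfs:subClassOf vra:Record .

# Meaning 2: a metadata record about one particular work or image.
# (vra:MetadataRecord and vra:describes are hypothetical names.)
vra:MetadataRecord rdf:type rdfs:Class .
vra:describes rdf:type rdf:Property ;
    rdfs:domain vra:MetadataRecord ;
    rdfs:range  vra:Record .
```

Provenance statements would then hang off the vra:MetadataRecord instance
rather than off the work or image itself.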

(Warning : Web Architecture intrudes here!  See [4])

vra:Work rdf:type rdfs:Class .

The work is an abstraction: use a URN (or a URL with a hash on the end).  It
is not retrievable.

vra:Image rdf:type rdfs:Class .

Give this the URL of the primary resource/representation.  Can theoretically
GET this and display a JPEG.
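
To illustrate the naming difference between the two, here is a small sketch.
Both identifiers and the vra:depicts property are made up for the example:

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix vra: <http://web.mit.edu/simile/2003/10/vraCore3#> .

# The work is an abstraction, so it gets a name that is not retrievable.
<urn:x-example:work:1> rdf:type vra:Work .

# The image is named by the URL of its primary representation.
<http://example.org/images/work-1.jpg>
    rdf:type vra:Image ;
    vra:depicts <urn:x-example:work:1> .    # hypothetical property
```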

person:Person

The concept of person - see also FOAF [5,6] and vCard [7].  We need to
record name (structured and display form) as well as the literal keys that
each corpus uses to refer to an individual person.

As other corpuses may not be purely about art work, it is the "person" that
is the concept, not creator (a role they play).  e.g. A book about Leonardo
da Vinci.
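
A sketch of how one person resource might be shared between an art work and
a book about the artist.  The resource URIs and person:corpusKey are
hypothetical; foaf:name, dc:creator and dc:subject are existing terms:

```turtle
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dc:     <http://purl.org/dc/elements/1.1/> .
@prefix foaf:   <http://xmlns.com/foaf/0.1/> .
@prefix person: <http://web.mit.edu/simile/2003/10/person#> .

<http://example.org/people#leonardo>
    rdf:type person:Person ;
    foaf:name "Leonardo da Vinci" ;        # display form
    person:corpusKey "ARTIST-00123" .      # hypothetical: a corpus's literal key

# The person plays the creator role for a work ...
<urn:x-example:work:1> dc:creator <http://example.org/people#leonardo> .

# ... and is the subject of a book in another corpus.
<urn:x-example:book:1> dc:subject <http://example.org/people#leonardo> .
```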

vra:Series

The TITLE term has qualifiers that record that the work/image is part of
something else.  This is best expressed as a separate concept and a
relationship (property) to connect them.

See email [12] for further discussion of "title".
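
As a sketch of the series modelled as its own resource, with a property
linking the work to it.  The property vra:inSeries and the URNs are
hypothetical:

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix vra:  <http://web.mit.edu/simile/2003/10/vraCore3#> .

vra:Series   rdf:type rdfs:Class .
vra:inSeries rdf:type rdf:Property ;      # hypothetical property name
    rdfs:range vra:Series .

# The series carries its own title ...
<urn:x-example:series:fuji>
    rdf:type vra:Series ;
    dc:title "Thirty-six Views of Mount Fuji" .

# ... and the work points at the series instead of borrowing its title.
<urn:x-example:work:great-wave>
    rdf:type vra:Work ;
    dc:title "The Great Wave off Kanagawa" ;
    vra:inSeries <urn:x-example:series:fuji> .
```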

vra:Material

This is the class of things that can be the range of a usesMaterial
relationship.  As we may wish to have information about the material (e.g. a
paint's PANTONE details) we need a first-class concept.  This is of lesser
importance so we may wish to just use the controlled vocabulary literal for
now.  owl:hasValue would enable us to extend later.

vra:Location

Should relate to some geographic information vocabulary such as [8, 9] but
allow for looser descriptions as well.  We may wish to just record literal
values for now.

The LOCATION term needs a time component if our datasets use
Location.FormerSite/Repository, as the value is only meaningful for a time
duration.
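
One possible shape for a former site with a time component, as a sketch
only.  All property names, the place and the years are invented for
illustration:

```turtle
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix vra: <http://web.mit.edu/simile/2003/10/vraCore3#> .

# Invented work, place and years, for illustration only.
<urn:x-example:work:1>
    vra:formerSite [
        vra:place "Example Gallery, Boston" ;    # literal value for now
        vra:from  "1920"^^xsd:gYear ;
        vra:until "1955"^^xsd:gYear
    ] .
```

The intermediate node is what lets the time span attach to the location
statement rather than to the work or the place.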

vra:LargerEntity

This is an undefined but referenced concept.  Rather than a class, I would
model it as a property relationship such as dc:isPartOf (a subproperty of
dc:relation).
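
A sketch of the property form.  Note that the Dublin Core refinement
isPartOf is published in the http://purl.org/dc/terms/ namespace, so I use
the dcterms: prefix here; the two work URNs are made up:

```turtle
@prefix dcterms: <http://purl.org/dc/terms/> .

# A panel that belongs to a larger altarpiece: no LargerEntity class needed.
<urn:x-example:work:panel-3>
    dcterms:isPartOf <urn:x-example:work:altarpiece> .
```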

Dates and Time
==============

This is complicated.  XML Schema has a representation for a time instant
[10].  The RDF calendaring group [11] has looked at it as well but not as a
separate vocabulary.

The VRA qualified term Date.Creation allows a date range.  """Dates may be
expressed as free text or numerical""".
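
One way to carry both forms, sketched with hypothetical property names and
invented dates: typed literals for the numerical range, plus the free text
as given in the source record.

```turtle
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix vra: <http://web.mit.edu/simile/2003/10/vraCore3#> .

# Invented work and dates; all vra: property names here are hypothetical.
<urn:x-example:work:1>
    vra:dateCreation [
        vra:earliest "1503"^^xsd:gYear ;    # numerical start of range
        vra:latest   "1506"^^xsd:gYear ;    # numerical end of range
        vra:dateText "ca. 1503-1506"        # free text as recorded
    ] .
```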

Properties
==========

(This section is just some notes about some properties and will be expanded
as we nail down the concepts.)

Title.Variant and Title.Translation are alternative titles and so can be
subproperties of title.

Title.Series and Title.Larger Entity cannot, as their values are not titles
of the work/image but of the series or larger work.

For example: the title "Lord of the Rings Trilogy" is not a title for the
book "The Return of the King" in the original.

See also email [12].
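
In schema terms, the distinction might look like the sketch below.  The
property names are my own guesses at how the VRA qualifiers would be
spelled as URIs:

```turtle
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix vra:  <http://web.mit.edu/simile/2003/10/vraCore3#> .

vra:title rdf:type rdf:Property .

# Alternative titles of the same work/image: subproperties of title.
vra:titleVariant     rdfs:subPropertyOf vra:title .
vra:titleTranslation rdfs:subPropertyOf vra:title .

# Titles of something else: plain properties, NOT subproperties of title.
vra:titleSeries       rdf:type rdf:Property .
vra:titleLargerEntity rdf:type rdf:Property .
```

With this, an RDFS-aware query for vra:title would also return variants
and translations, but never the series title.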


vra:ID Number

"""A unique identifier assigned to a work or image."""  We should use URIs
as the primary reference to an image or work but need to record the 

This property is like constructing RDF with only bNodes and a well known
property "rdf:hasURI".

We also need to record the identifiers used by each corpus because such
identifiers will often have a local meaning (e.g. database key, file name,
physical location in storage).
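
A sketch of keeping a corpus-local identifier next to the URI.  The
property names, corpus URI and key value are all hypothetical:

```turtle
@prefix vra: <http://web.mit.edu/simile/2003/10/vraCore3#> .

# The URI is the primary reference; the local key is recorded with its corpus.
<urn:x-example:work:1>
    vra:idNumber [
        vra:corpus  <http://example.org/corpus/a> ;   # which corpus uses the key
        vra:localId "ACC-1234"                        # e.g. a database key
    ] .
```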

Relation.Identity
Relation.Type

These are quite complicated and cover many relationships, but the formal
definition is loose, allowing datasets to use it in different ways.  Some
might be best modelled as properties (the type essentially derived from the
property URI and Identity the object of the statement).  Dublin Core picks
out some of these.
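
As a sketch of the property form: a record whose relation type reads
"copy after" and whose identity names another work could become a single
statement.  The property vra:copyAfter and the URNs are hypothetical:

```turtle
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dc:   <http://purl.org/dc/elements/1.1/> .
@prefix vra:  <http://web.mit.edu/simile/2003/10/vraCore3#> .

# Relation.Type becomes the property; tie it back to the generic relation.
vra:copyAfter rdfs:subPropertyOf dc:relation .

# Relation.Identity becomes the object of the statement.
<urn:x-example:work:2> vra:copyAfter <urn:x-example:work:1> .
```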


[0] http://www.vraweb.org/vracore3.htm
[1] http://www.w3.org/2003/10/09-simile-irc#T15-47-32
[2] http://lists.w3.org/Archives/Public/www-rdf-dspace/2003Aug/att-0036/vraCore.rdfs
[3] http://www.oclc.org/research/projects/frbr/
[4] http://www.w3.org/TR/webarch/
[5] http://www.foaf-project.org/
[6] http://xmlns.com/foaf/0.1/ 
[7] http://www.w3.org/TR/vcard-rdf
[8] http://www.w3.org/2003/01/geo/
[9] http://www.daml.org/2001/02/geofile/
[10] http://www.w3.org/TR/xmlschema-2/#dateTime
[11] http://www.w3.org/2002/12/cal/
[12] http://lists.w3.org/Archives/Public/www-rdf-dspace/2003Oct/0024.html

Received on Friday, 10 October 2003 11:14:07 UTC