RE: Scheme versioning & change management

Glad to see this issue getting aired. I'm not going to comment on the
options offered, but just raise the questions "Why does it matter?" "Why
would anyone want to know?" And I'll offer at least a couple of answers.

1. On encountering a thesaurus or portion of a thesaurus, one often
wants to know if this represents the current version. One wants to be
quite sure that these are the terms now considered valid for use -
either for indexing or for searching. In this circumstance, typically
one wants to identify which version the portion belongs to, and secondly
whether a later version exists. Depending on the findings, one might
then want to turn to the later version.

2. A slight variation on the above is, having found a thesaurus or
portion thereof, simply to identify which version it belongs to - either
the version number or the dates within which that version was current.
This need might occur in a housekeeping context rather than an
indexing/searching application.

3. A different context is when one is doing retrospective searches. One
wants to find the version of the thesaurus that was valid in a certain
date range, so as to identify the appropriate search terms in that
period. An alternative approach may be to ignore the version(s) but just
to check on particular thesaurus terms and their history notes - during
what period were Terms A and B valid, and before then what was the next
best thing?  I should confess that only a perfectionist searcher
actually does this, once in a blue moon; most people don't bother even
looking up the thesaurus in the first place. But they ought to. 

There may be other good reasons for wanting to check the versioning. But
the ones above already present several variables - are you doing it in
the course of vocabulary management, or in the course of running an
application for indexing/retrieval? And then, do the publishers allow
access to any version other than the current one? Do they have several
simultaneously online, or perhaps just the current one, with (if you
are very lucky) a downloadable text file of the archival versions?
Different authorities take different views as to what is best.

Sorry - no solutions there, just complications. But I hope it is useful
to clarify the need before trying to satisfy it.
Cheers
Stella

*****************************************************
Stella Dextre Clarke
Information Consultant
Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
SDClarke@LukeHouse.demon.co.uk
*****************************************************



-----Original Message-----
From: public-esw-thes-request@w3.org
[mailto:public-esw-thes-request@w3.org] On Behalf Of Miles, AJ
(Alistair) 
Sent: 10 August 2004 16:23
To: 'public-esw-thes@w3.org'
Subject: Scheme versioning & change management



Hi all,

A request has come to me about how to handle periodic releases of a
thesaurus encoded in SKOS/RDF.  I do think we need to have at least a
basic framework of recommendations to handle this type of scenario
(which is very common), so have started writing up some ideas and
surveying some possibilities for support for existing vocabs ...

(This is pasted from
<http://esw.w3.org/topic/SkosDev/SkosCore/SchemeVersioning>)


1 Vocabulary Support and Conventions for Scheme Versioning and Change
Management

Versioning and change management is a vital issue. There are a number of
vocabularies and de facto conventions that support bits and pieces of
what's needed ... let's see if we can bring it all together into a
coherent and fairly complete framework ... 

(Basically this is a survey of change management and versioning features
from various vocabs, with some suggested usage scenarios and
applications, and also some new suggested terms ... all hypothetical,
nothing more than suggestions).

1.1 Versioning and Management of Concept Schemes

Scenario: An authority owns and manages a vocabulary. Although the
vocabulary is continuously evolving, the authority periodically releases
versions (snapshots) for their user community to work to. 

In this scenario I suggest the convention that a URI be defined to refer
to the scheme, and that separate URIs be defined to refer to each
version of the scheme. E.g. (trivial example) ...

http://example.org/myScheme 
http://example.org/mySchemeVersion1 
http://example.org/mySchemeVersion2 
http://example.org/mySchemeVersion3 

However, the base URI for all concept identifiers should not be altered
between versions. 

I.e. a set of concepts is defined and published. These concepts are
members of the base vocabulary. Additionally, they may or may not be
members of versions of the vocabulary, e.g. (examples in RDF+Turtle)
...

@prefix ex: <http://example.org/>. 
@prefix skos: <http://www.w3.org/2004/02/skos/core#>. 
 
ex:conceptA  skos:inScheme  ex:myScheme; 
             skos:inScheme  ex:mySchemeVersion1; 
             skos:inScheme  ex:mySchemeVersion2. 
 
ex:conceptB  skos:inScheme  ex:myScheme; 
             skos:inScheme  ex:mySchemeVersion3. 

The DCTerms vocabulary <http://dublincore.org/documents/dcmi-terms/> has
properties that allow you to express the relationship between the base
vocabulary and vocabulary versions ...

@prefix ex: <http://example.org/>. 
@prefix dct: <http://purl.org/dc/terms/>. 
 
ex:myScheme     dct:hasVersion  ex:mySchemeVersion1; 
                dct:hasVersion  ex:mySchemeVersion2; 
                dct:hasVersion  ex:mySchemeVersion3. 
 
ex:mySchemeVersion1     dct:isVersionOf ex:myScheme. 
 
ex:mySchemeVersion2     dct:isVersionOf ex:myScheme. 
 
ex:mySchemeVersion3     dct:isVersionOf ex:myScheme. 

The OWL vocabulary <http://www.w3.org/TR/owl-guide/> has properties
that allow you to express relationships between scheme versions ... 

@prefix ex: <http://example.org/>. 
@prefix owl: <http://www.w3.org/2002/07/owl#>. 
 
ex:mySchemeVersion3     owl:priorVersion        ex:mySchemeVersion2. 
 
ex:mySchemeVersion2     owl:priorVersion        ex:mySchemeVersion1. 

OWL also has an annotation property allowing you to describe version
information as prose ... 

@prefix ex: <http://example.org/>. 
@prefix owl: <http://www.w3.org/2002/07/owl#>. 
 
ex:mySchemeVersion3  owl:versionInfo
     '''The following concepts have been added: x y z.
      The following concepts have been deprecated: a b c. etc. ...'''.

DCTerms also has some basic properties allowing you to state the dates
at which a resource was created, issued (i.e. published) and modified
... 

@prefix ex: <http://example.org/>. 
@prefix dct: <http://purl.org/dc/terms/>. 
 
ex:mySchemeVersion3     dct:created     '2003-06-20'; 
                        dct:issued      '2003-08-04'; 
                        dct:modified    '2004-06-26'. 
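
As a further sketch only: DCTerms also defines dct:valid, for the date
(often a range) during which a resource is valid, which might be used to
record the period in which each version was current. The version 2 dates
below are invented for illustration, and the 'start=...; end=...;'
notation (in the style of the DCMI Period encoding) is an assumption
rather than an agreed convention ...

@prefix ex: <http://example.org/>.
@prefix dct: <http://purl.org/dc/terms/>.

# Hypothetical validity periods; version 2 is taken to have been
# superseded when version 3 was issued.
ex:mySchemeVersion2     dct:valid       'start=2002-05-01; end=2003-08-03;'.

# No end date: version 3 is assumed to be the current version.
ex:mySchemeVersion3     dct:valid       'start=2003-08-04;'.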

1.2 Management of Concepts

Usually in the thesaurus world concepts go through a lifecycle in
relation to the schemes in which they are members: they begin as
candidates, then they are full members, and finally they may be dropped
(deprecated). 

Supporting this style of concept management could be a requirement for
SKOS Core. To offer a suggestion, one way of doing this would be to
extend the skos:inScheme property, as in e.g. ...

@prefix ex: <http://example.org/>. 
@prefix skos: <http://www.w3.org/2004/02/skos/core#>. 
@prefix poss: <http://example.org/skoscoresuggestions#>. 
 
ex:conceptA  poss:candidateInScheme  ex:mySchemeVersion1. 
 
ex:conceptA  skos:inScheme  ex:mySchemeVersion2. 
 
ex:conceptA  poss:deprecatedInScheme  ex:mySchemeVersion3. 

There is also the vocab status vocabulary
<http://www.w3.org/2003/06/sw-vocab-status/ns#> which allows you to
express the stability of an individual term in an RDF vocabulary (as one
of 'stable', 'testing' and 'unstable'). This could also be used for SKOS
concept schemes, as in e.g. ...

@prefix ex: <http://example.org/>. 
@prefix vs: <http://www.w3.org/2003/06/sw-vocab-status/ns#>. 
 
ex:conceptA     vs:term_status  'stable'.
 
ex:conceptB     vs:term_status  'testing'.
 
ex:conceptC     vs:term_status  'unstable'.

... although how this should be combined with the
candidate/member/deprecated scheme status values (if they are used) is
not clear. 

The vocab status vocabulary also has another property, vs:moreinfo which
is designed to point to some prose describing the status of the term
further. 
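
As an illustration only, vs:moreinfo might be used alongside
vs:term_status as follows. The target URL is a made-up placeholder for a
page of status notes, not a real resource ...

@prefix ex: <http://example.org/>.
@prefix vs: <http://www.w3.org/2003/06/sw-vocab-status/ns#>.

# The object of vs:moreinfo is a hypothetical page of prose notes.
ex:conceptB     vs:term_status  'testing';
                vs:moreinfo     <http://example.org/statusNotes#conceptB>.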

As a convention for managing URIs for concepts, I suggest that once a
concept URI has been published, preference should always be given to
deprecating and replacing with a new concept, rather than altering the
concept. 

Usually, when a concept is dropped from a scheme, another concept or
combination of concepts is added to replace it. 

Where one concept has been replaced by another, the DCTerms vocab has
some properties that allow this relationship to be expressed e.g. ... 

@prefix ex: <http://example.org/>. 
@prefix dct: <http://purl.org/dc/terms/>. 
 
ex:conceptA     dct:isReplacedBy        ex:conceptC. 
 
ex:conceptC     dct:replaces            ex:conceptA. 

Where a concept has been replaced by a combination of concepts, some new
vocabulary may be required. I can imagine two cases: 

1.	where a concept should be replaced in metadata by EITHER one or
other of the targets. 
2.	where a concept should be replaced in metadata by BOTH of the
targets. 

... we could invent some new vocab to cover these. The interesting use
scenario here is, if replacement rules are expressed in RDF, then
automated tools can be written to update metadata repositories, or at
least provide support to humans in that work.
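
To make the two cases concrete, here is one possible sketch. The
property names poss:isReplacedByAnyOf and poss:isReplacedByAllOf are
invented purely for illustration (nothing like them is defined in SKOS
Core or DCTerms), as are the concepts X, Y and D to G ...

@prefix ex: <http://example.org/>.
@prefix poss: <http://example.org/skoscoresuggestions#>.

# Case 1: replace conceptX in metadata with EITHER conceptD OR conceptE.
ex:conceptX     poss:isReplacedByAnyOf  ex:conceptD;
                poss:isReplacedByAnyOf  ex:conceptE.

# Case 2: replace conceptY in metadata with BOTH conceptF AND conceptG.
ex:conceptY     poss:isReplacedByAllOf  ex:conceptF;
                poss:isReplacedByAllOf  ex:conceptG.

With rules of this kind, a tool could probably apply case 2
automatically, whereas case 1 would still need a human (or some further
rule) to choose between the targets.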

---
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email:        a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440

Received on Tuesday, 10 August 2004 20:05:12 UTC