Concept Scheme Versioning


I'm having a problem with the SKOS Core Guide recommendation for Concept Scheme Versioning [1].  The Guide recommends (maybe that's a little strong, since this section is under the Open Issues heading) that each expression (version) of the controlled vocabulary have its own skos:ConceptScheme, and that each concept in the controlled vocabulary use a skos:inScheme property to state which skos:ConceptScheme it belongs to.  The URI for each skos:ConceptScheme must be unique, and the example indicates that each skos:Concept URI must be the same across expressions of the controlled vocabulary.

Where I'm having a problem, when I try to apply SKOS to controlled vocabularies, is in two areas: RDF and URIs.  The skos:ConceptScheme part is fine; my problem is with the use of skos:inScheme for a skos:Concept and with the concept's URI.  Each term in a controlled vocabulary is defined by its scope notes and hierarchical relationships.  Let's say that I have a thesaurus that is issued yearly for the years 2003, 2004, and 2005.  As part of my SKOS conversion, I decide to put each version in a separate document instance.  In each of the document instances, e.g., 2003.skos, 2004.skos, 2005.skos, I create a skos:ConceptScheme element with a URI that is unique across all document instances.  This is straightforward.

However, when it comes to defining the concepts in each of the document instances, I start to have some concerns.  The Guide indicates that a common URI should be used for a concept across all document instances.  My understanding of RDF is that when each of my document instances is loaded into an RDF triple store, the statements about skos:Concept elements that share a URI will be merged into a single description of one resource.  My dilemma is that using a non-versioned URI for the skos:Concept elements will merge the concepts from every version.  This may not be what you want.
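
To make the merge concern concrete, here is a minimal sketch that models an RDF graph as a plain Python set of (subject, predicate, object) tuples rather than using a real triple store.  The example.org URIs and the scope note strings are made up for illustration; only the SKOS namespace and property names come from the Guide.

```python
# Hypothetical URIs and scope notes, for illustration only.
SKOS = "http://www.w3.org/2004/02/skos/core#"
CONCEPT = "http://example.org/thesaurus/concept/001"

# One graph per document instance; the concept URI is the same in both.
graph_2003 = {
    (CONCEPT, SKOS + "inScheme", "http://example.org/thesaurus/2003"),
    (CONCEPT, SKOS + "scopeNote", "Any kind of vehicle."),
}
graph_2005 = {
    (CONCEPT, SKOS + "inScheme", "http://example.org/thesaurus/2005"),
    (CONCEPT, SKOS + "scopeNote", "Motor vehicles only."),
}

# Merging graphs that contain no blank nodes is a set union of triples.
merged = graph_2003 | graph_2005


def objects(graph, subject, predicate):
    """Return the sorted objects of all triples matching subject and predicate."""
    return sorted(o for s, p, o in graph if s == subject and p == predicate)


scope_notes = objects(merged, CONCEPT, SKOS + "scopeNote")
schemes = objects(merged, CONCEPT, SKOS + "inScheme")
print(scope_notes)
print(schemes)
```

After the union, the single concept carries both scope notes and belongs to both schemes, which is the situation the numbered cases below worry about.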

(1) Let's say that in the 2003 version of the thesaurus you define a concept with a certain scope, but later, in the 2005 version, you restrict the scope.  When RDF merges the concepts from the 2003 and 2005 document instances, you will have conflicting scope notes.  What happens when someone in 2003 assigns this concept to a resource and someone else processes it in 2005 against the 2005 version?  Since the scope changed but the URI is the same in the 2003 and 2005 versions, how does the person processing it against the 2005 version know whether the resource is in scope or out of scope?

(2) Let's say that in the 2003 version of the thesaurus you define a concept with a certain scope, but later, in the 2005 version, the concept is removed.  When RDF merges the concepts from the 2003 and 2005 document instances, you *may* have conflicting properties, depending upon how you indicated the removal of the concept in the 2005 version.

(3) Let's say that in the 2003 version of the thesaurus you define a concept with a certain scope, but later, in the 2005 version, you change the hierarchical relationships.  When RDF merges the concepts from the 2003 and 2005 document instances, you have a mixture of scope notes and relationship properties, e.g., NT, BT, RT, etc.  How do you know what the true definition of the concept is?
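
Case (3) can be sketched by treating each document instance as a plain Python set of (subject, predicate, object) tuples.  This is only an illustration: the example.org namespace and the concept names are hypothetical, and a real triple store may offer named graphs or other provenance mechanisms that this sketch ignores.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
BASE = "http://example.org/thesaurus/"  # hypothetical namespace
CAR = BASE + "concept/car"              # same concept URI in every version

# 2003: "car" has the broader term "vehicle".
v2003 = {
    (CAR, SKOS + "inScheme", BASE + "2003"),
    (CAR, SKOS + "broader", BASE + "concept/vehicle"),
}
# 2005: the hierarchy changed; "car" now sits under "motor-vehicle".
v2005 = {
    (CAR, SKOS + "inScheme", BASE + "2005"),
    (CAR, SKOS + "broader", BASE + "concept/motor-vehicle"),
}

merged = v2003 | v2005


def describe(graph, subject):
    """Group the objects of each predicate for one subject."""
    description = {}
    for s, p, o in graph:
        if s == subject:
            description.setdefault(p, set()).add(o)
    return description


description = describe(merged, CAR)
for predicate, values in sorted(description.items()):
    print(predicate, sorted(values))
```

Both broader terms and both skos:inScheme values now hang off the one concept URI, and nothing in the merged triples says which skos:broader statement came from which version.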

It seems to me that you really want each concept, in each expression (version) of the thesaurus, to have a version-specific URI.  Granted, there will be common concepts between expressions of a thesaurus that are defined with the same scope and relationships, but there should be some mechanism, possibly OWL, to indicate which ones are the same or different and exactly how they are different.
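
As a rough sketch of what I have in mind, again modelling graphs as Python sets of tuples: the URI pattern, the helper functions, and the revisionOf predicate are all hypothetical, not something SKOS or OWL defines.  owl:sameAs is a real OWL property, but whether it is the right way to link unchanged concepts is exactly the kind of question I am asking.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
BASE = "http://example.org/thesaurus/"               # hypothetical namespace
REVISION_OF = "http://example.org/vocab#revisionOf"  # hypothetical predicate


def scheme_uri(year):
    """URI of the concept scheme for one yearly version (assumed pattern)."""
    return f"{BASE}{year}"


def concept_uri(year, local_id):
    """Version-specific concept URI (assumed pattern)."""
    return f"{BASE}{year}/concept/{local_id}"


def concept_triples(year, local_id, scope_note):
    """Triples describing one concept within one version."""
    c = concept_uri(year, local_id)
    return {
        (c, SKOS + "inScheme", scheme_uri(year)),
        (c, SKOS + "scopeNote", scope_note),
    }


merged = (
    concept_triples(2003, "vehicle", "Any kind of vehicle.")
    | concept_triples(2004, "vehicle", "Any kind of vehicle.")
    | concept_triples(2005, "vehicle", "Motor vehicles only.")
)
# Unchanged between 2003 and 2004: state that explicitly.
merged.add((concept_uri(2004, "vehicle"), OWL_SAME_AS, concept_uri(2003, "vehicle")))
# Scope restricted in 2005: link the versions without claiming identity.
merged.add((concept_uri(2005, "vehicle"), REVISION_OF, concept_uri(2004, "vehicle")))


def scope_notes(graph, subject):
    """Sorted scope notes attached directly to one concept URI."""
    return sorted(o for s, p, o in graph if s == subject and p == SKOS + "scopeNote")


print(scope_notes(merged, concept_uri(2003, "vehicle")))
print(scope_notes(merged, concept_uri(2005, "vehicle")))
```

With version-specific URIs the merged graph keeps one scope note per concept per version, and the relationship between versions is an explicit statement rather than a side effect of sharing a URI.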

I would appreciate people's comments on whether I have misunderstood something, and on whether my concerns are valid or, if not, why not.


Thanks, Andy.


[1] http://www.w3.org/TR/2005/WD-swbp-skos-core-guide-20050510/#secschemeversioning

Received on Friday, 15 July 2005 15:06:11 UTC