Re: Comparing versions of SKOS terminologies

On 27/08/13 20:33, Neubert Joachim wrote:
> When a new version of, say, a thesaurus is published, users are interested in "What's new?" and "What has changed?". I'm currently racking my brain about this. Has anyone solved the supposedly simple problem of comparing two versions of a SKOS file, and the obviously not-so-simple one of formatting the output in a way that is intelligible?
>
> When it comes to diffing RDF files, there are some solutions listed in http://www.w3.org/2001/sw/wiki/How_to_diff_RDF. The simplest way I found was using rdf.sh (https://github.com/seebi/rdf.sh), which simply system-diffs sorted .nt files produced by rapper. (You need to filter out blank nodes here, but this shouldn't be much of a problem with SKOS files.) Using git diff as a diff tool, this gives me a stat of something like "7443 insertions(+), 6937 deletions(-)" (on the two most recent versions of STW Thesaurus for Economics).
>
> Obviously, this triple-level diff doesn't help users much. A possible course of action could be:
>
> 1) Group changes for each concept.
> 2) Recognize insertion and deletion of concepts as a whole (presumably the most important changes).
> 3) Recognize certain types of changes (e.g., altered prefLabel, added altLabel, changed relations).
> 4) Enrich the concept URIs with the preferred label (in a given language).
> 5) Arrange everything nicely on an RDFa overview page (additions/deletions of concepts, perhaps some of the more important types of changes, statistics such as the number of changed/unchanged concepts, etc.)
> 6) Provide change record (RDFa) pages per concept, which can be linked from a concept page.
> 7) Optionally, if the terminology includes meta-structures such as a term classification, add aggregated information about the most intensively changed subject areas to the overview page.
>
> Thoughts? Has somebody done something similar already?

Hi Joachim!

The MUTU tool [1] developed within the FinnONTO project does something 
pretty similar - it compares lightweight ontologies (structurally very 
similar to SKOS) and outputs a human-readable report of their 
differences. The intended use case may be slightly different from your 
scenario - it has been used for collaborative development of linked 
ontologies where changes in the top-level ontology (YSO) affect the 
domain-specific ontologies. However, the questions "what's new" and 
"what has changed" are pertinent in MUTU as well.

Alas, the tool has not been released for public use, but at least there 
are some publications (see [1]). I'm also cc'ing Sini, who could tell 
you more about the tool and perhaps share some insight.
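
As a rough illustration of steps 1-3 in your list (this is not how MUTU
works, just a minimal sketch): the Python below assumes both versions
have already been loaded as sets of (subject, predicate, object) string
tuples, for example from the sorted .nt files. The names diff_concepts
and classify are made up for this example, and the prefLabel check
ignores language tags, which a real report would have to respect.

```python
from collections import defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SKOS = "http://www.w3.org/2004/02/skos/core#"


def concepts(triples):
    """Return the subjects explicitly typed as skos:Concept."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == SKOS + "Concept"}


def diff_concepts(old, new):
    """Group a triple-level diff by concept (hypothetical helper).

    Concepts present in only one version are reported as a whole
    (step 2); for concepts present in both, the added and removed
    (predicate, object) pairs are collected per concept (step 1).
    """
    old, new = set(old), set(new)
    old_c, new_c = concepts(old), concepts(new)
    kept = old_c & new_c
    changed = defaultdict(lambda: {"added": [], "removed": []})
    for s, p, o in sorted(new - old):
        if s in kept:
            changed[s]["added"].append((p, o))
    for s, p, o in sorted(old - new):
        if s in kept:
            changed[s]["removed"].append((p, o))
    return {
        "inserted": sorted(new_c - old_c),
        "deleted": sorted(old_c - new_c),
        "changed": dict(changed),
    }


def classify(change):
    """Name a few coarse change types for one concept (step 3)."""
    added = {p for p, _ in change["added"]}
    removed = {p for p, _ in change["removed"]}
    kinds = []
    # Simplification: a prefLabel removed and one added counts as "altered",
    # without checking that both labels are in the same language.
    if SKOS + "prefLabel" in added and SKOS + "prefLabel" in removed:
        kinds.append("altered prefLabel")
    if SKOS + "altLabel" in added:
        kinds.append("added altLabel")
    relations = {SKOS + "broader", SKOS + "narrower", SKOS + "related"}
    if relations & (added | removed):
        kinds.append("changed relations")
    return kinds
```

Calling diff_concepts(old_triples, new_triples) and then classify() on
each entry under "changed" would give the raw material for an overview
page; looking up labels for the concept URIs (step 4) and the actual
formatting are left out here.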

-Osma


[1] http://www.seco.tkk.fi/tools/mutu/

-- 
Osma Suominen
Information Systems Specialist
National Library of Finland
P.O. Box 26 (Teollisuuskatu 23)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suominen@helsinki.fi
http://www.nationallibrary.fi

Received on Wednesday, 28 August 2013 08:05:29 UTC