Comparing versions of SKOS terminologies

When a new version of, say, a thesaurus is published, users are interested in "What's new?" and "What has changed?". I'm currently racking my brain about this. Has anyone solved the supposedly simple problem of comparing two versions of a SKOS file, and the obviously not-so-simple one of formatting the output in a way that is intelligible?

When it comes to diffing RDF files, some solutions are listed at http://www.w3.org/2001/sw/wiki/How_to_diff_RDF. The simplest way I found was using rdf.sh (https://github.com/seebi/rdf.sh), which simply runs a system diff on sorted .nt files produced by rapper. (You need to filter out blank nodes here, but this shouldn't be much of a problem with SKOS files.) Using git diff as the diff tool, this gives me a stat of something like "7443 insertions(+), 6937 deletions(-)" (on the two most recent versions of STW Thesaurus for Economics).
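To illustrate what such a triple-level diff amounts to, here is a minimal Python sketch. It is not what rdf.sh does internally; it only assumes two N-Triples files with one triple per line (as rapper writes them) and compares them as sets, skipping triples that involve blank nodes. The function names (parse_nt_line, load_triples, diff_triples) are made up for this example, and the line splitting is deliberately naive rather than a full N-Triples parser.

```python
def parse_nt_line(line):
    """Split one N-Triples line into (subject, predicate, object), or return None.

    Naive on purpose: assumes one triple per line with whitespace-separated
    subject and predicate; everything after the predicate is the object.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None  # blank line or comment
    if line.endswith("."):
        line = line[:-1].rstrip()  # drop the terminating " ."
    parts = line.split(None, 2)
    if len(parts) != 3:
        return None
    return tuple(parts)


def load_triples(lines):
    """Collect triples from an iterable of lines, ignoring blank-node triples."""
    triples = set()
    for line in lines:
        triple = parse_nt_line(line)
        if triple is None:
            continue
        subject, _, obj = triple
        if subject.startswith("_:") or obj.startswith("_:"):
            continue  # blank node labels are not stable across versions
        triples.add(triple)
    return triples


def diff_triples(old_lines, new_lines):
    """Return (added, removed) triples as sorted lists."""
    old, new = load_triples(old_lines), load_triples(new_lines)
    return sorted(new - old), sorted(old - new)
```

With two files opened as old_file and new_file, diff_triples(old_file, new_file) yields the inserted and deleted triples; their lengths correspond to the insertion/deletion counts of the textual diff, as long as the files contain no duplicate lines.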

Obviously, this triple-level diff doesn't help users much. A possible course of action could be:

1) Group changes for each concept.
2) Recognize insertion and deletion of concepts as a whole (presumably the most important changes).
3) Recognize certain types of changes (e.g., altered prefLabel, added altLabel, changed relations).
4) Enrich the concept URIs with the preferred label (in a given language).
5) Arrange everything nicely on an RDFa overview page (additions/deletions of concepts, perhaps some of the more important types of changes, statistics such as the number of changed/unchanged concepts, etc.).
6) Provide change record (RDFa) pages per concept, which can be linked from a concept page.
7) Optionally, if the terminology includes meta-structures such as a term classification, add aggregated information about the most intensively changed subject areas to the overview page.
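Steps 1) to 4) could be sketched roughly as follows in Python. This is only an illustration under several assumptions: each version is already available as a set of (subject, predicate, object) tuples in N-Triples notation, concepts are typed explicitly as skos:Concept, and a property counts as "changed" whenever it has both added and removed values on the same concept, which is a crude heuristic. The function names and the shape of the report are invented for this example.

```python
from collections import defaultdict

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
SKOS = "http://www.w3.org/2004/02/skos/core#"
CONCEPT = "<" + SKOS + "Concept>"
PREF_LABEL = "<" + SKOS + "prefLabel>"


def concepts(triples):
    """Subjects explicitly typed as skos:Concept."""
    return {s for s, p, o in triples if p == RDF_TYPE and o == CONCEPT}


def pref_labels(triples, lang):
    """Map concept URI to its preferred label text in the given language."""
    suffix = "@" + lang
    return {s: o[1:-len(suffix) - 1]  # strip the quotes and the language tag
            for s, p, o in triples
            if p == PREF_LABEL and o.endswith(suffix)}


def local_name(predicate):
    """Part after '#', e.g. 'prefLabel' for SKOS properties."""
    return predicate.strip("<>").rsplit("#", 1)[-1]


def classify(added, removed):
    """Step 3: name the kinds of change from added/removed (predicate, object) pairs."""
    added_p = {local_name(p) for p, _ in added}
    removed_p = {local_name(p) for p, _ in removed}
    kinds = [p + " changed" for p in sorted(added_p & removed_p)]
    kinds += [p + " added" for p in sorted(added_p - removed_p)]
    kinds += [p + " removed" for p in sorted(removed_p - added_p)]
    return kinds


def summarize(old, new, lang="en"):
    """Steps 1, 2 and 4: group per concept, detect whole concepts, attach labels."""
    old_c, new_c = concepts(old), concepts(new)
    labels = {**pref_labels(old, lang), **pref_labels(new, lang)}
    changes = defaultdict(lambda: {"added": [], "removed": []})
    for s, p, o in new - old:
        changes[s]["added"].append((p, o))
    for s, p, o in old - new:
        changes[s]["removed"].append((p, o))
    report = {
        "added_concepts": sorted(new_c - old_c),
        "deleted_concepts": sorted(old_c - new_c),
        "changed_concepts": {},
    }
    for c in sorted(old_c & new_c):
        if c in changes:
            report["changed_concepts"][c] = {
                "label": labels.get(c),
                "kinds": classify(changes[c]["added"], changes[c]["removed"]),
            }
    return report
```

The resulting dictionary could then feed the overview and per-concept pages of steps 5) and 6); the aggregation by subject area in step 7) is not covered here.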

Thoughts? Has somebody done something similar already?

Cheers, Joachim

--
Joachim Neubert

ZBW - German National Library of Economics
Leibniz Information Centre for Economics
Neuer Jungfernstieg 21 
20354 Hamburg

Received on Tuesday, 27 August 2013 17:34:03 UTC