Re: Failing to meet integrity constraint S14 when terminology evolves

Hi Osma,

In my group, we are considering formalizing the terminology as
version-specific, e.g.
 Version-2013
   icd10gm2013:K12.23 a skos:Concept;
           skos:prefLabel "Wangenabszess"@de;
           skos:altLabel "Wangenabszeß"@de.
 and
 Version-2014
   icd10gm2014:K12.23 a skos:Concept;
           skos:prefLabel "Wangenabszess 2014"@de;
           skos:altLabel "Wangenabszess"@de;
           skos:altLabel "Wangenabszeß"@de.

At the same time, we are reluctant to do so because the ICD10 GM
terminology publishes a new version each year, and there are only minor
differences between versions. Formalizing each version separately seems
like overkill.
We are now considering either formalizing the terminology as
version-specific or using rdfs:label instead of skos:prefLabel.
The purpose of my email is to ask the community for suggestions and
comments, and to check whether there are better solutions.

PS: I have read your paper; it is a very interesting one. I referenced it
when writing the document on my SKOS mapping validation rules :)
http://arxiv.org/ftp/arxiv/papers/1310/1310.4156.pdf

kind regards,
Hong




From:   Osma Suominen <osma.suominen@helsinki.fi>
To:     Hong Sun/AXIFX/AGFA@AGFA, public-esw-thes@w3.org
Date:   11/13/2013 11:08 AM
Subject:        Re: Failing to meet integrity constraint S14 when 
terminology evolves



Hi Hong!

Ah, I see. You are worrying about old versions of the vocabulary 
sticking around on the web, and their combination violating the 
constraint.

I don't think SKOS S14 was specified with this kind of scenario in mind. 
I think it only makes sense to interpret it in the context of a single 
version at a time. (The same applies for skosxl:prefLabel, as discussed 
in another subthread)
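
To illustrate the per-version reading, here is a minimal Python sketch. It is
not Skosify code: the tuple representation of triples and the helper name
s14_violations are made up for illustration. It collects the distinct
skos:prefLabel values per concept and language tag, first for each version
on its own and then for the merged data from Hong's example:

```python
from collections import defaultdict

PREF = "skos:prefLabel"


def s14_violations(triples):
    """Return {(concept, lang): labels} where a concept has more than one
    distinct prefLabel for the same language tag (hypothetical helper)."""
    labels = defaultdict(set)
    for subject, predicate, (text, lang) in triples:
        if predicate == PREF:
            labels[(subject, lang)].add(text)
    return {key: sorted(vals) for key, vals in labels.items() if len(vals) > 1}


# The two versions from the example, as (subject, predicate, (text, lang)).
v2013 = [
    ("icd10gm:K12.23", "skos:prefLabel", ("Wangenabszess", "de")),
    ("icd10gm:K12.23", "skos:altLabel", ("Wangenabszeß", "de")),
]
v2014 = [
    ("icd10gm:K12.23", "skos:prefLabel", ("Wangenabszess 2014", "de")),
    ("icd10gm:K12.23", "skos:altLabel", ("Wangenabszess", "de")),
    ("icd10gm:K12.23", "skos:altLabel", ("Wangenabszeß", "de")),
]

print(s14_violations(v2013))          # each version alone satisfies S14
print(s14_violations(v2014))
print(s14_violations(v2013 + v2014))  # only the merge has two prefLabels
```

Checked one version at a time, neither dataset reports a violation; only the
merged list does, which is the situation described above.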

In general, if you merge old and new versions of an RDF dataset, and 
this breaks something (could be SKOS ICs, OWL axioms or whatever), then 
I think this is mostly your problem. Data from the web cannot in general 
be assumed to be well-formed. There have been many papers by e.g. Hogan 
and others demonstrating that RDF data from the web is generally pretty 
bad. Our paper about SKOS vocabulary quality contains some examples, and 
references to earlier studies on RDF data quality:

Assessing and Improving the Quality of SKOS Vocabularies.
Osma Suominen and Christian Mader. Journal on Data Semantics, 2013.
http://link.springer.com/article/10.1007%2Fs13740-013-0026-0

-Osma

On 13/11/13 11:43, Hong Sun wrote:
> Hi Osma,
>
> I assumed that once you publish your ontology/terminology, it exists on
> the web, or even on some local machines, as facts.
>
> Therefore, we have both
> Version-2013
>   icd10gm:K12.23 a skos:Concept;
>           skos:prefLabel "Wangenabszess"@de;
>           skos:altLabel "Wangenabszeß"@de.
> and
> Version-2014
>   icd10gm:K12.23 a skos:Concept;
>           skos:prefLabel "Wangenabszess 2014"@de;
>           skos:altLabel "Wangenabszess"@de;
>           skos:altLabel "Wangenabszeß"@de.
>
> It ends up with
>   icd10gm:K12.23 a skos:Concept;
>           skos:prefLabel "Wangenabszess"@de;
>           skos:prefLabel "Wangenabszess 2014"@de;
>           skos:altLabel "Wangenabszess"@de;
>           skos:altLabel "Wangenabszeß"@de.
>
> Skosify could help to resolve local conflicts; however, it cannot resolve
> conflicts among facts already published on the web (because it does not
> have the right to drop the published fact icd10gm:K12.23 skos:prefLabel
> "Wangenabszess"@de). Such conflicts with published facts are what I
> consider a problem.
>
> Thanks and best regards,
> Hong
>
>
>
>
> From: Osma Suominen <osma.suominen@helsinki.fi>
> To: public-esw-thes@w3.org
> Date: 11/13/2013 08:38 AM
> Subject: Re: Failing to meet integrity constraint S14 when terminology
> evolves
> ------------------------------------------------------------------------
>
>
>
> Hi Hong!
>
> I don't quite understand what the problem is in updating the prefLabel
> once more in 2014, and making the old labels altLabels. I think it is
> common practice with thesauri to have only one prefLabel per concept (as
> is formalized in SKOS S14), and if the prefLabel has to change, then the
> old label can be preserved as an altLabel.
>
> Skosify also does this when it detects S14 violations. One label will be
> kept as prefLabel (the policy can be selected) and the rest will be
> converted to altLabels. See
> http://code.google.com/p/skosify/wiki/Validation
>
> -Osma
>
> On 13/11/13 00:22, Hong Sun wrote:
>  > Thanks Johan!
>  >
>  > What I considered as a problem is that as the terminology is still
>  > evolving, the label of a code may change in future.
>  >
>  > For example, when the label in 2013 is "Wangenabszess", it is correct
>  > to formalize it as:
>  >
>  > icd10gm:K12.23 a skos:Concept;
>  >          skos:prefLabel "Wangenabszess"@de;
>  >          skos:altLabel "Wangenabszeß"@de.
>  >
>  > But if the label changes in the future, e.g. if it is changed to
>  > "Wangenabszess 2014" in the 2014 version, then I do not know what I
>  > should do.
>  >
>  > I would consider it inappropriate to update the concept as
>  >
>  > icd10gm:K12.23 a skos:Concept;
>  >          skos:prefLabel "Wangenabszess 2014"@de;
>  >          skos:altLabel "Wangenabszess"@de;
>  >          skos:altLabel "Wangenabszeß"@de.
>  >
>  > I think this might be a common problem in using SKOS to formalize an
>  > evolving terminology. Do you have any suggestions?
>  >
>  > Kind regards,
>  > Hong
>  >
>  >
>  > -----"Johan De Smedt" <johan.de-smedt@tenforce.com> wrote:-----
>  >   Hong Sun/AXIFX/AGFA@AGFA, <public-esw-thes@w3.org>
>  >   "Johan De Smedt" <johan.de-smedt@tenforce.com>
>  >   2013/11/12 09:57 PM
>  >   RE: Failing to meet integrity constraint S14 when terminology evolves
>  >
>  > Hi Hong Sun,
>  >
>  > Managing the labels to get:
>  >
>  > icd10gm:K12.23 a skos:Concept;
>  >          skos:prefLabel "Wangenabszess"@de;
>  >          skos:altLabel "Wangenabszeß"@de.
>  >
>  > Is a good approach.
>  >
>  > It is not clear what the problem is with this approach.
>  >
>  > Is the publication of a version (2014) not the "formalized terminology"?
>  >
>  > Is this a SKOS problem or is this a publishing flow problem?
>  >
>  > Kind Regards,
>  >
>  > *Johan De Smedt *
>  >
>  > *From:*Hong Sun [mailto:hong.sun@agfa.com]
>  > *Sent:* Tuesday, 12 November, 2013 18:13
>  > *To:* public-esw-thes@w3.org
>  > *Subject:* Failing to meet integrity constraint S14 when terminology
> evolves
>  >
>  > Dear All,
>  >
>  > I have a problem in assigning labels to SKOS concepts within an
>  > evolving terminology, and am therefore looking for your opinions.
>  >
>  > In the ICD 10 coding system, German version, the text assigned to a
>  > code changes between different versions, e.g.
>  > in ICD10GM 2004, the code K12.23 has a label: Wangenabszeß
>  > in ICD10GM 2013, the code K12.23 has a label: Wangenabszess
>  >
>  > Before realizing the problem, I formalized the code as a SKOS concept:
>  > icd10gm:K12.23 a skos:Concept;
>  >          skos:prefLabel "Wangenabszeß"@de.
>  > However, it ends up with
>  > icd10gm:K12.23 a skos:Concept;
>  >          skos:prefLabel "Wangenabszeß"@de;
>  >          skos:prefLabel "Wangenabszess"@de.
>  > which is not consistent with the integrity constraint S14.
>  >
>  > As ICD 10 GM publishes a new version each year, and most of the
>  > labels are stable, it also seems to be overkill to create a concept
>  > for each version, e.g.
>  > icd10gm2004:K12.23 a skos:Concept;
>  >          skos:prefLabel "Wangenabszeß"@de.
>  > and
>  > icd10gm2013:K12.23 a skos:Concept;
>  >          skos:prefLabel "Wangenabszess"@de.
>  >
>  > I am also considering taking the labels from the latest version as
>  > prefLabel, and those from older versions as altLabel, e.g.
>  > icd10gm:K12.23 a skos:Concept;
>  >          skos:prefLabel "Wangenabszess"@de;
>  >          skos:altLabel "Wangenabszeß"@de.
>  >
>  > The problem with this approach is that if the label of the code
>  > changes in a later version (e.g. v2014), then the skos:prefLabel
>  > needs to be updated again. If the formalized terminology is already
>  > published, then such an update will be a problem.
>  >
>  > I currently plan to formalize the concept as below:
>  > icd10gm:K12.23 a skos:Concept;
>  >          rdfs:label "Wangenabszess"@de;
>  >          rdfs:label "Wangenabszeß"@de.
>  >
>  > I am still not very satisfied with this solution. Is there a better
>  > solution using other SKOS properties? Also, is there a general
>  > principle/guideline in SKOS for formalizing (the labels of) an
>  > evolving terminology? Thanks!
>  >
>  > Kind Regards,
>  > *
>  > Hong Sun | Agfa HealthCare*
>  > Researcher | HE/Advanced Clinical Applications Research
>  > T  +32 3444 8108
>  >
>
>
> --
> Osma Suominen
> D.Sc. (Tech), Information Systems Specialist
> National Library of Finland
> P.O. Box 26 (Teollisuuskatu 23)
> 00014 HELSINGIN YLIOPISTO
> Tel. +358 50 3199529
> osma.suominen@helsinki.fi
> http://www.nationallibrary.fi <http://www.nationallibrary.fi/>
>
>


-- 
Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Teollisuuskatu 23)
00014 HELSINGIN YLIOPISTO
Tel. +358 50 3199529
osma.suominen@helsinki.fi
http://www.nationallibrary.fi

Received on Wednesday, 13 November 2013 10:45:40 UTC