Hi Kerstin,
Good to hear your thoughts on terminology mapping and on adopting the
Nanopublications schema to deal with mapping provenance issues. Please
see my responses inline:
On 9/3/13 8:50 AM, Kerstin Forsberg wrote:
> Hi Eric,
> Thanks for the link to the RIM RDF tutorial, will read with great
> interest.
>
> I'll not be able to join tomorrow. Two thoughts re. mappings.
>
> - How to align this with the interest in getting RDF (SKOS) versions
> directly from source (eg we have a good interaction with MedDRA MSSO
> about this).
>
> - Mapping provenance
> The justification and attribution of the mappings (between
> concept/terms) are key to trust them. At the ICBO conference earlier
> this summer we discussed the idea of turning for example the mappings
> in the Bioportal into Nanopublications based on some great work by Jim
> McCusker. So that the Bioportal mappings stated as skos:closeMatch
> also would have the justification of them as being the results of
> using the LOOM lexical algorithm. An alternative would be to treat
> mappings as linksets as done by Open PHACTS and provide the
> justification for the links/mappings (between entities) as part of the
> linkset description in VoID. Alasdair Gray is working on a nice
> proposal 1) on this based on the W3C HCLS task force for dataset
> discovery and description that Michel led.
>
Here, it would be interesting to divide mappings into three
categories, each with its own possible provenance measures:
1) Manually defined mappings: In the Nanopublications schema, provenance
is captured by the property nanopub:hasProvenance, which ties to the
property nanopub:hasSupporting, which could capture the mappings'
curation information (e.g. creator, author, version, rights, etc.).
2) (Semi-)automatically found mappings: You have already discussed this
case above. In this case, the information about the LOOM lexical
algorithm could be described using the nanopub:hasSupporting property in
Nanopublications, or alternatively using the Open PHACTS approach ...
3) Inferred mappings via reasoning: New mappings can be inferred via a
reasoning process (i.e. terminology reasoning). In this case, a reasoning
proof (i.e. the set of inference steps under rule-based reasoning) is
well suited to provide provenance information.
Just to make my point clear, I would like to share a concrete case:
Test-case:
--------
ICD-9-CM code (999.4) <---exactMatch --> SNOMED-CT code (213320003)
<---exactMatch --> MedDRA code (10067113), for details see
term-mapping-example.png and example-term-map.n3.
Results:
-----
- ICD-9-CM code (999.4) <---exactMatch --> MedDRA code (10067113),
because skos:exactMatch is a transitive property.
A) The proof of this inferred mapping is shown in example-term-map-proof.n3
B) A summary of the reasoning results is shown in
example-term-map-ances.n3, which gives an overview of which of the
asserted facts (i.e. asserted mappings) were used to derive this
inferred mapping.
C) Finally, an example Nanopublication describing this inferred mapping
is shown in example-term-map-nano.n3, where the reasoning information
from A) and B) is used as provenance information in the form of two
supporting graphs, ":NanoPub_1_Supporting_1" and
":NanoPub_1_Supporting_2". Interestingly, :NanoPub_1_Supporting_2 can be
validated by a proof checker such as cwm
(http://www.w3.org/2000/10/swap/doc/cwm) or euler
(http://eulersharp.sourceforge.net/).
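To illustrate the idea behind the test case, here is a minimal Python
sketch of the inference together with the kind of "which asserted
mappings were used" summary described in B). It is only an illustration
under my own assumptions: it is not the N3 rule set or the proof format
in the attached files, and the (scheme, code) tuples and the function
name infer_exact_matches are hypothetical names chosen for this example.

```python
from itertools import product

# Asserted skos:exactMatch mappings from the test case.
# The (scheme, code) tuples are an illustrative encoding only; they are
# not the identifiers used in example-term-map.n3.
ICD9 = ("ICD-9-CM", "999.4")
SNOMED = ("SNOMED-CT", "213320003")
MEDDRA = ("MedDRA", "10067113")

ASSERTED = [(ICD9, SNOMED), (SNOMED, MEDDRA)]


def infer_exact_matches(asserted):
    """Return {(a, c): frozenset of asserted pairs} for inferred mappings.

    Each inferred mapping is keyed to the asserted mappings it was
    derived from, which serves as a simple provenance record.
    """
    # support maps every known mapping to the asserted facts it rests on.
    support = {}
    for a, b in asserted:
        fact = frozenset([(a, b)])
        support[(a, b)] = fact
        support[(b, a)] = fact  # skos:exactMatch is symmetric

    # Apply transitivity until no new mapping can be derived.
    changed = True
    while changed:
        changed = False
        for (a, b), (c, d) in product(list(support), repeat=2):
            if b == c and a != d and (a, d) not in support:
                support[(a, d)] = support[(a, b)] | support[(c, d)]
                changed = True

    # Keep only mappings that were not asserted (in either direction).
    asserted_both = set(asserted) | {(b, a) for a, b in asserted}
    return {
        pair: facts
        for pair, facts in support.items()
        if pair not in asserted_both
    }
```

Calling infer_exact_matches(ASSERTED) yields the ICD-9-CM to MedDRA
mapping (in both directions), each tied to the two asserted mappings it
depends on. A real proof, as in A), would additionally record the rule
applied at each step so that a proof checker can validate it.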
I plan to attend the COI call Wed 4 Sep.
Kind Regards,
Sajjad
*****************************************************
> On 3 sep 2013, at 07:39, Eric Prud'hommeaux <eric@w3.org
> <mailto:eric@w3.org>> wrote:
>
>> Emory proposed that we meet tomorrow at 7:30am US Eastern to make
>> progress on shared terminology mappings. That's 4:30 for west-coasters
>> so if anyone's attending from there, we can split the call into
>> 30 mins on term mapping when Emory can make it and
>> 30 minutes on outreach material at the regular time 3.5 hours later.
>>
>> Please reply with scheduling constraints and I'll do my best to
>> accommodate.
>>
>> minutes of prior meetings:
>> http://www.w3.org/2013/07/12-hcls-minutes
>> http://www.w3.org/2013/07/18-hcls-minutes
>> http://www.w3.org/2013/08/07-hcls2-minutes
>>
>> proposed agenda:
>> updates on collab between TAPS and SALUS on sharing terminology mappings
>> [Conor, Gökçe, Emory]
>> education/outreach material, e.g.
>> http://www.w3.org/2013/HCLS-tutorials/RIM/
>> http://www.w3.org/2013/C-CDA/IJ.xml
>> more stuff from SMART
>>
>> Please RSVP so I can get a head count for the bridge reservation.
>>
>> Call will be from 11:30-12:30 UTC (07:30 EDT) (13:30 CET)
>> using the Zakim Bridge: +1.617.761.6200, with
>> Conference Code: 4257 ("HCLS")
>> For text, we will use the IRC channel #hcls on irc.w3.org
>> port 6665.
>> Please try out <http://irc.w3.org/> if you don't have an IRC client.
>>
>> --
>> -ericP
>>