Re: Lightweight RDF to Map Various Semantic Representations of Species from Bernard Vatant on 2009-12-08 (public-lod@w3.org from December 2009)

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Tue, 8 Dec 2009 19:52:59 +0100
To: Peter DeVries <pete.devries@gmail.com>
Cc: public-lod <public-lod@w3.org>, dmozzherin <dmozzherin@gmail.com>
Message-ID: <9d93ef960912081052r6209cb8at298177952885bf3c@mail.gmail.com>
Hi Peter

Answering your post with a bit of delay, and apologies for this answer
being longer than I expected, but that's what you get from waiting a week
... :)
Mostly ignoring the rest of the thread, not that you did not get insightful
answers so far, I would like to push a different approach to this issue of
what I call *heterogeneous representations* of the same referent / thing /
concept ... pick your choice. I liked very much the topic maps notion of
"subject" in the sense of "subject of conversation" to indicate whatever we
are trying to exchange about, hoping, but never sure (see Quine) that we are
speaking about the same one.
After this necessary caveat, by heterogeneous representations I mean
(re)presentations coming from different perspectives, with different
modeling options, each one supposed to fit the purpose at hand.
Because there is no representation without purpose. And there is no
representation "better" than the other in an absolute way, no One Ring to
rule them all etc.

If we look at other levels of less formal representations, what do we see
practically working? In the natural languages world, we have ages ago
(except for some totalitarian dreamers) forgotten the idea of the unique
language to rule them all. We have translations, and we have translators. At
[1] you will find more about my current readings and thoughts about that
translation paradigm. In database land we have schema conversions and data
migration, in XML land we have XSLT. No One Schema to rule them all, the
paradigm here again is translation.

Why don't we do the same with ontologies? The model where the species Felis
concolor is represented as an instance of a class "Species", and the model
in which it is represented as a class, subclass of "Animal", and the model
in which it is represented as a skos:Concept, are all pretty much useful,
the first to deal with taxonomy evolution, the second to deal with
observations in Wisconsin, and the third to deal with library classification
of books about animals. We need all those, and certainly other ones.

And yes linking them as different representations of somehow the same
referent is important. But maybe mapping URIs using skos:match or owl:sameAs
or owl:equivalentClass or anything of the kind is not a good idea. Because
using any of those leads to mixing and blurring different languages,
different schemas, different logics, making the global model difficult for
humans to grasp conceptually and for computers to compute :)
Neither do I believe any more (although I did until very recently) in any
"central neutral representation", which I pushed here and there under the
"hubject" meme.

What I suggest hereafter is to port the translation paradigm to RDF land.
What do we have in RDF land, similar to XSLT for XML? Well, the first thing
that comes to mind is SPARQL. Let me give an example.

Suppose I have an ontology O1 where species are represented as instances of
owl:Class, and an ontology O2 where species are represented as instances of
skos:Concept. O1 and O2 leverage the same reference code, say ITIS.

In O1 the ITIS code is a datatype property o1:ITISCode which can be attached
to any individual living creature, and a species class can be defined as
containing all individuals sharing a given code, e.g.,

o1:FelisConcolor a owl:Class ;
  owl:equivalentClass
    [ a owl:Restriction ;
      owl:onProperty  o1:ITISCode ;
      owl:hasValue  '552479' ] .

In O2 each concept is identified by the code using e.g., a  subproperty of
skos:notation, say o2:ITISCode

o2:FelisConcolor   a  skos:Concept .
o2:FelisConcolor   o2:ITISCode  '552479' .

Note that sharing a code value is enough for mapping those two
representations, for all practical purposes. I can rely on SPARQL to find
all concepts in O2 translating a class in O1 with the following query on the
merged graph O1 U O2

SELECT  ?cl  ?co

WHERE  {
              ?cl owl:equivalentClass  ?r.
              ?r a owl:Restriction.
              ?r  owl:onProperty   o1:ITISCode.
              ?r  owl:hasValue   ?n.
              ?co   a  skos:Concept.
              ?co   o2:ITISCode  ?n.
              }
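
To make the join explicit, here is a minimal sketch in plain Python of what
the SELECT above computes. It is not a SPARQL engine: triples are modelled
as plain tuples, and the helper names (objects, class_codes, concept_codes,
match_by_code), the blank node label _:r1 and the o2:Unmatched concept with
its placeholder code are assumptions made up for this illustration only.

```python
# Triples as (subject, predicate, object) tuples; a stand-in for the merged graph O1 U O2.
O1 = [
    ("o1:FelisConcolor", "owl:equivalentClass", "_:r1"),
    ("_:r1", "rdf:type", "owl:Restriction"),
    ("_:r1", "owl:onProperty", "o1:ITISCode"),
    ("_:r1", "owl:hasValue", "552479"),
]
O2 = [
    ("o2:FelisConcolor", "rdf:type", "skos:Concept"),
    ("o2:FelisConcolor", "o2:ITISCode", "552479"),
    # Hypothetical concept with a placeholder code, present only to show a non-match.
    ("o2:Unmatched", "rdf:type", "skos:Concept"),
    ("o2:Unmatched", "o2:ITISCode", "PLACEHOLDER"),
]


def objects(graph, subject, predicate):
    """Return all objects of triples matching (subject, predicate, ?o)."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def class_codes(graph):
    """Map each O1 class to the code fixed by its owl:hasValue restriction."""
    codes = {}
    for cl, p, r in graph:
        if p != "owl:equivalentClass":
            continue
        if "owl:Restriction" not in objects(graph, r, "rdf:type"):
            continue
        if "o1:ITISCode" not in objects(graph, r, "owl:onProperty"):
            continue
        for n in objects(graph, r, "owl:hasValue"):
            codes[cl] = n
    return codes


def concept_codes(graph):
    """Map each O2 skos:Concept to its code."""
    codes = {}
    for co, p, o in graph:
        if p == "rdf:type" and o == "skos:Concept":
            for n in objects(graph, co, "o2:ITISCode"):
                codes[co] = n
    return codes


def match_by_code(graph):
    """Return the (?cl, ?co) pairs that share a code value."""
    classes = class_codes(graph)
    concepts = concept_codes(graph)
    return sorted(
        (cl, co)
        for cl, n in classes.items()
        for co, m in concepts.items()
        if n == m
    )


pairs = match_by_code(O1 + O2)
print(pairs)  # [('o1:FelisConcolor', 'o2:FelisConcolor')]
```

The only thing the two sides have to agree on is the code value itself;
neither vocabulary needs to know anything about the other's modeling.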

Suppose now you want to translate a class subsumption in O1 into a
broader-narrower (transitive) hierarchy in O2.
There you can use a SPARQL CONSTRUCT

CONSTRUCT  { ?co1   skos:broaderTransitive  ?co2 }

WHERE  {
              ?cl1 owl:equivalentClass  ?r1.
              ?r1 a owl:Restriction.
              ?r1  owl:onProperty   o1:ITISCode.
              ?r1  owl:hasValue   ?n1.
              ?co1   a  skos:Concept.
              ?co1   o2:ITISCode  ?n1.

              ?cl2 owl:equivalentClass  ?r2.
              ?r2 a owl:Restriction.
              ?r2  owl:onProperty   o1:ITISCode.
              ?r2  owl:hasValue   ?n2.
              ?co2   a  skos:Concept.
              ?co2   o2:ITISCode  ?n2.

              ?cl1   rdfs:subClassOf   ?cl2.
              }
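
The same translation step can be sketched in plain Python, assuming the
class-to-concept correspondence has already been derived from shared codes.
This is only an illustration of the idea, not a replacement for the
CONSTRUCT query: the function name translate_subsumption and the resources
o1:Felidae, o2:Felidae and o1:Animal are hypothetical, and the dict assumes
at most one concept per class.

```python
def translate_subsumption(triples, correspondence):
    """Rewrite (cl1 rdfs:subClassOf cl2) as (co1 skos:broaderTransitive co2).

    Triples where either class has no corresponding concept are skipped,
    just as they would fail to match the WHERE clause of the CONSTRUCT query.
    """
    translated = []
    for cl1, predicate, cl2 in triples:
        if predicate != "rdfs:subClassOf":
            continue
        if cl1 in correspondence and cl2 in correspondence:
            translated.append(
                (correspondence[cl1], "skos:broaderTransitive", correspondence[cl2])
            )
    return translated


# Correspondence table, as it could be built from shared code values.
# o1:Felidae and o2:Felidae are hypothetical names used for this example.
correspondence = {
    "o1:FelisConcolor": "o2:FelisConcolor",
    "o1:Felidae": "o2:Felidae",
}

o1_triples = [
    ("o1:FelisConcolor", "rdfs:subClassOf", "o1:Felidae"),
    # o1:Animal (hypothetical) has no O2 counterpart, so this one is dropped.
    ("o1:Felidae", "rdfs:subClassOf", "o1:Animal"),
]

result = translate_subsumption(o1_triples, correspondence)
print(result)  # [('o2:FelisConcolor', 'skos:broaderTransitive', 'o2:Felidae')]
```

Assertions in O1 that have no counterpart in O2 are simply left
untranslated, which is what one would expect from a translator.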

Granted, it's a bit more intricate than a direct mapping link. But based
simply on a shared code value, one can build correspondence tables and
constructive queries. And you don't have to wonder any more what is the
mysterious semantic nature of the link between the class there and the
concept here. It is that you can translate assertions from there to
assertions here.

More thoughts on this "representation as translation" paradigm on my blog at
[1], certainly more technical follow-up in the near future, so stay tuned.

Cheers

Bernard

[1] http://blog.hubjects.com/2009/11/representation-as-translation.html

2009/11/30 Peter DeVries <pete.devries@gmail.com>

> Hi LOD'ers :-)
>
> I am trying to work out some way to map the various semantic
> representations for a species, in conjunction with a friendly three letter
> organization.
>
> The goal of these documents is in part to improve "findability" of
> information about species.
>
> The hope is that they will also help serve as a bridge from the LOD
> to species information from the three letter organization and it's partners.
>
> The resources are mapped using skos:closeMatch.
>
> This should allow consumers to choose those attributes of each species
> resource that they think are appropriate.
>
> It has been suggested to me that more comprehensive documents describing
> species should be in the form of OWL documents, so I have included
> nonfunctional links to these hypothetical resources.
>
> I have the following examples, and am looking for comments and suggestions.
>
> RDF Example    http://rdf.taxonconcept.org/ses/v6n7p.rdf
>
> Ontology       http://rdf.taxonconcept.org/ont/txn.owl
>
> Ontology Doc   http://rdf.taxonconcept.org/ont/txn_doc/index.html
>
> VOID           http://rdf.taxonconcept.org/ont/void.rdf
>
> I look forward to your comments and suggestions, :-)
>
> - Pete
> ----------------------------------------------------------------
> Pete DeVries
> Department of Entomology
> University of Wisconsin - Madison
> 445 Russell Laboratories
> 1630 Linden Drive
> Madison, WI 53706
> GeoSpecies Knowledge Base
> About the GeoSpecies Knowledge Base
> ------------------------------------------------------------
>



-- 
Bernard Vatant
Senior Consultant
Vocabulary & Data Engineering
Tel:       +33 (0) 971 488 459
Mail:     bernard.vatant@mondeca.com
----------------------------------------------------
Mondeca
3, cité Nollez 75018 Paris France
Web:    http://www.mondeca.com
Blog:    http://mondeca.wordpress.com
----------------------------------------------------
Received on Tuesday, 8 December 2009 18:53:41 UTC