
Re: Lightweight RDF to Map Various Semantic Representations of Species

From: Peter DeVries <pete.devries@gmail.com>
Date: Tue, 8 Dec 2009 17:10:42 -0600
Message-ID: <3833bf630912081510p302e4a8t347d205c967e364b@mail.gmail.com>
To: Bernard Vatant <bernard.vatant@mondeca.com>, Leigh Dodds <leigh.dodds@talis.com>
Cc: public-lod <public-lod@w3.org>, dmozzherin <dmozzherin@gmail.com>
Thanks everyone, I am learning a lot through this discussion :-)

There seems to be a common pattern emerging in the onlist and offlist
emails.

I am going to try to see if I understand these correctly, with examples.

It is probably best to think of several different representations of species
depending on how they will be used.

Below are the three different kinds of representations and some related
records.

Look closely, because there are at least two quasi-humorous assertions. :-)

A SKOS Species Concept
  <txn:SpeciesConcept rdf:about="http://rdf.taxonconcept.org/ses/v6n7p">
    <skos:scopeNote>A SKOS concept for the species Puma concolor, used to
link documents about this species, via foaf:topic</skos:scopeNote>
    <txn:hasSpeciesConceptCode>urn:lsid:globalnames.org:taxon:603bebac-cc44-4168-bbf7-b11b976f9d79</txn:hasSpeciesConceptCode>
    <!-- Integer Keys in related databases -->
    <txn:hasGBIF>13815711</txn:hasGBIF>
    <txn:hasITIS>552479</txn:hasITIS>
    <txn:hasEOL>311910</txn:hasEOL>
    <txn:hasNCBI>9696</txn:hasNCBI>
    <txn:hasBOLD>12521</txn:hasBOLD>
    <!-- LSID Keys in related databases-->
    <txn:CoL_LSID_2009>urn:lsid:catalogueoflife.org:taxon:dec52d72-29c1-102b-9a4a-00304854f820:ac2009</txn:CoL_LSID_2009>
  </txn:SpeciesConcept>
======================

An RDFs Class, subclass of Organism, (to cover Plants, Animals, Fungi etc)
  <geospecies:Species rdf:about="http://lod.geospecies.org/ses/v6n7p">
    <skos:scopeNote>An RDFs Class for the species Puma concolor, used to
link observations of individual animals, individual animals are
instances</skos:scopeNote>
    <txn:hasSpeciesConceptCode>urn:lsid:globalnames.org:taxon:603bebac-cc44-4168-bbf7-b11b976f9d79</txn:hasSpeciesConceptCode>
    <!-- Integer Keys in related databases -->
    <txn:hasGBIF>13815711</txn:hasGBIF>
    <txn:hasITIS>552479</txn:hasITIS>
    <txn:hasEOL>311910</txn:hasEOL>
    <txn:hasNCBI>9696</txn:hasNCBI>
    <txn:hasBOLD>12521</txn:hasBOLD>
    <!-- LSID Keys in related databases-->
    <txn:CoL_LSID_2009>urn:lsid:catalogueoflife.org:taxon:dec52d72-29c1-102b-9a4a-00304854f820:ac2009</txn:CoL_LSID_2009>
    <!-- Thinking I should avoid too much subclassing, something like this
for hierarchy-->
    <geospecies:inFamily rdf:resource="http://lod.geospecies.org/families/gSvIP"/>
    <geospecies:inOrder  rdf:resource="http://lod.geospecies.org/orders/jtSaY#ofHerMajesty"/>
    <skos:broaderTransitive rdf:resource="http://lod.geospecies.org/families/gSvIP"/>
  </geospecies:Species>

Examples: Individual Organism, an instance of IndividualOrganism and the
species v6n7p
  <geospecies:IndividualOrganism rdf:about="http://lod.geospecies.org/inds/E7ed79e7-9436-458b-8c71-9765113f4c2e#specimen">
    <skos:scopeNote>An instance/individual of this species</skos:scopeNote>
    <rdf:type rdf:resource="http://lod.geospecies.org/ses/v6n7p"/>
    <geospecies:catalogCode>PJDPO000177</geospecies:catalogCode>
    <geospecies:hasBasisOfRecord rdf:resource="http://rdf.geospecies.org/ont/geospecies#StillImage"/>
    <geospecies:hasObservationRecord rdf:resource="http://lod.geospecies.org/observations/a3c39dad-6756-4129-9a61-fbd32883d369"/>
    <foaf:name>Bob</foaf:name>
    <skos:prefLabel>Individual Organism: Puma concolor
E7ed79e7-9436-458b-8c71-9765113f4c2e</skos:prefLabel>
    <skos:altLabel>Bob the Cougar</skos:altLabel>
    <geospecies:currentStatus rdf:resource="http://rdf.geospecies.org/ont/geospecies#Current_Individual_Status_Presumed_Wild"/>
  </geospecies:IndividualOrganism>

Examples: Individual Observation of a Species, an instance
  <geospecies:ObservationRecord rdf:about="http://lod.geospecies.org/observations/a3c39dad-6756-4129-9a61-fbd32883d369#observation">
    <skos:scopeNote>An instance of the class ObservationRecord: who, what,
when, where, how</skos:scopeNote>
    <skos:prefLabel>Observation Record for Puma concolor
E7ed79e7-9436-458b-8c71-9765113f4c2e</skos:prefLabel>
    <geospecies:Observation_of_Species rdf:resource="http://lod.geospecies.org/ses/v6n7p"/>
    <geospecies:Observation_of_Individual rdf:resource="http://lod.geospecies.org/inds/E7ed79e7-9436-458b-8c71-9765113f4c2e#specimen"/>
    <geospecies:determinationMethod rdf:resource="http://rdf.geospecies.org/ont/geospecies#DeterminationMethod_Visual_Recognition"/>
    <geospecies:observedBy rdf:resource="http://rdf.geospecies.org/ont/people.owl#Jessica_Alba"/>
    <geospecies:identifiedBy rdf:resource="http://rdf.geospecies.org/ont/people.owl#Peter_J_DeVries"/>
    <geospecies:hasBasisOfRecord rdf:resource="http://rdf.geospecies.org/ont/geospecies#StillImage"/>
    <geospecies:atLocation rdf:resource="http://lod.geospecies.org/locations/1bdde98c-2d38-4d20-8da0-19f1f187f31e"/>
    <!-- Need a good way to represent time, that works with instants and
intervals in imprecise records (not timestamp)-->
    <geospecies:observationTime>?</geospecies:observationTime>
    <geo:lat>44.862945</geo:lat>
    <geo:long>-87.231204</geo:long>
    <geo:accurateWithinMeters>30</geo:accurateWithinMeters>
    <geospecies:hasContinent rdf:resource="http://sws.geonames.org/6255149/"/>
    <geospecies:hasCountry rdf:resource="http://sws.geonames.org/6252001/"/>
    <geospecies:hasStateProvince rdf:resource="http://sws.geonames.org/5279468/"/>
    <geospecies:hasCounty rdf:resource="http://sws.geonames.org/5250768/"/>
    <dcterms:created>2009-12-04T13:29:33-0600</dcterms:created>
    <dcterms:modified>2009-12-04T13:29:33-0600</dcterms:modified>
  </geospecies:ObservationRecord>
===========
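
One possible way to approach the observation-time question raised in the
comment in the record above is to store an imprecise time as an
(earliest, latest) pair rather than a single timestamp. The following is only
a minimal Python sketch under that assumption; the function names
date_interval and may_overlap are made up for illustration and are not part
of the GeoSpecies ontology or any existing vocabulary.

```python
import calendar
from datetime import date


def date_interval(partial):
    """Expand 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into (earliest, latest) dates."""
    parts = [int(p) for p in partial.split("-")]
    if len(parts) == 1:
        year = parts[0]
        return date(year, 1, 1), date(year, 12, 31)
    if len(parts) == 2:
        year, month = parts
        # monthrange returns (weekday of first day, number of days in month)
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    if len(parts) == 3:
        day = date(*parts)
        return day, day
    raise ValueError(f"unsupported date string: {partial!r}")


def may_overlap(a, b):
    """True if two (earliest, latest) intervals share at least one day."""
    return a[0] <= b[1] and b[0] <= a[1]


# A record known only to the month, compared with one known to the day.
december = date_interval("2009-12")
exact_day = date_interval("2009-12-04")
print(december, may_overlap(december, exact_day))
```

An exact observation then becomes the special case where both ends of the
interval are the same day, so precise and imprecise records can be compared
with the same test.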


An RDF instance of class bdwg:Species
  <bdwg:Species rdf:about="http://lod.bdwg.org/ses/v6n7p">
    <skos:scopeNote>An RDF instance of the class bdwg:Species for the
species Puma concolor, for dealing with taxonomic hierarchy
datasets</skos:scopeNote>
    <txn:hasSpeciesConceptCode>urn:lsid:globalnames.org:taxon:603bebac-cc44-4168-bbf7-b11b976f9d79</txn:hasSpeciesConceptCode>
    <!-- Integer Keys in related databases -->
    <txn:hasGBIF>13815711</txn:hasGBIF>
    <txn:hasITIS>552479</txn:hasITIS>
    <txn:hasEOL>311910</txn:hasEOL>
    <!-- Warning!! two last integer keys, might not map as directly as might
be expected-->
    <txn:hasNCBI>9696</txn:hasNCBI>
    <txn:hasBOLD>12521</txn:hasBOLD>
    <!-- LSID Keys in related databases-->
    <txn:CoL_LSID_2009>urn:lsid:catalogueoflife.org:taxon:dec52d72-29c1-102b-9a4a-00304854f820:ac2009</txn:CoL_LSID_2009>
  </bdwg:Species>

=============
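
The three records above all carry the same database keys, so a consumer could
line them up by a shared key without asserting any direct link between the
URIs. Here is a rough Python sketch of that idea. It uses plain dictionaries
as a stand-in for parsed RDF; the "kind" labels and the group_by_key helper
are illustrative assumptions, and only the URIs and the ITIS value are taken
from the examples.

```python
from collections import defaultdict

# Trimmed copies of the three records above: URI, modelling style, ITIS key.
records = [
    {"uri": "http://rdf.taxonconcept.org/ses/v6n7p",
     "kind": "SKOS concept", "itis": "552479"},
    {"uri": "http://lod.geospecies.org/ses/v6n7p",
     "kind": "RDFS class", "itis": "552479"},
    {"uri": "http://lod.bdwg.org/ses/v6n7p",
     "kind": "instance of bdwg:Species", "itis": "552479"},
]


def group_by_key(items, key):
    """Map each value of `key` to the URIs of the records that carry it."""
    groups = defaultdict(list)
    for item in items:
        if key in item:
            groups[item[key]].append(item["uri"])
    return dict(groups)


by_itis = group_by_key(records, "itis")
for code, uris in by_itis.items():
    print(code, uris)
```

The grouping says only that the records share a code. Which RDF property, if
any, should state the relationship between them is left open, as in
question 2 below.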


Does it make sense to then have a machine readable RDF or OWL document that
describes the characteristics and attributes of a particular species?
One that does not try to answer what a "species" is, but documents "by this
identifier we mean an organism that matches these properties with some
percentage match".

Questions:

1) Do I seem to have understood the earlier comments correctly?

2) What are the valid links that should be made between these different
species representations?

3) How should people decide which version of species they should link to?

4) Does it make sense to then have a machine readable RDF or OWL document
that describes the characteristics and attributes of a particular species?
One that does not try to answer what a "species" is, but documents "by this
identifier we mean an organism that matches these properties with some
percentage match".
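
To make the "percentage match" idea in question 4 concrete, here is a small
Python sketch. It assumes a species description can be reduced to a flat set
of property/value pairs, which is a strong simplification. The property names
and values are invented placeholders rather than real diagnostic characters
for Puma concolor, and match_percentage is a hypothetical helper.

```python
def match_percentage(description, observed):
    """Percentage of described properties that the observation agrees with."""
    if not description:
        raise ValueError("description has no properties to compare")
    agreeing = sum(
        1 for name, value in description.items() if observed.get(name) == value
    )
    return 100.0 * agreeing / len(description)


# Placeholder properties attached to a species identifier (not real data).
species_description = {
    "tail": "long",
    "coat": "plain",
    "ear_tufts": "absent",
    "claws": "retractile",
}

# An observed organism for which one property was not recorded.
observed_organism = {
    "tail": "long",
    "coat": "plain",
    "ear_tufts": "absent",
}

print(match_percentage(species_description, observed_organism))
```

A property missing from the observation counts as a mismatch here; a real
document would need to say whether unknown values should lower the score or
be ignored.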


Thanks!

- Pete



On Tue, Dec 8, 2009 at 12:52 PM, Bernard Vatant
<bernard.vatant@mondeca.com> wrote:

> Hi Peter
>
> Answering to your post with a bit of delay, and apologies for this answer
> being longer than I expected, but that's what you get from waiting a week
> ... :)
> Mostly ignoring the rest of the thread, not that you did not get insightful
> answers so far, I would like to push a different approach to this issue of
> what I call *heterogeneous representations* of the same referent / thing /
> concept ... pick your choice. I liked very much the topic maps notion of
> "subject" in the sense of "subject of conversation" to indicate whatever we
> are trying to exchange about, hoping, but never sure (see Quine) that we are
> speaking about the same one.
> After this necessary caveat, by heterogeneous representations I mean
> (re)presentations coming from different perspectives, with different
> modeling options, each one supposed to be fit for the purpose at hand.
> Because there is no representation without purpose. And there is no
> representation "better" than the other in an absolute way, no One Ring to
> rule them all etc.
>
> If we look at other levels of less formal representations, what do we see
> practically working? In the natural languages world, we have ages ago
> (except for some totalitarian dreamers) forgotten the idea of the unique
> language to rule them all. We have translations, and we have translators. At
> [1] you will find more about my current readings and thoughts about that
> translation paradigm. In data bases land we have schema conversions and data
> migration, in XML land we have XSLT. No One Schema to rule them all, the
> paradigm here again is translation.
>
> Why don't we do the same with ontologies? The model where the species Felis
> concolor is represented as an instance of a class "Species", and the model
> in which it is represented as a class, subclass of "Animal", and the model
> in which it is represented as a skos:Concept, are all pretty much useful,
> the first to deal with taxonomy evolution, the second to deal with
> observations in Wisconsin, and the third to deal with library classification
> of books about animals. We need all those, and certainly other ones.
>
> And yes linking them as different representations of somehow the same
> referent is important. But maybe mapping URIs using skos:match or owl:sameAs
> or owl:equivalentClass or anything of the kind is not a good idea. Because
> using any of those leads to mixing and blurring different languages,
> different schemas, different logics, making the global model difficult for
> humans to grasp conceptually and for computers to compute :)
> Neither do I believe any more (although I did until very recently) in any
> "central neutral representation", which I pushed here and there under the
> "hubject" meme.
>
> What I suggest hereafter is to port the translation paradigm in RDF land.
> What do we have in RDF land, similar to XSLT for XML? Well, the first coming
> to mind is SPARQL. Let me make an example.
>
> Suppose I have an ontology O1 where species are represented as instances of
> owl:Class, and an ontology O2 where species are represented as instances of
> skos:Concept. O1 and O2 leverage the same reference code, say ITIS.
>
> In O1 ITIS code is a datatype property o1:ITISCode which can be attached to
> any individual living creature, and a species class can be defined by
> containing all individuals sharing a given code, e.g.,
>
> o1:FelisConcolor a owl:Class ;
>   owl:equivalentClass
>   [ a owl:Restriction ;
>     owl:onProperty  o1:ITISCode ;
>     owl:hasValue  '552479' ] .
>
> In O2 each concept is identified by the code using e.g., a  subproperty of
> skos:notation, say o2:ITISCode
>
> o2:FelisConcolor a  skos:Concept .
> o2:FelisConcolor   o2:ITISCode  '552479' .
>
> Note that sharing a code value is enough for mapping those two
> representations, for all pragmatic purpose. I can rely on SPARQL to find all
> concepts in O2 translating a class in O1 by the following request on the
> merged graph O1 U O2
>
> SELECT  ?cl  ?co
>
> WHERE  {
>               ?cl owl:equivalentClass  ?r.
>               ?r a owl:Restriction.
>               ?r  owl:onProperty   o1:ITISCode.
>               ?r  owl:hasValue   ?n.
>               ?co   a  skos:Concept.
>               ?co   o2:ITISCode  ?n.
>               }
>
> Suppose now you want to translate a class subsumption in O1 into a
> broader-narrower (transitive) hierarchy in O2.
> There you can use a SPARQL CONSTRUCT
>
> CONSTRUCT  { ?co1   skos:broaderTransitive  ?co2 }
>
> WHERE  {
>               ?cl1 owl:equivalentClass  ?r1.
>               ?r1 a owl:Restriction.
>               ?r1  owl:onProperty   o1:ITISCode.
>               ?r1  owl:hasValue   ?n1.
>               ?co1   a  skos:Concept.
>               ?co1   o2:ITISCode  ?n1.
>
>               ?cl2 owl:equivalentClass  ?r2.
>               ?r2 a owl:Restriction.
>               ?r2  owl:onProperty   o1:ITISCode.
>               ?r2  owl:hasValue   ?n2.
>               ?co2   a  skos:Concept.
>               ?co2   o2:ITISCode  ?n2.
>
>               ?cl1   rdfs:subClassOf   ?cl2.
>               }
>
> Granted, it's a bit more intricate than a direct mapping link. But based
> simply on a shared code value, one can build correspondence tables and
> constructive queries. And you don't have to wonder any more what is the
> mysterious semantic nature of the link between the class there and the
> concept here. It is that you can translate assertions from there to
> assertions here.
>
> More thoughts on this "representation as translation" paradigm on my blog
> at [1], certainly more technical follow-up in the near future, so stay
> tuned.
>
> Cheers
>
> Bernard
>
> [1] http://blog.hubjects.com/2009/11/representation-as-translation.html
>
> 2009/11/30 Peter DeVries <pete.devries@gmail.com>
>
>> Hi LOD'ers :-)
>>
>> I am trying to work out some way to map the various semantic
>> representations for a species, in conjunction with a friendly three letter
>> organization.
>>
>> The goal of these documents is in part to improve "findability" of
>> information about species.
>>
>> The hope is that they will also help serve as a bridge from the LOD
>> to species information from the three letter organization and its partners.
>>
>> The resources are mapped using skos:closeMatch.
>>
>> This should allow consumers to choose those attributes of each species
>> resource that they think are appropriate.
>>
>> It has been suggested to me that more comprehensive documents describing
>> species should be in the form of OWL documents, so I have included
>> nonfunctional links to these hypothetical resources.
>>
>> I have the following examples, and am looking for comments and
>> suggestions.
>>
>> RDF Example   http://rdf.taxonconcept.org/ses/v6n7p.rdf
>>
>> Ontology      http://rdf.taxonconcept.org/ont/txn.owl
>>
>> Ontology Doc  http://rdf.taxonconcept.org/ont/txn_doc/index.html
>>
>> VOID          http://rdf.taxonconcept.org/ont/void.rdf
>>
>> I look forward to your comments and suggestions, :-)
>>
>> - Pete
>> ----------------------------------------------------------------
>> Pete DeVries
>> Department of Entomology
>> University of Wisconsin - Madison
>> 445 Russell Laboratories
>> 1630 Linden Drive
>> Madison, WI 53706
>> GeoSpecies Knowledge Base
>> About the GeoSpecies Knowledge Base
>> ------------------------------------------------------------
>>
>
>
>
> --
> Bernard Vatant
> Senior Consultant
> Vocabulary & Data Engineering
> Tel:       +33 (0) 971 488 459
> Mail:     bernard.vatant@mondeca.com
> ----------------------------------------------------
> Mondeca
> 3, cité Nollez 75018 Paris France
> Web:    http://www.mondeca.com
> Blog:    http://mondeca.wordpress.com
> ----------------------------------------------------
>



-- 
----------------------------------------------------------------
Pete DeVries
Department of Entomology
University of Wisconsin - Madison
445 Russell Laboratories
1630 Linden Drive
Madison, WI 53706
GeoSpecies Knowledge Base
About the GeoSpecies Knowledge Base
------------------------------------------------------------
Received on Tuesday, 8 December 2009 23:11:26 GMT
