Re: Bioschemas profile specification: multiple candidate terms from various ontologies

From: ljgarcia <ljgarcia@ebi.ac.uk>
Date: Wed, 04 Jul 2018 21:35:27 +0100
To: Franck Michel <franck.michel@cnrs.fr>
Cc: Carl Boettiger <cboettig@gmail.com>, public-bioschemas@w3.org
Message-ID: <c6429215651975932ec324a5962c480a@ebi.ac.uk>

Hi Franck,

I think we can use DefinedTerm for mapping, but I am not sure we need 
to go there. At least in the protein case, the Protein Ontology already 
supports mappings to other well-known ontologies. Is it not similar for 
the taxon case?

I do not expect (yet?) the Bioschemas tools to support term 
replacements. I mean, if termA is the official one, I think that is the 
one that will be recognized by Bioschemas, regardless of any existing 
mappings to that term. Maybe Justin or Ricardo can provide more 
information in this regard.

Regards,


On 2018-07-02 09:20, Franck Michel wrote:
> Hi Carl,
> 
> Thanks for your response, I answer in the other thread with a relevant
> title.
> 
> Good point. As you suggest, we would provide in the context the
> terms selected for each profile, while optional mappings would be
> provided in companion files like the schema.org schema.rdfa that
> you mention.
> 
> Regarding the syntax, I would say we can choose whatever RDF
> serialization we deem most appropriate (JSON-LD, Turtle...), possibly
> several. Below, I give an example context and mapping graph in
> JSON-LD.
> 
> Context:
> {  "@context": [
>         "http://schema.org/",
>         {  "tc": "http://rs.tdwg.org/ontology/voc/TaxonConcept#",
>             "tc:rank": { "@type": "@id" } }
>     ],
>     "tc:rank": "http://rs.tdwg.org/ontology/voc/TaxonRank#Species"
> }
> 
> Mappings:
> {
>   "@context": {...},
>   "@graph": [
>      {  "@id": "tc:rank",
>         "owl:equivalentProperty": "http://www.wikidata.org/prop/direct/P105",
>         "sameAs": "https://www.wikidata.org/wiki/Property:P105"
>      },
>      ...
>    ]
> }
> 
> Note that the "profile mappings file" inherits the profile context.
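
The split described above can be pictured with a short Python sketch, standard library only, that keeps the profile context and the companion mappings graph as two separate JSON-LD structures. The `owl` prefix declaration and the function name `build_mappings_document` are assumptions made for illustration; they are not part of any Bioschemas specification.

```python
import json

# Profile context: only the terms selected for the profile.
PROFILE_CONTEXT = [
    "http://schema.org/",
    {
        "tc": "http://rs.tdwg.org/ontology/voc/TaxonConcept#",
        "tc:rank": {"@type": "@id"},
    },
]


def build_mappings_document(profile_context):
    """Return a companion JSON-LD document holding optional term mappings.

    Assumption: the mappings file reuses the profile context and adds an
    'owl' prefix so that owl:equivalentProperty expands to a full IRI.
    """
    # Copy the list so the profile context itself is left untouched.
    context = list(profile_context) + [{"owl": "http://www.w3.org/2002/07/owl#"}]
    return {
        "@context": context,
        "@graph": [
            {
                "@id": "tc:rank",
                "owl:equivalentProperty": "http://www.wikidata.org/prop/direct/P105",
                "sameAs": "https://www.wikidata.org/wiki/Property:P105",
            }
        ],
    }


mappings = build_mappings_document(PROFILE_CONTEXT)
print(json.dumps(mappings, indent=2))
```

Keeping the mappings in their own document means the profile context stays a plain list of approved terms, while consumers that care about equivalences can load the companion file separately.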
> 
> Does this sound right to you guys?
> 
> Franck.
> 
> On 29/06/2018 at 18:32, Carl Boettiger wrote:
> 
>> Hi Franck,
>> 
>> I'm not an expert in OWL, but I don't think the context is the
>> correct place to define the OWL relationships.  If you consider the
>> example of schema.org [1] itself, it does not define any
>> owl:equivalentProperty in the schema.org [1] definitions themselves,
>> but rather includes them in the official schema.rdfa document where
>> the class definitions are also specified, e.g.
>> https://github.com/schemaorg/schemaorg/blob/master/data/schema.rdfa#L4545
>> defines schema:description as the owl:equivalentProperty of
>> dc:description, just as you would expect.
>> 
>> I'm not actually sure if the bioschemas definitions are defined in
>> identical format to the schema.rdfa (personally I find rdfa hard to
>> read and tend to convert that to JSON-LD first for my own use
>> cases), but if so, it seems like that is where we should put the
>> owl:equivalentProperty statements, rather than in the context?
>> 
>> Note that the error you see is not specific to sameAs or
>> owl:equivalentProperty; I believe a term definition in a context
>> cannot contain any 'non-special' (non-keyword) entries.  Perhaps you
>> want instead:
>> 
>> {
>>   "@context": [
>>     "http://schema.org/",
>>     {
>>       "tc": "http://rs.tdwg.org/ontology/voc/TaxonConcept#",
>>       "wdt": "http://www.wikidata.org/prop/direct/",
>>       "tc:rank": {
>>         "@type": "@id",
>>         "@id": "https://www.wikidata.org/wiki/Property:P105"
>>       }
>>     }
>>   ],
>>   "tc:rank": "http://rs.tdwg.org/ontology/voc/TaxonRank#Species"
>> }
>> 
>> Obviously this is a slightly different statement, since it does not
>> define a property and equivalent property, but literally uses the
>> wikidata URI as the expanded URI definition of "tc:rank".
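
The restriction discussed here can be checked mechanically. Below is a minimal Python sketch, assuming the JSON-LD 1.0 rule that an expanded term definition may only carry the keywords @id, @type, @language, @container and @reverse. The function name `invalid_term_definition_keys` is hypothetical, and a real JSON-LD processor performs many more checks than this.

```python
# Keywords allowed inside an expanded term definition in JSON-LD 1.0.
# Assumption: JSON-LD 1.1 permits a few more keywords, which this sketch ignores.
ALLOWED_TERM_KEYS = {"@id", "@type", "@language", "@container", "@reverse"}


def invalid_term_definition_keys(context):
    """Map each term to the non-keyword entries found in its definition."""
    problems = {}
    # A context may be a single object or a list mixing IRIs and objects.
    parts = context if isinstance(context, list) else [context]
    for part in parts:
        if not isinstance(part, dict):
            continue  # remote context given by IRI: nothing to inspect locally
        for term, definition in part.items():
            if isinstance(definition, dict):
                extra = sorted(set(definition) - ALLOWED_TERM_KEYS)
                if extra:
                    problems[term] = extra
    return problems


# Context from the rejected example: mappings placed inside the term definition.
rejected = [
    "http://schema.org/",
    {
        "tc": "http://rs.tdwg.org/ontology/voc/TaxonConcept#",
        "tc:rank": {
            "@type": "@id",
            "owl:equivalentProperty": "http://www.wikidata.org/prop/direct/P105",
            "sameAs": "https://www.wikidata.org/wiki/Property:P105",
        },
    },
]

# Context that only uses keywords inside the term definition.
accepted = [
    "http://schema.org/",
    {
        "tc": "http://rs.tdwg.org/ontology/voc/TaxonConcept#",
        "tc:rank": {"@type": "@id"},
    },
]

print(invalid_term_definition_keys(rejected))
print(invalid_term_definition_keys(accepted))
```

The first call reports the two offending entries under `tc:rank`; the second returns an empty mapping.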
> 
> On 29/06/2018 at 12:28, Franck Michel wrote:
> 
>> (hit the send button too fast: I renamed this one to split the
>> threads.)
>> 
>> WHAT DO WE DO WHEN THERE ARE MULTIPLE CANDIDATE TERMS FROM VARIOUS
>> ONTOLOGIES?
>> 
>> I guess there is a consensus here: each specification group proposes
>> a single approved term for each type and each property, that
>> represents the group consensus.
>> The only difficulty may be to reach this consensus while sparing
>> different communities' sensitivity.
>> 
>> It's been suggested that the context may document mappings to
>> equivalent terms in other ontologies. Although the idea is
>> compelling, I'm not sure how this can be achieved technically. I
>> have tried the [invalid] example below. Let us assume the
>> Biodiversity group chooses the TDWG rank property (tc:rank) to
>> denote the taxonomic rank. The context names this property while
>> providing the equivalent Wikidata property (owl:equivalentProperty)
>> and the web page thereof (remember that schema:sameAs gives the "URL
>> of a reference Web page that unambiguously indicates the item's
>> identity (...)").
>> {
>>   "@context": [
>>     "http://schema.org/",
>>     {
>>       "tc": "http://rs.tdwg.org/ontology/voc/TaxonConcept#",
>>       "tc:rank": {
>>         "@type": "@id",
>>         "owl:equivalentProperty": "http://www.wikidata.org/prop/direct/P105",
>>         "sameAs": "https://www.wikidata.org/wiki/Property:P105"
>>       }
>>     }
>>   ],
>>   "tc:rank": "http://rs.tdwg.org/ontology/voc/TaxonRank#Species"
>> }
>> 
>> Unfortunately, this example is invalid [7] because a JSON-LD term
>> definition cannot contain a sameAs or owl:equivalentProperty entry.
>> 
>> Can you think of other ways to express this mapping?
>> 
>> Franck.
>> 
>> On 28/06/2018 at 19:40, Justin Clark-Casey wrote:
>> 
>> On Thu, 28 Jun 2018 at 16:42, ljgarcia <ljgarcia@ebi.ac.uk> wrote:
>> Hi,
>> 
>> What Melanie suggests is useful to describe profiles: they would
>> become DefinedTerms. That would also help to avoid type/profile
>> confusion. We would then talk about DefinedTerms. If we find a way to
>> also describe the accepted properties with their restrictions, that
>> would be even better. That might be a good subject for a different
>> discussion.
>> 
>> This means there will have to be special Bioschemas code that knows
>> to look in a DefinedTerm somewhere for this information.  I still
>> think using a subtype to signify a profile will be simpler.
>> 
>> I also disagree with Alasdair in that I think there should be a
>> http://bioschema.org/Protein type.  This would be an empty type that
>> just signifies we're talking about a Bioschemas-defined protein, so
>> it isn't treading on anybody's toes.  This would carry information
>> saying it's defined by http://purl.obolibrary.org/obo/PR_000000001
>> and listing its sameAs terms.  Without this, there's not much point
>> having a bioschemas context, and requiring people to use this specific
>> string every time is cumbersome, especially if every group chooses
>> something from a different ontology.  This makes writing and
>> consuming markup harder.
>> 
>> The question remains. How do we choose a term over others to
>> associate it to a profile/DefinedTerm?
>> 
>> I suggest having members of each specification group propose which
>> term they want and then come to consensus via discussion and/or
>> vote.
>> 
>> Regards,
>> 
>> On 2018-06-28 15:45, Melanie Courtot wrote:
>>> Hi,
>>> 
>>> We could consider using the defined terms,
>>> https://dataliberate.com/2018/06/18/schema-org-introduces-defined-terms/,
>>> to do that.
>>> 
>>> So have a protein be defined as
>>> 
>>> "@type": "DefinedTerm",
>>> "@id": "http://purl.obolibrary.org/obo/PR_000000001",
>>> "name": "Protein",
>>> "inDefinedTermSet": "http://bioschemas.org/terms",
>>> "description": "An amino acid chain that is produced de novo by ribosome-mediated translation of a genetically-encoded mRNA.",
>>> "sameAs": [
>>>   "http://purl.obolibrary.org/obo/NCIT_C17021",
>>>   "http://semanticscience.org/resource/SIO_010043"
>>> ]
>>> 
>>> (Using random examples of sameAs from
>>> https://www.ebi.ac.uk/ols/search?q=protein)
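
One caveat when turning a fragment like this into a full JSON-LD document: a JSON object cannot usefully repeat a key, so several sameAs targets are normally written as one array. The Python sketch below, standard library only, shows that Python's `json` parser keeps only the last of two duplicate keys, and then builds the array form. Wrapping the fragment in an object with an `@context` entry is an assumption made for illustration.

```python
import json

# Two "sameAs" keys in one object: Python's json module keeps only the last value.
raw = (
    '{"sameAs": "http://purl.obolibrary.org/obo/NCIT_C17021", '
    '"sameAs": "http://semanticscience.org/resource/SIO_010043"}'
)
duplicated = json.loads(raw)
print(duplicated["sameAs"])

# Array form: both equivalent terms survive parsing and serialization.
protein_term = {
    "@context": "http://schema.org/",  # assumption: plain schema.org context
    "@type": "DefinedTerm",
    "@id": "http://purl.obolibrary.org/obo/PR_000000001",
    "name": "Protein",
    "inDefinedTermSet": "http://bioschemas.org/terms",
    "sameAs": [
        "http://purl.obolibrary.org/obo/NCIT_C17021",
        "http://semanticscience.org/resource/SIO_010043",
    ],
}
print(json.dumps(protein_term, indent=2))
```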
>>> 
>>> Cheers,
>>> Melanie
>>> 
>>> ---
>>> Melanie Courtot, PhD
>>> EMBL-EBI
>>> GA4GH/BioSamples project lead
>>> 
>>>> On 28 Jun 2018, at 15:18, ljgarcia <ljgarcia@ebi.ac.uk> wrote:
>>>> Hi,
>>>> 
>>>> I understood Franck's question in a different way.
>>>> 
>>>> Alasdair says
>>>> 
>>>>> I also agree that a context file should be provided which has the
>>>>> chosen types and terms in it, i.e. the context file would define
>>>>> Protein to be the URI http://purl.obolibrary.org/obo/PR_000000001.
>>>> 
>>>> I think what Franck is asking is how to choose
>>>> http://purl.obolibrary.org/obo/PR_000000001 over other possible
>>>> terms to define a Protein. For the taxon case, as with proteins,
>>>> there are multiple possibilities. Franck, is this your question?
>>>> If it is, I do not think there is any agreement on how to choose,
>>>> other than going for well-known ontologies broadly accepted by the
>>>> community of interest, even better if the term is mapped to other
>>>> possible ones.
>>>> 
>>>> Regards,
>>>> 
>>>> On 2018-06-28 11:50, Gray, Alasdair J G wrote:
>>>> On 27 Jun 2018, at 19:19, Justin Clark-Casey <justinccdev@gmail.com>
>>>> wrote:
>>>> 
>>>> I think we should have mandatory known @types and properties.  In
>>>> my view, Bioschemas should be as easy as possible to write and
>>>> consume.  Multiple options will increase cognitive load on writers
>>>> (which one do I choose?  Why are these 2 examples using these
>>>> different terms?) and open the door to greater inconsistency.
>>>> Non-mandatory types will also raise the barriers for writing
>>>> Bioschemas software that will have to be aware of equivalent
>>>> mappings.
>>>> 
>>>> I completely agree that we should have a single approved type for
>>>> each profile, and likewise for each property a single chosen term.
>>>> This is the whole point of having the profiles.
>>>> 
>>>> I would go one step further and say that Bioschemas should provide
>>>> an http://bioschemas.org [1] context that will define types such as
>>>> Taxon, rather than blessing particular ontology terms.
>>>> 
>>>> I also agree that a context file should be provided which has the
>>>> chosen types and terms in it, i.e. the context file would define
>>>> Protein to be the URI http://purl.obolibrary.org/obo/PR_000000001.
>>>> To be completely explicit, we would not be defining a type in the
>>>> bioschemas namespace, e.g. http://bioschemas.org/Protein.
>>>> 
>>>> This context can also document equivalent terms in different
>>>> ontologies.
>>>> 
>>>> I like the idea that this also contains mappings to the equivalent
>>>> terms in other ontologies.
>>>> 
>>>> Alasdair
>>>> Alasdair J G Gray
>>>> Fellow of the Higher Education Academy
>>>> Assistant Professor in Computer Science,
>>>> School of Mathematical and Computer Sciences
>>>> (Athena SWAN Bronze Award)
>>>> Heriot-Watt University, Edinburgh UK.
>>>> Email: A.J.G.Gray@hw.ac.uk
>>>> Web: http://www.macs.hw.ac.uk/~ajg33 [8]
>>>> ORCID: http://orcid.org/0000-0002-5711-4872
>>>> Office: Earl Mountbatten Building 1.39
>>>> Twitter: @gray_alasdair
>>>> Links:
>>>> ------
>>>> [1] http://bioschemas.org/
> 
> 
> 
> Links:
> ------
> [1] http://schema.org
> [2] http://schema.org/
> [3] http://rs.tdwg.org/ontology/voc/TaxonConcept#
> [4] http://www.wikidata.org/prop/direct/P105
> [5] https://www.wikidata.org/wiki/Property:P105
> [6] http://rs.tdwg.org/ontology/voc/TaxonRank#Species
> [7] https://json-ld.org/playground/#startTab=tab-expanded&json-ld=%7B%22%40context%22%3A%5B%22http%3A%2F%2Fschema.org%2F%22%2C%7B%22tc%22%3A%22http%3A%2F%2Frs.tdwg.org%2Fontology%2Fvoc%2FTaxonConcept%23%22%2C%22wdt%22%3A%22http%3A%2F%2Fwww.wikidata.org%2Fprop%2Fdirect%2F%22%2C%22tc%3Arank%22%3A%7B%22%40type%22%3A%22%40id%22%2C%22owl%3AequivalentProperty%22%3A%22wdt%3AP105%22%2C%22sameAs%22%3A%22https%3A%2F%2Fwww.wikidata.org%2Fwiki%2FProperty%3AP105%22%7D%7D%5D%2C%22tc%3Arank%22%3A%22http%3A%2F%2Frs.tdwg.org%2Fontology%2Fvoc%2FTaxonRank%23Species%22%7D
> [8] http://www.macs.hw.ac.uk/%7Eajg33
Received on Wednesday, 4 July 2018 20:35:53 UTC
