Re: Protein representation with a Bioschemas context from Andra Waagmeester on 2017-11-15 (public-bioschemas@w3.org from November 2017)

From: Andra Waagmeester <andra@micelio.be>
Date: Wed, 15 Nov 2017 07:04:02 +0100
To: Leyla Garcia <ljgarcia@ebi.ac.uk>
Cc: Justin Clark-Casey <jc955@cam.ac.uk>, public-bioschemas@w3.org
Message-ID: <CAMNM0fX=LTcMwo_Dj1RA9q6GnGdQYyP3AXNY6b7HaXbc84ZqRQ@mail.gmail.com>
Hi Leyla,

I think I like your suggestion, which seems to allow reuse of external
IRIs. What I don't understand is why the IRIs will be agreed upon and then
fixed. Isn't this rather limiting the potential use cases?

Take for example the property "associatedDisease", which is now linked to
"http://semanticscience.org/resource/SIO_000983.rdf", which is labelled as
gene-disease association and not as protein-disease association. I
understand the rationale here, but pedantically speaking the association
here is quite implicit, since technically the disease association is with
an underlying gene and not the protein. The point is that if the
Bioschemas protein community agrees that this IRI is explicitly linked to
the minted "associatedDisease", we will not be able to use more expressive
properties if they exist or emerge.

Wouldn't the best option simply be to be strict on the type Protein, but
for the remaining properties use the complete ontological space out there,
without any limitations?

Andra




 "transcribedFrom". With this choice, we are only able to map
protein-coding genes. So if we want to map non-protein-coding genes, and
a Gene entity is introduced in Bioschemas, do we also introduce the
property "transcribes"?

Likewise with the associatedDisease. Why is this not associatedPhenotype?





On Tue, Nov 14, 2017 at 2:57 PM, Leyla Garcia <ljgarcia@ebi.ac.uk> wrote:

> Hi,
>
> Nice to get that many comments!
>
> So, it looks like we are talking about something like
> https://github.com/BioSchemas/specifications/blob/master/Protein/examples/ProteinEntity-with-context.json
> where the context containing Gene and so on will become the Bioschemas
> context, and the IRIs will be agreed and then fixed. That example includes
> a third-party property, which is always possible whenever schema.org or
> Bioschemas do not provide a better option.
>
> Regards
>
>
>
> On 14/11/2017 12:41, Justin Clark-Casey wrote:
>
>> I agree.  As Alasdair and Franck say, I feel that a major benefit of
>> schema.org is in providing agreed upon minimal terms that aid
>> findability.
>>
>> Pragmatically, data sources would always be free to use their own terms
>> and additionalTypes (I don't think that bioschemas can or should forbid
>> this), but they should be aware that there are agreed upon terms that will
>> make their data findable/usable by a distributed community, rather than
>> only by a few applications that are especially aware of their markup.
>>
>> I also agree with Stephen that relying on a central collator is too much
>> overhead.  To me, this introduces a single point of failure that conflicts
>> with the spirit of the web.
>>
>> -- Justin Clark-Casey
>>
>> On 14/11/17 12:02, Gray, Alasdair J G wrote:
>>
>>> Dear All,
>>>
>>> I think Franck’s email clearly explains the situation here.
>>>
>>> Schema.org <http://schema.org> is about everyone buying in to use a
>>> common set of terms to mark up their content. If they buy in to that then
>>> they get the benefit. Otherwise you are just on the linked data web.
>>>
>>> Bioschemas is about making Schema.org <http://schema.org> relevant for
>>> the life sciences. We have agreed as a community that we prefer to reuse an
>>> existing ontology term rather than mint our own. However, to me, it means
>>> that we do need to select a single ontology term. It is through this
>>> agreement that we will see benefit whilst also keeping the route to
>>> adoption straightforward.
>>>
>>> Alasdair
>>>
>>> On 14 Nov 2017, at 10:21, Franck Michel <franck.michel@cnrs.fr> wrote:
>>>>
>>>> Dear all,
>>>>
>>>> I'd like to bring a few elements into the discussion wrt. aliases.
>>>>
>>>> In JSON-LD, aliases are just a handy short-cut notation with a local
>>>> scope: an alias just applies within the scope of the context where it is
>>>> defined. And more importantly, an alias should not bear any meaning. The
>>>> first thing a consumer app does with JSON-LD is to expand all terms, which
>>>> immediately removes all aliases.
>>>>
>>>> Hence, if I use the Bioschemas.org <http://bioschemas.org/> default
>>>> context:
>>>> "@context": { "Gene": { "@id": "http://purl.obolibrary.org/obo/SO_0000704" }, ... }
>>>> I will typically write:  "@type": [ "BioChemEntity", "Gene" ]
>>>>
>>>> But I may well write a document with a custom alias:
>>>> "@context": { "GeneAlias": { "@id": "http://purl.obolibrary.org/obo/SO_0000704" }, ... }
>>>> and write:  "@type": [ "BioChemEntity", "GeneAlias" ]
>>>> With:
>>>> "@context": { "obo": { "@id": "http://purl.obolibrary.org/obo/" }, ... }
>>>> I would write:  "@type": [ "BioChemEntity", "obo:SO_0000704" ]
>>>> Or I could even not use any alias:
>>>> "@type": [ "BioChemEntity", "http://purl.obolibrary.org/obo/SO_0000704" ]
>>>>
>>>> These are all equivalent from the point of view of a data consumer.
>>>>
>>>> In my view, the default context should be a useful guide for those
>>>> annotating data with Bioschemas.org <http://bioschemas.org/> markup, but
>>>> alias names should not matter at all. What matters is the URIs to which
>>>> aliases resolve.
>>>>
>>>> I feel like the solution of agreed pre-defined URIs, whatever the
>>>> aliases used, is more sustainable. After all, schema.org
>>>> <http://schema.org/> advocates for the use of specific agreed-upon
>>>> terms. If one uses them, their pages are more likely to be discoverable.
>>>> They can choose to use other terms if this is convenient for them, but then
>>>> there is no guarantee that the pages will be discovered as easily.
>>>>
>>>> Franck.
>>>>
>>>
>>> Alasdair J G Gray
>>> Fellow of the Higher Education Academy
>>> Assistant Professor in Computer Science,
>>> School of Mathematical and Computer Sciences
>>> (Athena SWAN Bronze Award)
>>> Heriot-Watt University, Edinburgh UK.
>>>
>>> Email: A.J.G.Gray@hw.ac.uk <mailto:A.J.G.Gray@hw.ac.uk>
>>> Web: http://www.macs.hw.ac.uk/~ajg33
>>> ORCID: http://orcid.org/0000-0002-5711-4872
>>> Office: Earl Mountbatten Building 1.39
>>> Twitter: @gray_alasdair
>>>
>>>
>>
>
>
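
Franck's point in the quoted message, that alias names disappear once a
consumer expands the terms, can be illustrated with a small sketch. This is
not a JSON-LD processor: expand_term below is a toy helper written for this
illustration that handles only flat term definitions, prefix:suffix compact
IRIs, and absolute IRIs. The IRI used for BioChemEntity is a hypothetical
placeholder, since the thread does not give one.

```python
GENE_IRI = "http://purl.obolibrary.org/obo/SO_0000704"

# Hypothetical IRI for BioChemEntity, assumed only for this illustration.
BIOCHEM = {"BioChemEntity": {"@id": "http://bioschemas.org/BioChemEntity"}}


def _iri(definition):
    """Return the IRI from a term definition given as a dict or a plain string."""
    return definition["@id"] if isinstance(definition, dict) else definition


def expand_term(value, context):
    """Expand one @type value against a flat context (tiny subset of JSON-LD)."""
    if value in context:                      # plain alias, e.g. "Gene"
        return _iri(context[value])
    prefix, sep, suffix = value.partition(":")
    # Compact IRI such as "obo:SO_0000704"; "http://..." is skipped because
    # its suffix starts with "//".
    if sep and not suffix.startswith("//") and prefix in context:
        return _iri(context[prefix]) + suffix
    return value                              # absolute IRI, kept as-is


def expand_types(doc):
    """Return the sorted, fully expanded @type IRIs of a document."""
    context = doc.get("@context", {})
    return sorted(expand_term(t, context) for t in doc["@type"])


# The four variants from the quoted message.
docs = [
    {"@context": {**BIOCHEM, "Gene": {"@id": GENE_IRI}},
     "@type": ["BioChemEntity", "Gene"]},
    {"@context": {**BIOCHEM, "GeneAlias": {"@id": GENE_IRI}},
     "@type": ["BioChemEntity", "GeneAlias"]},
    {"@context": {**BIOCHEM, "obo": {"@id": "http://purl.obolibrary.org/obo/"}},
     "@type": ["BioChemEntity", "obo:SO_0000704"]},
    {"@context": BIOCHEM,
     "@type": ["BioChemEntity", GENE_IRI]},
]

expanded = [expand_types(d) for d in docs]
print(all(e == expanded[0] for e in expanded))  # prints True
```

All four documents expand to the same pair of IRIs, so a consumer working on
expanded terms cannot tell which alias the publisher used. A full JSON-LD
processor applies many more rules than this sketch does.
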
Received on Wednesday, 15 November 2017 06:05:27 UTC