
Re: Protein representation with and without BioChemEntity

From: Melanie Courtot <mcourtot@ebi.ac.uk>
Date: Thu, 2 Nov 2017 10:34:41 +0000
To: "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>
Cc: Justin Clark-Casey <justinccdev@gmail.com>, "public-bioschemas@w3.org" <public-bioschemas@w3.org>
Message-ID: <f3a9ec7f-33ed-bcf7-5df6-1d8c733525c9@ebi.ac.uk>
Hi Alasdair,

I'm not sure I understand; are you talking about validating profiles? 
For validation purposes, wouldn't it be equivalent to look for a string 
such as "isContainedIn" or a URI such as 
http://purl.obolibrary.org/obo/RO_0001018, where the latter has the 
advantage of not duplicating existing terms as well as offering 
dereferencing, hierarchy and flexibility?
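
To make the question concrete, here is a minimal sketch of such a check, 
assuming the markup has already been parsed into a Python dict. The 
function name has_property and the sample records are hypothetical and 
are not part of any Bioschemas tooling; only the RO URI comes from this 
thread.

```python
# Hypothetical sketch: checking parsed JSON-LD markup for a required property.
# The key can be a short name or a full URI; the lookup is the same either way.

RO_CONTAINED_IN = "http://purl.obolibrary.org/obo/RO_0001018"  # "contained in"


def has_property(markup: dict, key: str) -> bool:
    """Return True if the top-level markup object carries the given key."""
    return key in markup


# Illustrative records only; the values are placeholders, not real data.
by_name = {"@type": "BioChemEntity", "isContainedIn": "http://example.org/organism/1"}
by_uri = {"@type": "BioChemEntity", RO_CONTAINED_IN: "http://example.org/organism/1"}

print(has_property(by_name, "isContainedIn"))
print(has_property(by_uri, RO_CONTAINED_IN))
```

A real validator would probably expand the JSON-LD first, so that a short 
name mapped in the @context and the full URI compare as equal; this 
sketch skips that step.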

I'll admit to being a bit confused by the three examples, so I may be 
overlooking something.

Cheers,
Melanie




On 02/11/2017 10:18, Gray, Alasdair J G wrote:
> Hi Melanie,
>
> While we are free to pick terms from any ontology, it is important 
> that as a community we select the terms that we are going to use in 
> each specific profile. This means that the tooling that uses these 
> terms knows what it is looking for.
>
> Best regards
>
> Alasdair
>
>> On 2 Nov 2017, at 10:02, Melanie Courtot <mcourtot@ebi.ac.uk> wrote:
>>
>> Hi,
>>
>> I'm wondering if we could take it a step further, and instead of 
>> defining specific properties we could just reuse terms from RO (or 
>> elsewhere)?
>>
>> For example, 
>> "http://semanticscience.org/resource/is-transcribed-from" could be 
>> replaced by http://purl.obolibrary.org/obo/RO_0002510, "transcribed 
>> from", and "isContainedIn" could be 
>> http://purl.obolibrary.org/obo/RO_0001018, "contained in".
>>
>> Cheers,
>> Melanie
>>
>> -- 
>> Mélanie Courtot, PhD
>> GA4GH/BioSamples Project lead
>> European Bioinformatics Institute (EMBL-EBI)
>>
>>
>> On 01/11/2017 16:18, Justin Clark-Casey wrote:
>>> Direct term reuse sounds like a good choice to me, especially as
>>>
>>> a) it's the mechanism that schema.org themselves have for adding 
>>> existing ontology classes and terms to the structured data
>>> b) it will make applications much easier to write as they can use 
>>> existing general tooling
>>> c) it allows us to do everything we were doing with additionalProperty and
>>> d) it still allows us to define profiles without having to move 
>>> everything through schema.org
>>>
>>> -- Justin
>>>
>>> On Wed, Nov 1, 2017 at 3:56 PM, Gray, Alasdair J G 
>>> <A.J.G.Gray@hw.ac.uk> wrote:
>>>
>>>     Hi All,
>>>
>>>     Apologies for the delay in sending this email. I have been
>>>     working with Carole on submitting an Implementation Study
>>>     proposal to the Data Platform for more work on Bioschemas.
>>>
>>>     For representing a specific bioscience type, e.g. a protein, we
>>>     currently have a proposal for using a generic wrapper approach
>>>     that we then specialise, e.g. BioChemEntity specialised with a
>>>     Protein profile.
>>>
>>>     Protein profile
>>>     http://bioschemas.org/specifications/Protein/specification/
>>>     BioChemEntity type
>>>     http://bioschemas.org/specifications/BioChemEntity/specification/
>>>
>>>     To help understand the various advantages and disadvantages of
>>>     this approach, Kenneth and I have drawn up an example of marking
>>>     up a specific protein, first using the current proposal and
>>>     second as if we were to do the same with a specific ProteinEntity.
>>>     Below are the examples and some analysis of them.
>>>
>>>     *BioChemEntity Example*
>>>     Minimum markup using BioChemEntity
>>>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntity-min.jsonld
>>>
>>>     Minimum + Recommended markup using BioChemEntity
>>>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntity-min%2Brec.jsonld
>>>
>>>     One thing to note is that the minimum + recommended markup is
>>>     not an additive extension of the minimum markup. Due to the use
>>>     of the additionalProperty relationship, you need to use a JSON
>>>     array and add the properties from the recommended level within
>>>     the existing array.
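
The non-additive behaviour can be sketched as follows. This is only an 
illustration under assumptions: the property names, identifiers and 
values below are placeholders and are not taken from the linked example 
files; only the use of an additionalProperty array of PropertyValue 
objects reflects the approach described above.

```python
import json

# Hypothetical minimum-level markup with one additionalProperty entry.
# Names and values are placeholders, not copied from the linked examples.
minimum = {
    "@context": "http://schema.org",
    "@type": "BioChemEntity",
    "identifier": "example:P00001",
    "additionalProperty": [
        {"@type": "PropertyValue", "name": "isContainedIn", "value": "example:organism"},
    ],
}

# A recommended-level property, also expressed as a PropertyValue.
recommended_entry = {
    "@type": "PropertyValue",
    "name": "transcribedFrom",
    "value": "example:gene",
}

# Naive top-level merge: the new array replaces the old one,
# so the minimum-level entry is lost.
naive = {**minimum, "additionalProperty": [recommended_entry]}

# Working version: add the new entry inside the existing array.
extended = dict(minimum)
extended["additionalProperty"] = minimum["additionalProperty"] + [recommended_entry]

print(len(naive["additionalProperty"]))
print(len(extended["additionalProperty"]))
print(json.dumps(extended, indent=2))
```

In other words, a publisher cannot simply append new top-level keys to 
move from minimum to recommended; the array itself has to be edited.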
>>>
>>>     An advantage of this approach is that it reuses terms from
>>>     existing ontologies and we can represent types that do not
>>>     currently exist in Schema.org, e.g. Genes, Chemicals, etc.
>>>
>>>     *ProteinEntity example*
>>>     Minimum markup using ProteinEntity
>>>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/ProteinEntity-min.jsonld
>>>
>>>     Minimum + Recommended markup using ProteinEntity
>>>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/ProteinEntity-min%2Brec.jsonld
>>>
>>>     While the markup in these examples using ProteinEntity is easier
>>>     to interpret, the number of items that need to be changed to
>>>     mark up another protein is the same as in the BioChemEntity
>>>     approach. The simplified markup should enable easier adoption,
>>>     although we could help the current proposal of using
>>>     BioChemEntity by using highlighting on the Bioschemas site to
>>>     show which terms need to be changed.
>>>
>>>     A major downside of this approach is that we would need to add
>>>     all the types to Schema.org or host them at Bioschemas.org.
>>>     While these could be mapped to existing terms, we would be
>>>     accused of duplicating existing ontology terms.
>>>
>>>     *Direct term reuse example*
>>>     Last week, I showed the above examples to Dan (we were at ISWC
>>>     together). He pointed out that the additionalProperty relation
>>>     was added to allow the use of property/value pairs where the
>>>     properties do not exist in an ontology. We are in the situation
>>>     where the properties we are using come from ontologies. Dan
>>>     suggested that we just use them directly. Note that the example
>>>     also exploits the fact that you can define multiple types.
>>>
>>>     Minimum markup using BioChemEntity and term reuse
>>>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntityAlt-min.jsonld
>>>
>>>     Minimum + Recommended markup using BioChemEntity and term reuse
>>>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntityAlt-min%2Brec.jsonld
>>>
>>>     As you will see, this seems to have the advantages of both the
>>>     above approaches. The markup is more straightforward than the
>>>     additionalProperty approach, but still reuses existing
>>>     domain ontologies. The tooling and exploitation will be much
>>>     more straightforward.
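
A rough sketch of what direct term reuse with multiple types might look 
like, built as a Python dict and serialised to JSON-LD. The Protein type 
URI, the identifier and the linked resources are hypothetical 
placeholders; the two RO property URIs are the ones mentioned elsewhere 
in this thread, and their use here is only an assumption about how the 
linked examples could be written.

```python
import json

# RO property URIs mentioned in this thread.
RO_CONTAINED_IN = "http://purl.obolibrary.org/obo/RO_0001018"      # "contained in"
RO_TRANSCRIBED_FROM = "http://purl.obolibrary.org/obo/RO_0002510"  # "transcribed from"

# Hypothetical minimum-level markup: two types, ontology terms used directly as keys.
minimum = {
    "@context": "http://schema.org",
    "@type": ["BioChemEntity", "http://example.org/ontology/Protein"],
    "identifier": "example:P00001",
    RO_CONTAINED_IN: {"@id": "http://example.org/organism/1"},
}

# Hypothetical recommended-level property, again keyed by its ontology URI.
recommended = {RO_TRANSCRIBED_FROM: {"@id": "http://example.org/gene/1"}}

# With direct term reuse the recommended level is a plain additive merge.
extended = {**minimum, **recommended}

print(json.dumps(extended, indent=2))
```

Because each property is its own top-level key, moving from minimum to 
recommended only adds keys and leaves the existing ones untouched.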
>>>
>>>     I invite you all to review and comment on these different
>>>     examples. Do we believe that the BioChemEntity with term reuse
>>>     (the third set of examples) is an appropriate path going forward?
>>>
>>>     Best regards
>>>
>>>     Alasdair
>>>
>>>     PS Sorry for the long email
>>>
>>>     Alasdair J G Gray
>>>
>>>     Fellow of the Higher Education Academy
>>>     Assistant Professor in Computer Science,
>>>     School of Mathematical and Computer Sciences
>>>     (Athena SWAN Bronze Award)
>>>     Heriot-Watt University, Edinburgh UK.
>>>
>>>     Email: A.J.G.Gray@hw.ac.uk
>>>     Web: http://www.macs.hw.ac.uk/~ajg33
>>>     ORCID: http://orcid.org/0000-0002-5711-4872
>>>     Office: Earl Mountbatten Building 1.39
>>>     Twitter: @gray_alasdair
>>>
>>>
>>
>
Received on Thursday, 2 November 2017 10:35:08 UTC
