
Re: Protein representation with and without BioChemEntity

From: Melanie Courtot <mcourtot@ebi.ac.uk>
Date: Thu, 2 Nov 2017 10:02:50 +0000
To: Justin Clark-Casey <justinccdev@gmail.com>, "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>
Cc: "public-bioschemas@w3.org" <public-bioschemas@w3.org>
Message-ID: <28f29345-ebff-d5bd-106c-aba6eb7d198c@ebi.ac.uk>

I'm wondering if we could take it a step further: instead of
defining specific properties, could we just reuse terms from RO (or
another existing ontology)?

For example, "http://semanticscience.org/resource/is-transcribed-from" 
could be replaced by http://purl.obolibrary.org/obo/RO_0002510, 
"transcribed from", and "isContainedIn" could be 
http://purl.obolibrary.org/obo/RO_0001018, "contained in".
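
A minimal sketch of that substitution in Python, assuming the markup is
held as a plain dictionary. The two property mappings are the ones named
above; the example node, its example.org identifiers and the helper name
reuse_ro_terms are hypothetical and only illustrate the idea.

```python
# Property keys named in this message, mapped to their RO equivalents.
RO_EQUIVALENTS = {
    "http://semanticscience.org/resource/is-transcribed-from":
        "http://purl.obolibrary.org/obo/RO_0002510",  # "transcribed from"
    "isContainedIn":
        "http://purl.obolibrary.org/obo/RO_0001018",  # "contained in"
}


def reuse_ro_terms(node):
    """Return a copy of a JSON-LD node with known property keys swapped for RO IRIs.

    Keys without a known RO equivalent are left unchanged.
    """
    return {RO_EQUIVALENTS.get(key, key): value for key, value in node.items()}


# Hypothetical markup; the example.org identifiers are placeholders.
protein = {
    "@type": "BioChemEntity",
    "http://semanticscience.org/resource/is-transcribed-from": {
        "@id": "http://example.org/gene/1"
    },
    "isContainedIn": {"@id": "http://example.org/organism/1"},
}

converted = reuse_ro_terms(protein)
```

Only the property keys change; the values and the @type are carried over
untouched.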


Mélanie Courtot, PhD
GA4GH/BioSamples Project lead
European Bioinformatics Institute (EMBL-EBI)

On 01/11/2017 16:18, Justin Clark-Casey wrote:
> Direct term reuse sounds like a good choice to me, especially as
> a) it's the mechanism that schema.org themselves have for adding
> existing ontology classes and terms to the structured data
> b) it will make applications much easier to write, as they can use
> existing general tooling
> c) it allows us to do everything we were doing with additionalProperty and
> d) it still allows us to define profiles without having to move
> everything through schema.org
> -- Justin
> On Wed, Nov 1, 2017 at 3:56 PM, Gray, Alasdair J G
> <A.J.G.Gray@hw.ac.uk> wrote:
>     Hi All,
>     Apologies for the delay in sending this email. I have been working
>     with Carole on submitting an Implementation Study proposal to the
>     Data Platform for more work on Bioschemas.
>     For representing a specific bioscience type, e.g. a protein, we
>     currently have a proposal for using a generic wrapper approach
>     that we then specialise, e.g. BioChemEntity specialised with a
>     Protein profile.
>     Protein profile
>     http://bioschemas.org/specifications/Protein/specification/
>     BioChemEntity type
>     http://bioschemas.org/specifications/BioChemEntity/specification/
>     To help understand the various advantages and disadvantages of
>     this approach, Kenneth and I have drawn up an example of marking
>     up a specific protein, first using the current proposal and
>     second using a specific ProteinEntity type. Below are the
>     examples and some analysis of them.
>     *BioChemEntity Example*
>     Minimum markup using BioChemEntity
>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntity-min.jsonld
>     Minimum + Recommended markup using BioChemEntity
>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntity-min%2Brec.jsonld
>     One thing to note is that the minimum + recommended markup is not
>     an additive extension of the minimum markup. Due to the use of the
>     additionalProperty relationship, you need to use a JSON array and
>     add the properties from the recommended level within the existing
>     array.
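
A minimal Python sketch of why the extension is not additive, assuming
the markup is held as a dictionary. The property names and values are
placeholders and are not taken from the linked .jsonld files; the helper
add_recommended is hypothetical.

```python
import copy

# Hypothetical minimum markup with one PropertyValue in the array.
minimum = {
    "@context": "http://schema.org",
    "@type": "BioChemEntity",
    "identifier": "example-protein-id",
    "additionalProperty": [
        {"@type": "PropertyValue", "name": "example-minimum-property", "value": "a"},
    ],
}

# Hypothetical recommended-level property to add.
recommended_extra = [
    {"@type": "PropertyValue", "name": "example-recommended-property", "value": "b"},
]


def add_recommended(markup, extra_properties):
    """Return a copy of markup with extra PropertyValue items appended to the array."""
    merged = copy.deepcopy(markup)
    merged.setdefault("additionalProperty", []).extend(copy.deepcopy(extra_properties))
    return merged


# A plain key-level merge replaces the whole array and drops the minimum property.
naive = {**minimum, "additionalProperty": recommended_extra}

# Appending inside the existing array keeps both levels.
full = add_recommended(minimum, recommended_extra)
```

The key-level merge leaves one item in the array, while appending inside
the existing array leaves two, which is the editing step the paragraph
above describes.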
>     An advantage of this approach is that it reuses terms from
>     existing ontologies and we can represent types that do not
>     currently exist in Schema.org, e.g. Genes, Chemicals, etc.
>     *ProteinEntity example*
>     Minimum markup using ProteinEntity
>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/ProteinEntity-min.jsonld
>     Minimum + Recommended markup using ProteinEntity
>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/ProteinEntity-min%2Brec.jsonld
>     While the markup in these examples using ProteinEntity is easier
>     to interpret, the number of items that need to be changed to
>     mark up another protein is the same as in the BioChemEntity
>     approach. The simplified markup should enable easier adoption,
>     although we could also help adoption of the current BioChemEntity
>     proposal by highlighting on the Bioschemas site which terms need
>     to be changed.
>     A major downside of this approach is that we would need to add all
>     the types to Schema.org or host them at Bioschemas.org. While
>     these could be mapped to existing terms, we would be accused of
>     duplicating existing ontology terms.
>     *Direct term reuse example*
>     Last week, I showed the above examples to Dan (we were at ISWC
>     together). He pointed out that the additionalProperty relation was
>     added to allow the use of property/value pairs where the
>     properties do not exist in an ontology. We are in the situation
>     where the properties we are using come from ontologies. Dan
>     suggested that we just use them directly. Note that the example
>     also exploits the fact that you can define multiple types.
>     Minimum markup using BioChemEntity and term reuse
>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntityAlt-min.jsonld
>     Minimum + Recommended markup using BioChemEntity and term reuse
>     https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntityAlt-min%2Brec.jsonld
>     As you will see, this seems to have the advantages of both the
>     above approaches. The markup is more straightforward than the
>     additionalProperty approach, but it still reuses existing domain
>     ontologies. The tooling and exploitation will be much more
>     straightforward.
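
A minimal Python sketch contrasting the two shapes, assuming each
wrapped property carries its ontology IRI in propertyID. The example
node, the example.org IRIs (including the second type) and the helper
to_direct_properties are hypothetical and are not taken from the linked
BioChemEntityAlt examples.

```python
def to_direct_properties(node):
    """Return a copy of node with additionalProperty entries promoted to direct properties.

    Each PropertyValue is assumed to carry the ontology IRI in "propertyID"
    and its value in "value"; that IRI becomes the property key.
    """
    direct = {key: value for key, value in node.items() if key != "additionalProperty"}
    for item in node.get("additionalProperty", []):
        direct[item["propertyID"]] = item["value"]
    return direct


# Hypothetical wrapped markup; multiple types are given as a list.
wrapped = {
    "@context": "http://schema.org",
    "@type": ["BioChemEntity", "http://example.org/ontology/Protein"],
    "additionalProperty": [
        {
            "@type": "PropertyValue",
            "propertyID": "http://semanticscience.org/resource/is-transcribed-from",
            "value": {"@id": "http://example.org/gene/1"},
        },
    ],
}

direct = to_direct_properties(wrapped)
```

With direct properties, adding a recommended-level property is a plain
key addition, so generic JSON-LD tooling can read the ontology terms
without unpacking PropertyValue wrappers.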
>     I invite you all to review and comment on these different
>     examples. Do we believe that the BioChemEntity with term reuse
>     (the third set of examples) is an appropriate path going forward?
>     Best regards
>     Alasdair
>     PS Sorry for the long email
>     Alasdair J G Gray
>     Fellow of the Higher Education Academy
>     Assistant Professor in Computer Science,
>     School of Mathematical and Computer Sciences
>     (Athena SWAN Bronze Award)
>     Heriot-Watt University, Edinburgh UK.
>     Email: A.J.G.Gray@hw.ac.uk
>     Web: http://www.macs.hw.ac.uk/~ajg33
>     ORCID: http://orcid.org/0000-0002-5711-4872
>     Office: Earl Mountbatten Building 1.39
>     Twitter: @gray_alasdair
Received on Thursday, 2 November 2017 10:04:03 UTC
