- From: Leyla Garcia <ljgarcia@ebi.ac.uk>
- Date: Wed, 8 Nov 2017 16:14:18 +0000
- To: "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>, "public-bioschemas@w3.org" <public-bioschemas@w3.org>, Dan Brickley <danbri@danbri.org>, Anders Riutta <anders.riutta@gladstone.ucsf.edu>
- Message-ID: <930210da-ab03-2e38-c9fd-2ddf07467bd3@ebi.ac.uk>
Hi all,

On a different thread, "Protein representation with and without BioChemEntity", Alasdair presented three options, based on the Protein model and examples, for dealing with our additional types and properties.

I want to add a fourth option, which mixes the second and third options. In this option, we define a Bioschemas context for those minimum and recommended properties that correspond to specializations of what BioChemEntity offers, but we still allow direct term reuse. Adding yet another context is always possible (just keep in mind what Anders Riutta has already mentioned about multiple contexts).

Terms coming from schema.org and Bioschemas will be parsed and validated by Bioschemas tools. Any additional term coming from third-party direct term reuse or from a third-party context will not. The additional type can be whichever one the data provider prefers, and additional properties should be avoided whenever third-party direct reuse or a third-party context is possible.

An example for proteins including minimum (identifier), recommended (from additionalType to bioschemas:transcribedFrom) and optional (additional property "protein is-related-to clan" using SIO:000001) properties can be found at
https://github.com/BioSchemas/specifications/blob/master/Protein/examples/ProteinEntity-with-bioschemas-context.json

Some parts taken from Alasdair's and Anders' emails:

* Options proposed by Alasdair (mail from 01.Nov.2017)

*BioChemEntity Example*
Minimum markup using BioChemEntity
https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntity-min.jsonld

*ProteinEntity example*
Minimum markup using ProteinEntity
https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/ProteinEntity-min.jsonld

*Direct term reuse example*
Last week, I showed the above examples to Dan (we were at ISWC together). He pointed out that the additionalProperty relation was added to allow the use of property/value pairs where the properties do not exist in an ontology. We are in the situation where the properties we are using come from ontologies. Dan suggested that we just use them directly. Note that the example also exploits the fact that you can define multiple types.

Minimum markup using BioChemEntity and term reuse
https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntityAlt-min.jsonld

* Comments about multiple contexts by Anders Riutta (mail from 07.Nov.2017)

It could look like this for annotating a pre-existing JSON API <https://www.oclc.org/developer/news/2016/retrofitting-an-existing-api-with-linked-data.en.html>. The Bioschemas context would specify one alias, e.g.:

    { ..., "transcribedFrom": "http://semanticscience.org/resource/is-transcribed-from", ... }

But if the pre-existing API uses the term "transcfrom", then it would need to use a combined context like this:

    [
      { ..., "transcfrom": "http://semanticscience.org/resource/is-transcribed-from", ... },
      "http://bioschemas.org/context.jsonld"
    ]

One point of caution: the JSON-LD spec says <https://json-ld.org/spec/latest/json-ld/#advanced-context-usage>, "Multiple contexts may be combined using an array, which is processed in order," but that doesn't necessarily mean the Bioschemas term would take precedence for the combined context above. For example, if the output from the API were to be expanded and re-compacted with that combined context, the result could use the term "transcfrom" instead of "transcribedFrom", as specified by the term selection algorithm <https://www.w3.org/TR/json-ld-api/#term-selection> and discussed in this comment <https://github.com/digitalbazaar/jsonld.js/issues/75#issuecomment-61841449>.
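To make Anders' caution concrete, here is a minimal Python sketch of why "transcfrom" can win after an expand/re-compact round trip. It is a deliberate simplification, not a JSON-LD processor: it only models plain term-to-IRI mappings and ignores containers, type and language coercion, @vocab and compact IRIs. The contents of `BIOSCHEMAS_CONTEXT` are a hypothetical stand-in for whatever http://bioschemas.org/context.jsonld would define, and the example.org value is made up. The one rule it borrows from the JSON-LD API is that, when building the inverse context, terms are visited shortest first (ties broken lexicographically) and the first term seen for an IRI is kept.

```python
"""Simplified illustration of combining contexts and selecting a term."""

SIO_TRANSCRIBED_FROM = "http://semanticscience.org/resource/is-transcribed-from"

# Hypothetical stand-in for the remote Bioschemas context (assumption).
BIOSCHEMAS_CONTEXT = {"transcribedFrom": SIO_TRANSCRIBED_FROM}
# Local context for a pre-existing API that already uses "transcfrom".
API_CONTEXT = {"transcfrom": SIO_TRANSCRIBED_FROM}


def merge_contexts(contexts):
    """Process a context array in order; later definitions of the same term win."""
    active = {}
    for context in contexts:
        active.update(context)
    return active


def build_inverse(active):
    """Map each IRI to one term: shortest term first, then lexicographically least."""
    inverse = {}
    for term in sorted(active, key=lambda t: (len(t), t)):
        inverse.setdefault(active[term], term)
    return inverse


def expand_keys(document, active):
    """Replace known terms with their full IRIs."""
    return {active.get(key, key): value for key, value in document.items()}


def compact_keys(expanded, active):
    """Replace IRIs with the term chosen by the inverse context."""
    inverse = build_inverse(active)
    return {inverse.get(iri, iri): value for iri, value in expanded.items()}


merged = merge_contexts([API_CONTEXT, BIOSCHEMAS_CONTEXT])
document = {"transcribedFrom": "http://example.org/gene/1"}  # made-up value
expanded = expand_keys(document, merged)
recompacted = compact_keys(expanded, merged)
print(recompacted)
```

In this sketch both terms survive the merge because they are different keys, so array order only matters when the same term is defined twice. The re-compacted key is "transcfrom" simply because it is the shorter of the two aliases for the same IRI, and swapping the order of the two contexts in the array does not change that.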
Regards,

On 01/11/2017 15:56, Gray, Alasdair J G wrote:
> Hi All,
>
> Apologies for the delay in sending this email. I have been working
> with Carole on submitting an Implementation Study proposal to the Data
> Platform for more work on Bioschemas.
>
> For representing a specific bioscience type, e.g. a protein, we
> currently have a proposal for using a generic wrapper approach that we
> then specialise, e.g. BioChemEntity specialised with a Protein profile.
>
> Protein profile
> http://bioschemas.org/specifications/Protein/specification/
> BioChemEntity type
> http://bioschemas.org/specifications/BioChemEntity/specification/
>
> To help understand the various advantages and disadvantages of this
> approach, Kenneth and I have drawn up an example of marking up a
> specific protein first using the current proposal and second if we
> were to do the same with a specific ProteinEntity. Below are the
> examples and some analysis of them.
>
> *BioChemEntity Example*
> Minimum markup using BioChemEntity
> https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntity-min.jsonld
>
> Minimum + Recommended markup using BioChemEntity
> https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntity-min%2Brec.jsonld
>
> One thing to note is that the minimum + recommended markup is not an
> additive extension of the minimum markup. Due to the use of the
> additionalProperty relationship, you need to use a JSON array and add
> the properties from the recommended level within the existing array.
>
> An advantage of this approach is that it reuses terms from existing
> ontologies and we can represent types that do not currently exist in
> Schema.org, e.g. Genes, Chemicals, etc.
>
> *ProteinEntity example*
> Minimum markup using ProteinEntity
> https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/ProteinEntity-min.jsonld
>
> Minimum + Recommended markup using ProteinEntity
> https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/ProteinEntity-min%2Brec.jsonld
>
> While the markup in these examples using ProteinEntity is easier to
> interpret, the number of items that need to be changed to mark up
> another protein is the same as in the BioChemEntity approach. The
> simplified markup should enable easier adoption, although we could
> help the current proposal of using BioChemEntity by using highlighting
> on the Bioschemas site to show which terms need to be changed.
>
> A major downside of this approach is that we would need to add all the
> types to Schema.org or host them at Bioschemas.org. While these could
> be mapped to existing terms, we would be accused of duplicating
> existing ontology terms.
>
> *Direct term reuse example*
> Last week, I showed the above examples to Dan (we were at ISWC
> together). He pointed out that the additionalProperty relation was
> added to allow the use of property/value pairs where the properties do
> not exist in an ontology. We are in the situation where the properties
> we are using come from ontologies. Dan suggested that we just use them
> directly. Note that the example also exploits the fact that you can
> define multiple types.
>
> Minimum markup using BioChemEntity and term reuse
> https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntityAlt-min.jsonld
>
> Minimum + Recommended markup using BioChemEntity and term reuse
> https://github.com/BioSchemas/specifications/blob/master/PhysicalEntity/examples/BioChemEntityAlt-min%2Brec.jsonld
>
> As you will see, this seems to have the advantages of both the above
> approaches. The markup is more straightforward than the
> additionalProperty approach, but exploits reusing existing domain
> ontologies. The tooling and exploitation will be much more
> straightforward.
>
> I invite you all to review and comment on these different examples. Do
> we believe that the BioChemEntity with term reuse (the third set of
> examples) is an appropriate path going forward?
>
> Best regards
>
> Alasdair
>
> PS Sorry for the long email
>
> Alasdair J G Gray
>
> Fellow of the Higher Education Academy
> Assistant Professor in Computer Science,
> School of Mathematical and Computer Sciences
> (Athena SWAN Bronze Award)
> Heriot-Watt University, Edinburgh UK.
>
> Email: A.J.G.Gray@hw.ac.uk
> Web: http://www.macs.hw.ac.uk/~ajg33
> ORCID: http://orcid.org/0000-0002-5711-4872
> Office: Earl Mountbatten Building 1.39
> Twitter: @gray_alasdair
Received on Wednesday, 8 November 2017 16:14:48 UTC