
Re: BioSamples type for review

From: Matt Styles <Matt.Styles@nottingham.ac.uk>
Date: Mon, 20 May 2019 19:48:48 +0000
To: Chris Mungall <cjmungall@lbl.gov>
CC: "Gray, Alasdair J G" <a.j.g.gray@hw.ac.uk>, "public-bioschemas@w3.org" <public-bioschemas@w3.org>
Message-ID: <AM6PR06MB54739CC83808147240007A80AB060@AM6PR06MB5473.eurprd06.prod.outlook.com>
Biosamples other than human samples were considered, e.g. plants.

Is there an example you could provide of an 'environmental sample' where BioSample would be completely incompatible?

Not to say that we can't create sub types, but if we took trees or fossils as examples, the majority of BioSample properties apply, so we need to understand what other kinds of concrete biological samples exist where BioSample would be wholly incompatible.

It is worth remembering that Types are different to Profiles.

It is not uncommon for Types to have a large number of properties, so if there were other properties which would be applicable to some other (currently undefined) BioSample, they could be added. As with everything in schema.org, all properties of all Types are optional. If we consider the Location Type for example, all of the geo properties may be completely irrelevant to the body organ from which an animal tissue sample was taken.

Profiles can be used by relevant communities to provide recommendations on which properties apply.

This is already the case in Bioschemas, where a number of profiles have been defined. So a human tissue community may define a Profile on top of the BioSample Type, which specifies which properties are minimum, recommended, or optional.
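As a rough illustration of the Type/Profile distinction described above, the sketch below treats a Profile as a mapping from property names to a marginality level and reports which profile properties a piece of markup lacks. The profile contents, the sample values, and the helper check_against_profile are hypothetical and chosen only for illustration; they are not taken from any published Bioschemas profile.

```python
# Hypothetical profile over the proposed BioSample type. At the Type level
# every property is optional; the Profile adds community recommendations.
HUMAN_TISSUE_PROFILE = {
    "name": "minimum",
    "taxonomicRange": "minimum",
    "associatedDisease": "recommended",
    "material": "optional",
}


def check_against_profile(markup, profile):
    """Return the profile properties missing from markup, grouped by level."""
    missing = {"minimum": [], "recommended": [], "optional": []}
    for prop, level in profile.items():
        if prop not in markup:
            missing[level].append(prop)
    return missing


# Invented example markup; BioSample is the type under discussion here.
sample = {
    "@context": "https://schema.org",
    "@type": "BioSample",
    "name": "Example liver biopsy",
    "taxonomicRange": "http://purl.obolibrary.org/obo/NCBITaxon_9606",
}

report = check_against_profile(sample, HUMAN_TISSUE_PROFILE)
print(report)
```

The same markup could be checked against a different community's profile without changing the Type itself, which is the point of keeping the two separate.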

The difficulty with sub typing is that it creates a lot of duplication, and we would also need proposals for these new types. In the case of plant, human, and animal samples, all of the proposed BioSample properties apply, so we would create three identical sub types.

I think if we are going to consider this level of sub typing then we need concrete proposals for their properties and for each one we should have at least one live example somewhere on the web.

Happy to hear thoughts.

Matt


From: Chris Mungall <cjmungall@lbl.gov>
Sent: Monday, May 20, 2019 6:55:33 PM
To: Matt Styles
Cc: Gray, Alasdair J G; public-bioschemas@w3.org
Subject: Re: BioSamples type for review

The general issue is that the existing schema is just a poor match for environmental samples. There is no "environment" property; perhaps "material" is to be used for this? Some properties are inapplicable or confusing in the context of an environmental biosample, e.g. how would "age" be interpreted for a soil sample?
http://sdo-bioschemas-227516.appspot.com/BioSample

I think the use cases driving the current design were clearly all from a tissue sample perspective (here interpreting tissue as any piece of an organism), so we avoid problems by not claiming the broad name BioSample for a more specific use case, e.g. by renaming it TissueSample. This leaves open the possibility of an EnvironmentalBioSample at some future date with adequate representation from the necessary communities [I'm sure there are a few on this list but many may not be checking email as they are at GSC this week], and also the possibility of creating a broader BioSample grouping class.

On Mon, May 20, 2019 at 9:55 AM Matt Styles <Matt.Styles@nottingham.ac.uk> wrote:
Sorry, examples of properties you were referring to which would be problematic?

Matt Styles
Senior Research Developer

Suite 221 46 Eversholt Street,
Euston,
London,
NW1 1DA

+44 (0) 115 74 85125 | nottingham.ac.uk



From: Chris Mungall <cjmungall@lbl.gov>
Sent: 20 May 2019 17:36
To: Matt Styles <uczms@exmail.nottingham.ac.uk>
Cc: Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk>; public-bioschemas@w3.org
Subject: Re: BioSamples type for review



On Mon, May 20, 2019 at 8:57 AM Matt Styles <Matt.Styles@nottingham.ac.uk> wrote:
Do you have some examples?

https://gold.jgi.doe.gov/biosamples?Biosample.Ecosystem=Environmental&Biosample.Specimen=biome&Biosample.Is+Public=Yes
https://www.ebi.ac.uk/metagenomics/search#samples


It was a face-to-face meeting.

Matt Styles

From: Chris Mungall <cjmungall@lbl.gov>
Sent: 20 May 2019 16:49
To: Matt Styles <uczms@exmail.nottingham.ac.uk>
Cc: Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk>; public-bioschemas@w3.org
Subject: Re: BioSamples type for review

Hi Matt,

Did you discuss environmental biosamples? I agree plant and animal biosamples would be similar and I would not propose making separate subclasses here. But environmental biosamples may have vastly different properties.

When you say the general consensus, was this a discussion on github or a telecon? How does one get involved in guiding the general consensus?

On Mon, May 20, 2019 at 7:13 AM Matt Styles <Matt.Styles@nottingham.ac.uk> wrote:
Yes, thinking about this structure.

The general consensus of us discussing the BioSample type was that it would be a child of BioChemEntity.

My view, though I am open to other thoughts, is that over time there may be a need for a general Sample type. Presumably this wouldn’t be difficult to add retrospectively, because it would only add properties to, rather than modify existing properties of, BioSample (GeoSample, etc.). This is the ‘open-closed principle’ of software development.

We discussed the difference between e.g. PlantSample vs HumanSample (for example), but pretty much all the properties we came up with applied equally to both, hence keeping it simple (KISS!) with BioSample.

Hope this gives some context to how the proposals evolved.

Thanks,

Matt


From: Chris Mungall <cjmungall@lbl.gov>
Sent: 17 May 2019 23:55
To: Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk>
Cc: Matt Styles <uczms@exmail.nottingham.ac.uk>; public-bioschemas@w3.org
Subject: Re: BioSamples type for review

Comments below.

On Wed, May 15, 2019 at 2:55 AM Gray, Alasdair J G <A.J.G.Gray@hw.ac.uk> wrote:
Hi

I think it is clear that we need to define some properties for BioSample rather than continue to rely on an approach that would permit anything. As Chris highlighted, we are on the Web so anything goes, but let us try to provide a vocabulary of terms within schema.org that enables resources to become findable on the web.

On 13 May 2019, at 16:26, Chris Mungall <cjmungall@lbl.gov> wrote:


If there is another type of sample which is not covered by BioSample then I think it would be worth considering, providing we have some examples that we could mark up today.

This goes back to my question about scope. If the scope is the same as ebi/ncbi biosamples and includes environmental samples then there is a lot missing.

If the scope is tissue samples from organisms then I recommend relabeling to make this clearer, but even here there are clear gaps, e.g. no way to indicate the tissue of origin, e.g. with an UBERON ID.

To evaluate the list of properties I recommend looking at the relevant set of MIxS templates that are in scope (whether this is just biomedical or includes environmental)

The scope of the type is really up for discussion, but we need to decide on this soon. We would need to see a concrete example of what a GeoSample would be. Would it make sense to propose this as a sibling type to BioSample and have both inherit from a more generic Sample type, i.e.
Thing
- Sample
  - BioSample
  - GeoSample

This would also eliminate the inheritance of properties from the BioChemEntity type, although some of those were appropriate, e.g. associatedDisease.

I'm not sure of the philosophy of polymorphism in schema.org other than 'keep it simple', but I think this approach would work best. Schema.org does allow multiple inheritance, so you could in theory have BioSample inherit from both Sample and something like BioChemEntity, but AFAICT this doesn't seem that common, and there seems to be a lack of trait/mixin classes. Maybe some repetition of properties is fine.
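To make the two options concrete, here is a minimal sketch that models a type graph and computes the properties a type would inherit, first with BioSample under Sample only, then with BioSample under both Sample and BioChemEntity. The Sample and GeoSample types do not exist yet, and the placement of properties on types (including "age" and "depth") is an assumption made purely for illustration, not a description of the actual proposal.

```python
# Hypothetical type graph: each type lists its parents and the properties it
# declares itself. Which property sits on which type is assumed, not sourced.
TYPES = {
    "Thing": {"parents": [], "properties": ["name", "description"]},
    "Sample": {"parents": ["Thing"], "properties": ["locationCreated"]},
    "BioChemEntity": {
        "parents": ["Thing"],
        "properties": ["associatedDisease", "taxonomicRange"],
    },
    "BioSample": {"parents": ["Sample"], "properties": ["age"]},
    "GeoSample": {"parents": ["Sample"], "properties": ["depth"]},
}


def all_properties(type_name, types):
    """Collect properties declared by a type or inherited from any ancestor."""
    seen = []
    for parent in types[type_name]["parents"]:
        for prop in all_properties(parent, types):
            if prop not in seen:
                seen.append(prop)
    for prop in types[type_name]["properties"]:
        if prop not in seen:
            seen.append(prop)
    return seen


# Option 1: BioSample is a child of Sample only.
single = all_properties("BioSample", TYPES)

# Option 2: multiple inheritance, BioSample under Sample and BioChemEntity.
multi_types = dict(TYPES)
multi_types["BioSample"] = {
    "parents": ["Sample", "BioChemEntity"],
    "properties": ["age"],
}
multi = all_properties("BioSample", multi_types)

print(single)
print(multi)
```

Under the first option associatedDisease would have to be redeclared on BioSample if it is still wanted, which is the "repetition of properties" trade-off mentioned above.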

How deep should the inheritance hierarchy go? I think subdividing biosample into TissueSample and EnvironmentalBioSample would make sense as these will have specific properties (although some overlap, in the case of host-associated environmental samples).

You could go even further and subdivide environmental sample into the different MIxS profiles (e.g. SoilSample would have a soil electroconductivity property and a depth property). This would have a lot of advantages but seems to be not quite in the spirit of schema.org.




Note that there is a notion of sample in the existing Biomedical extension of schema.org. There are some specific types under MedicalTest that mention using a sample:
https://schema.org/BloodTest
https://schema.org/PathologyTest which also has a property of tissueSample

hmm, seems a bit ad-hoc

We should also be aware that there is a property called sampleType, but this is defined in the context of a computer programme code sample with a more specific codeSampleType property as well.

also statistical samples. Maybe MaterialSample will help clarify this, at the risk of sounding too ontological




On 13 May 2019, at 15:51, Chris Mungall <cjmungall@lbl.gov> wrote:

Is location the location of the sample source or where the sample is stored? Important to have clear semantics for this for environmental samples.

I think we want to use itemLocation and locationCreated to make this distinction clear. These are both existing terms in schema.org.
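A minimal sketch of how that distinction might look in markup, assuming locationCreated is read as the place the sample was collected and itemLocation as the place it is currently held. BioSample is the type under discussion rather than an established schema.org type, and every name and place below is invented.

```python
import json

# Illustrative markup only: the type is the proposal under review and the
# values are made up for this example.
sample = {
    "@context": "https://schema.org",
    "@type": "BioSample",
    "name": "Example soil core",
    # Assumed reading: where the sample was collected (its source).
    "locationCreated": {"@type": "Place", "name": "Example field site"},
    # Assumed reading: where the sample is currently stored.
    "itemLocation": {"@type": "Place", "name": "Example biobank freezer"},
}

markup = json.dumps(sample, indent=2)
print(markup)
```

Whether these two properties can be applied to a sample type depends on their declared domains in schema.org, so the proposal may need to extend those domains.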

On 13 May 2019, at 15:51, Chris Mungall <cjmungall@lbl.gov> wrote:

The material field seems a bit odd: "A material that something is made from, e.g. leather, wool, cotton, paper."

What should we use instead?

On 13 May 2019, at 15:51, Chris Mungall <cjmungall@lbl.gov> wrote:

I don't understand how these fields are intended to be used: bioChemInteraction, bioChemSimilarity, hasMolecularFunction, [most of them]

These are due to the inheritance from BioChemEntity, which would not come across if we go with the type proposal above. A few were indicated as being needed, viz. associatedDisease and taxonomicRange. If we do keep BioSample inheriting from BioChemEntity, then the profile defined over it would make clear which of the properties are intended for use.

Best regards

Alasdair

--
Alasdair J G Gray
Associate Professor in Computer Science,
School of Mathematical and Computer Sciences
Heriot-Watt University, Edinburgh, UK.

Email: A.J.G.Gray@hw.ac.uk
Web: http://www.macs.hw.ac.uk/~ajg33
ORCID: http://orcid.org/0000-0002-5711-4872
Office: Earl Mountbatten Building 1.39
Twitter: @gray_alasdair

To arrange a meeting: http://doodle.com/ajggray


Received on Monday, 20 May 2019 19:49:22 UTC
