Re: [Proposal] CategoryCode as valid type for valueReference for any PropertyValue in Bioschemas/schema.org

From: ljgarcia <ljgarcia@ebi.ac.uk>
Date: Thu, 22 Mar 2018 16:56:29 +0000
To: Justin Clark-Casey <justinccdev@gmail.com>
Cc: Justin Clark-Casey <jc955@cam.ac.uk>, public-bioschemas@w3.org
Message-ID: <ef61f53d8e26d4194f3b41970e2321ab@ebi.ac.uk>
Hi Justin,

The additionalType property might be useful for data providers or consumers
who want to link to other resources that do not necessarily use Bioschemas.
It is also a way to link to your own ontology. For instance, I imagine
WikiData would prefer to use its own types; if those are specified as
additionalType, the markup can reference the WikiData type for protein,
gene, and so on.
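
A minimal sketch of that idea in Python, building the JSON-LD as a plain dict. The helper name biochem_entity is hypothetical, and the Wikidata URL is an assumption used only for illustration (Q8054 is taken to be the Wikidata item for "protein"); the other two URLs come from the example quoted below.

```python
import json

def biochem_entity(mandatory_type, additional_types):
    """Build a minimal BioChemEntity record with extra type links."""
    return {
        "@context": "http://schema.org",
        # Multiple typing: the Bioschemas type plus the mandatory ontology term.
        "@type": ["BioChemEntity", mandatory_type],
        # additionalType carries any further, provider-chosen type URLs.
        "additionalType": list(additional_types),
    }

record = biochem_entity(
    "http://purl.obolibrary.org/obo/PR_000000001",
    [
        "http://semanticscience.org/resource/SIO_010043",
        # Assumption: the Wikidata item for "protein".
        "http://www.wikidata.org/entity/Q8054",
    ],
)
print(json.dumps(record, indent=2))
```

A consumer that only understands Bioschemas can ignore additionalType, while one that prefers another vocabulary can pick its own URL out of the list.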

Cheers,

On 2018-03-22 16:11, Justin Clark-Casey wrote:
> Thanks Leyla for the edits, much appreciated.
> 
> I hadn't realized from the old examples that BioChemEntity was now
> being specified through multiple inheritance directly, e.g.
> 
> {
>     "@context": "http://schema.org",
>     "@type": ["BioChemEntity", "http://purl.obolibrary.org/obo/PR_000000001"],
>     "additionalType": "http://semanticscience.org/resource/SIO_010043",
>     ...
> }
> 
> with "http://purl.obolibrary.org/obo/PR_000000001" as the mandatory
> string for protein.  In this case, though, what is the purpose of also
> giving additionalType (a recommended property)?  To optionally specify
> the type further in a less controlled manner?
> 
> On Wed, Mar 21, 2018 at 6:53 PM, ljgarcia <ljgarcia@ebi.ac.uk> wrote:
> 
>> Hi Justin,
>> 
>> Thanks for this initiative, nice summary for additionalProperty and
>> its alternative via direct reuse of properties coined in other
>> controlled vocabularies.
>> 
>> There were some issues regarding the use of additionalType so I made
>> some edits to the first and second sections. Feel free to ping me
>> if you have any questions or want to discuss further.
>> 
>> Regards,
>> 
>> On 2018-03-21 17:59, Justin Clark-Casey wrote:
>> I think you're right, this could be 2 distinct things.
>> 
>> I recently read "Schema.org: Evolution of Structured Data on the Web"
>> [1] and it was very illuminating as to the philosophy of schema.org.
>> Namely that:
>> 
>> * Things should be much easier for the data publishers and harder for
>> the consumers
>> * Developers chiefly implement by adapting examples (we knew this)
>> * Getting initial adoption is much more important than getting the
>> structures optimal upfront.  Once there is adoption, that's
>> justification to improve structure if necessary.
>> 
>> So I agree with you - specifying sample relations through
>> additionalProperty is easiest and specifying more universal
>> per-profile relations (e.g. amino acid sequence on protein) could be
>> done through direct additional relations to make validation easier.
>> 
>> To get additional relations (and the general BioChemEntity/DataRecord
>> mechanisms) straighter in my head, I published a wiki page [2].
>> Apologies for any mistakes, please anybody feel free to edit/extend
>> and I will do so as necessary.  I ended up repeating quite a bit of
>> what Alasdair originally wrote [3] and what is in examples, but I do
>> find it useful to have this stuff in findable wiki form (Google docs
>> aren't exposed to search engines afaik).
>> 
>> [1] https://queue.acm.org/detail.cfm?id=2857276
>> [2] https://github.com/BioSchemas/specifications/wiki/Adding-profile-specific-relations-to-BioChemEntity-and-DataRecord
>> [3] https://lists.w3.org/Archives/Public/public-bioschemas/2017Nov/0001.html
>> 
>> On 19/03/18 18:14, ljgarcia wrote:
>> Hi all,
>> 
>> I think we are talking about two different things here.
>> 
>> For Samples, directly using additionalProperty seems the easiest
>> option as this reduces requirements for small labs providing samples.
>> They do not have to agree on any predefined terms or properties,
>> just provide key-value pairs via additionalProperty. Most likely,
>> they will not include information regarding a CategoryCode; this
>> would be added whenever possible by BioSamples. @Luca, @Matt,
>> please correct me if I am wrong. For the Samples case, it is a +1 on
>> my side for accepting CategoryCode as a possible range for the
>> valueReference property on PropertyValue.
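
The split described above (labs supply bare key-value pairs, an aggregator adds CategoryCode where it can) could be sketched in Python roughly as follows. The function name, the lookup table, and the storage_temperature pair are hypothetical; only the diagnosis values are taken from the snippet quoted further down the thread.

```python
def to_additional_property(pairs, category_codes=None):
    """Turn plain key-value pairs into schema.org PropertyValue entries.

    category_codes is a hypothetical lookup from a value to CategoryCode
    fields; values without a match stay as bare key-value pairs.
    """
    category_codes = category_codes or {}
    result = []
    for name, value in pairs.items():
        entry = {"@type": "PropertyValue", "name": name, "value": value}
        code = category_codes.get(value)
        if code is not None:
            # Enrichment step, done by the aggregator rather than the lab.
            entry["valueReference"] = [{"@type": "CategoryCode", **code}]
        result.append(entry)
    return result

# What a small lab might provide (second pair is invented for illustration).
lab_pairs = {
    "diagnosis_available": "urn:miriam:icd:D00-D09",
    "storage_temperature": "-80C",
}
# What an aggregator might know about some of those values.
known_codes = {
    "urn:miriam:icd:D00-D09": {
        "name": "In situ neoplasms",
        "url": "http://purl.bioontology.org/ontology/ICD10/D00-D09.9",
        "codeValue": "D00-D09.9",
    }
}
props = to_additional_property(lab_pairs, known_codes)
```

The lab-side input needs no agreed vocabulary at all; the CategoryCode only appears where the lookup has an answer.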
>> 
>> For other groups/profiles, what Justin mentions makes sense and is
>> useful. We use that approach (or an approximation; I still need to
>> tune a few things there) in the Protein profile.
>> 
>> What do you think? Do we have two topics here? If so, let's separate
>> them first. In any case, I will take a closer look at Justin's
>> examples later; I got a bit lost when I saw SampleDataRecord and
>> also the schema:rangeIncludes.
>> 
>> Regards,
>> 
>> On 2018-03-19 17:47, Justin Clark-Casey wrote:
>> So, last Friday at the Samples event, Leyla, Rafa and myself were
>> talking about the alternative of specifying additional properties
>> using a second context, rather than through additionalProperty.  The
>> original discussion in November was at [1] but I don't think it was
>> fully formalized (and the example links are now broken).  But under
>> this approach, I think the above would instead be something like
>> 
>> {
>>   "@context": ["http://schema.org", "http://bioschemas.org/samples"],
>>   "@type": ["SampleDataRecord"],
>>   "diagnosisAvailable": [
>>     "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
>>     "http://purl.bioontology.org/ontology/ICD10/D00-D09.9"
>>   ]
>> }
>> 
>> with http://bioschemas.org/samples as
>> 
>> {
>>   "@context": {
>>     "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>>     "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
>>   },
>>   "@id": "http://bioschemas.org/samples",
>>   "@graph": [
>>     {
>>       "@id": "http://bioschemas.org/samples/SampleDataRecord",
>>       "@type": "rdfs:Class",
>>       "rdfs:subClassOf": { "@id": "http://schema.org/DataRecord" }
>>     },
>>     {
>>       "@id": "http://bioschemas.org/samples/diagnosisAvailable",
>>       "@type": "rdf:Property",
>>       "rdfs:label": "Diagnosis available",
>>       "http://schema.org/domainIncludes": [
>>         { "@id": "http://bioschemas.org/samples/SampleDataRecord" }
>>       ],
>>       "http://schema.org/rangeIncludes": [
>>         { "@id": "http://schema.org/URL" }
>>       ]
>>     }
>>   ]
>> }
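
To illustrate why explicit domainIncludes/rangeIncludes declarations can make checking easier, here is a rough Python sketch. It is not a JSON-LD processor: it resolves terms by naively prefixing a base URL, assumes every property value is a list, and the names VOCAB, ids and check are all hypothetical.

```python
from urllib.parse import urlparse

SCHEMA = "http://schema.org/"
BIO = "http://bioschemas.org/samples/"

# Cut-down copy of the vocabulary file above, as a Python dict.
VOCAB = {
    "@graph": [
        {"@id": BIO + "SampleDataRecord", "@type": "rdfs:Class"},
        {
            "@id": BIO + "diagnosisAvailable",
            "@type": "rdf:Property",
            SCHEMA + "domainIncludes": [{"@id": BIO + "SampleDataRecord"}],
            SCHEMA + "rangeIncludes": [{"@id": SCHEMA + "URL"}],
        },
    ]
}

def ids(node, key):
    """Collect the @id values referenced under key."""
    return {ref["@id"] for ref in node.get(key, [])}

def check(record, vocab, base=BIO):
    """Return a list of problems found in record; empty means it passed."""
    props = {n["@id"]: n for n in vocab["@graph"] if SCHEMA + "domainIncludes" in n}
    types = {base + t for t in record["@type"]}
    problems = []
    for key, values in record.items():
        if key.startswith("@"):
            continue  # JSON-LD keywords are not vocabulary properties
        node = props.get(base + key)
        if node is None:
            problems.append(f"{key}: not defined in the vocabulary")
            continue
        if not types & ids(node, SCHEMA + "domainIncludes"):
            problems.append(f"{key}: not allowed on {record['@type']}")
        if SCHEMA + "URL" in ids(node, SCHEMA + "rangeIncludes"):
            for v in values:
                if urlparse(v).scheme not in ("http", "https"):
                    problems.append(f"{key}: {v!r} is not a URL")
    return problems

record = {
    "@context": ["http://schema.org", "http://bioschemas.org/samples"],
    "@type": ["SampleDataRecord"],
    "diagnosisAvailable": [
        "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
        "http://purl.bioontology.org/ontology/ICD10/D00-D09.9",
    ],
}
bad = dict(record, diagnosisAvailable=["C00-C97.9"])
print(check(record, VOCAB), check(bad, VOCAB))
```

With additionalProperty the same check would have to inspect the free-text name of each PropertyValue, since nothing declares which names are expected.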
>> 
>> See [2] for schema.org's own type specification file.
>> 
>> Pros:
>> 
>> * Using existing validation tools should be easier, as this
>> definition uses standard schema.org mechanisms to define
>> additional properties, rather than the additionalProperty escape
>> hatch.
>> 
>> * Information such as name and label can go in the bioschemas.org
>> file rather than be repeated in the data record text
>> 
>> * Easier to put in different language translations to the
>> bioschemas.org file
>> 
>> Cons:
>> 
>> * Applications may need to rely on the URL itself (purl.org above)
>> to retrieve information such as the human-readable name for the
>> categoryCode itself (e.g. "IN SITU NEOPLASMS").  This is good semantic
>> web practice I believe, but may reduce reliability.  Possibly this
>> information could also be served from http://bioschemas.org as a
>> similar set of property definitions.
>> 
>> * Perhaps not quite so easy to add arbitrary additional properties,
>> though a data provider could always define and serve a third context
>> themselves, or embed it inline.
>> 
>> Thoughts?  Would especially like Leyla (though I know she's on
>> holiday), Rafa, Alasdair, Dan, etc. to weigh in.
>> 
>> [1] https://lists.w3.org/Archives/Public/public-bioschemas/2017Nov/thread.html
>> [2] https://schema.org/version/latest/schema.jsonld
>> 
>> -- Justin Clark-Casey, http://justincc.org
>> 
>> Research Software Engineer, Intermine, Cambridge
>> 
>> ELIXIR UK Node technical co-ordinator
>> 
>> On Mon, Mar 19, 2018 at 11:21 AM, Philippe <proccaserra@gmail.com>
>> wrote:
>> 
>> Hi Luca,
>> 
>> I am including a snippet from the notes so people can get a feel
>> for how things could look:
>> 
>> {
>>   "@context": "http://schema.org",
>>   "@type": ["DataRecord"],
>>   "additionalProperty": [
>>     {
>>       "@type": "PropertyValue",
>>       "name": "diagnosis_available",
>>       "value": "urn:miriam:icd:C00-C97",
>>       "valueReference": [
>>         {
>>           "@type": "CategoryCode",
>>           "name": "Malignant neoplasms",
>>           "url": "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
>>           "codeValue": "C00-C97.9"
>>         }
>>       ]
>>     },
>>     {
>>       "@type": "PropertyValue",
>>       "name": "diagnosis_available",
>>       "value": "urn:miriam:icd:D00-D09",
>>       "valueReference": [
>>         {
>>           "@type": "CategoryCode",
>>           "name": "In situ neoplasms",
>>           "url": "http://purl.bioontology.org/ontology/ICD10/D00-D09.9",
>>           "codeValue": "D00-D09.9"
>>         }
>>       ]
>>     }
>>   ]
>> }
>> 
>> I also include the link to the schema.org CategoryCode type,
>> https://pending.schema.org/CategoryCode, and their JSON-LD snippet:
>> 
>> {
>>   "@context": "http://schema.org/",
>>   "@type": "CategoryCode",
>>   "codeValue": "Man",
>>   "inCodeSet": "http://id.loc.gov/vocabulary/resourceTypes"
>> }
>> 
>> Question: Should the 'inCodeSet' attribute be used instead?
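
One way to look at this question is that the two properties need not be exclusive: url can point at the individual term while inCodeSet names the set it belongs to. A small Python sketch under that reading follows; the helper category_code is hypothetical, and the inCodeSet URL is an assumption standing in for ICD-10 as a whole, not something taken from the thread.

```python
def category_code(name, code_value, url=None, in_code_set=None):
    """Build a CategoryCode dict, including only the links that are given."""
    code = {"@type": "CategoryCode", "name": name, "codeValue": code_value}
    if url:
        code["url"] = url  # page for this particular term
    if in_code_set:
        code["inCodeSet"] = in_code_set  # the code set the term belongs to
    return code

code = category_code(
    "Malignant neoplasms",
    "C00-C97.9",
    url="http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
    # Assumption: a URL standing for the ICD-10 code set as a whole.
    in_code_set="http://purl.bioontology.org/ontology/ICD10",
)
```

Whether Bioschemas should recommend one, the other, or both is still the open question; the sketch only shows that the markup can carry both.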
>> 
>> Best
>> 
>> Philippe
>> 
>> On 19/03/2018 11:10, Luca Cherubin wrote:
>> 
>> Hi everybody,
>> 
>> During the Hackathon event last week with various Biobanks
>> representatives we had the chance to use Bioschemas profiles and
>> types to support BioBanks use cases for metadata sharing.
>> 
>> As you may know, in the Sample profile we proposed a solution for
>> linking ontology terms to a PropertyValue using CategoryCode as
>> valid type for the valueReference field. Note that CategoryCode is
>> already a proposed schema.org type but in the
>> bioschemas/samples specification we propose that it should be an
>> acceptable value for valueReference.
>> 
>> To support BioBank use cases, we are using DataRecord and they
>> need to use the same CategoryCode strategy to describe all the
>> PropertyValue associated with a DataRecord.
>> 
>> In our opinion this is a very strong use case for supporting the
>> use of CategoryCode as valid type for valueReference for any
>> PropertyValue in Bioschemas/schema.org, not only for the
>> Sample profile. We can see this being very useful in other areas
>> where there is a need for a flexible linking of ontology terms to
>> values.
>> 
>> We would like to get your feedback on this.
>> 
>> Best regards,
>> 
>> Luca and Matt
Received on Thursday, 22 March 2018 16:57:05 UTC