BioChemEntity.additionalType (was Re: [Proposal] CategoryCode...)

From: Justin Clark-Casey <jc955@cam.ac.uk>
Date: Thu, 22 Mar 2018 17:50:43 +0000
To: public-bioschemas@w3.org
Message-ID: <771aff9a-658e-0151-37ad-689d70550275@cam.ac.uk>
If it's 'might', should this now be an optional property rather than recommended?

Also, fyi (or Kenneth's i), the drafts page [1] still links to v0.4 of the Protein profile rather than v0.5 [2].

[1] http://bioschemas.org/specifications/drafts
[2] https://github.com/BioSchemas/specifications/blob/master/Protein/proteinProfileSpecification.html

On 22/03/18 16:56, ljgarcia wrote:
> Hi Justin,
> 
> The additionalType might be useful for data providers or consumers if they want to link to other resources not necessarily using Bioschemas. It is also a way to
> link to your own ontology. For instance, I imagine WikiData preferring to use their own types, so if these are specified as additionalType, they can refer
> to their protein, gene or similar type in WikiData.
> 
> Cheers,
> 
> On 2018-03-22 16:11, Justin Clark-Casey wrote:
>> Thanks Leyla for the edits, much appreciated.
>>
>> I hadn't realized from the old examples that BioChemEntity was now
>> being specified through multiple inheritance directly, e.g.
>>
>> {
>>     "@context": "http://schema.org",
>>     "@type": ["BioChemEntity",
>> "http://purl.obolibrary.org/obo/PR_000000001"],
>>     "additionalType":
>> "http://semanticscience.org/resource/SIO_010043",
>>     ...
>> }
>>
>> with "http://purl.obolibrary.org/obo/PR_000000001" as the mandatory
>> string for protein.  In this case, though, what is the purpose of also
>> giving additionalType (a recommended property)?  To optionally specify
>> the type further in a less controlled manner?
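
A minimal sketch of how a consumer might read the two typing mechanisms in the example above, assuming plain dictionary access rather than a full JSON-LD processor. The helper names `as_list` and `declared_types` are hypothetical, and the fields elided with "..." in the example are simply left out.

```python
import json

# The record from the example above, without the elided "..." fields.
record = json.loads("""
{
    "@context": "http://schema.org",
    "@type": ["BioChemEntity", "http://purl.obolibrary.org/obo/PR_000000001"],
    "additionalType": "http://semanticscience.org/resource/SIO_010043"
}
""")


def as_list(value):
    """JSON-LD allows a single value or a list; normalise to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def declared_types(node):
    """Hypothetical helper: keep the @type values apart from the
    looser additionalType values so a consumer can treat them differently."""
    return {
        "type": as_list(node.get("@type")),
        "additionalType": as_list(node.get("additionalType")),
    }


types = declared_types(record)
print(types["type"])
print(types["additionalType"])
```

Keeping the two lists separate lets a consumer check the profile's mandatory type strictly while treating additionalType as an open-ended hint.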
>>
>> On Wed, Mar 21, 2018 at 6:53 PM, ljgarcia <ljgarcia@ebi.ac.uk> wrote:
>>
>>> Hi Justin,
>>>
>>> Thanks for this initiative, nice summary for additionalProperty and
>>> its alternative via direct reuse of properties coined in other
>>> controlled vocabularies.
>>>
>>> There were some issues regarding the use of additionalType so I made
>>> some edits to the first and second sections. Feel free to ping me
>>> if you have any questions or want to discuss further.
>>>
>>> Regards,
>>>
>>> On 2018-03-21 17:59, Justin Clark-Casey wrote:
>>> I think you're right, this could be 2 distinct things.
>>>
>>> I recently read "Schema.org: Evolution of Structured Data on the Web"
>>> [1] and it was very illuminating as to the philosophy of schema.org.
>>> Namely that:
>>>
>>> * Things should be much easier for the data publishers and harder for
>>> the consumers
>>> * Developers chiefly implement by adapting examples (we knew this)
>>> * Getting initial adoption is much more important than getting the
>>> structures optimal upfront.  Once there is adoption, that's
>>> justification to improve structure if necessary.
>>>
>>> So I agree with you - specifying sample relations through
>>> additionalProperty is easiest and specifying more universal
>>> per-profile relations (e.g. amino acid sequence on protein) could be
>>> done through direct additional relations to make validation easier.
>>>
>>> To get additional relations (and the general BioChemEntity/DataRecord
>>> mechanisms) straighter in my head, I published a wiki page [2].
>>> Apologies for any mistakes; anybody please feel free to edit/extend it,
>>> and I will do so as necessary.  I ended up repeating quite a bit of
>>> what Alasdair originally wrote [3] and what is in the examples, but I do
>>> find it useful to have this stuff in findable wiki form (Google docs
>>> aren't exposed to search engines afaik).
>>>
>>> [1] https://queue.acm.org/detail.cfm?id=2857276
>>> [2] https://github.com/BioSchemas/specifications/wiki/Adding-profile-specific-relations-to-BioChemEntity-and-DataRecord
>>> [3] https://lists.w3.org/Archives/Public/public-bioschemas/2017Nov/0001.html
>>>
>>> On 19/03/18 18:14, ljgarcia wrote:
>>> Hi all,
>>>
>>> I think we are talking about two different things here.
>>>
>>> For Samples, directly using additionalProperty seems the easiest
>>> option as this reduces the requirements for small labs providing samples.
>>> They do not have to agree on any predefined terms or properties,
>>> just provide key-value pairs via additionalProperty. Most likely,
>>> they will not include information regarding a CategoryCode; this
>>> would be added whenever possible by BioSamples. @Luca, @Matt,
>>> please correct me if I am wrong. For the Samples case, it is a +1 on
>>> my side for accepting CategoryCode as a possible range for the
>>> valueReference property on PropertyValue.
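
The key-value route described above can be sketched roughly as follows. This is only an illustration: `to_additional_property` and its `category_codes` argument are hypothetical names, with `category_codes` standing in for whatever enrichment an aggregator such as BioSamples might add later. The property name and ICD10 values are taken from Philippe's snippet quoted further down the thread.

```python
def to_additional_property(pairs, category_codes=None):
    """Turn plain key-value pairs into schema.org PropertyValue nodes.

    category_codes is a hypothetical optional mapping from a property name
    to the fields of a CategoryCode, attached via valueReference when present.
    """
    category_codes = category_codes or {}
    properties = []
    for name, value in pairs.items():
        node = {"@type": "PropertyValue", "name": name, "value": value}
        if name in category_codes:
            node["valueReference"] = [
                {"@type": "CategoryCode", **category_codes[name]}
            ]
        properties.append(node)
    return properties


# What a small lab would need to supply: key-value pairs only.
lab_pairs = {"diagnosis_available": "urn:miriam:icd:D00-D09"}
plain = to_additional_property(lab_pairs)

# The same pairs after optional enrichment with a CategoryCode.
enriched = to_additional_property(
    lab_pairs,
    {
        "diagnosis_available": {
            "name": "In situ neoplasms",
            "url": "http://purl.bioontology.org/ontology/ICD10/D00-D09.9",
            "codeValue": "D00-D09.9",
        }
    },
)

record = {
    "@context": "http://schema.org",
    "@type": ["DataRecord"],
    "additionalProperty": enriched,
}
```

The point of the split is that the provider-side input stays a flat dictionary, while the CategoryCode is layered on without changing the shape of the PropertyValue.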
>>>
>>> For other groups/profiles, what Justin mentions makes sense and is
>>> useful. We use that approach (or an approximation; I still need to tune a
>>> few things there) in the Protein profile.
>>>
>>> What do you think? Do we have two topics here? If so, let's separate
>>> them first. In any case, I will take a deeper look at Justin's
>>> examples later; I got a bit lost when I saw SampleDataRecord and
>>> also the schema:rangeIncludes.
>>>
>>> Regards,
>>>
>>> On 2018-03-19 17:47, Justin Clark-Casey wrote:
>>> So, last Friday at the Samples event, Leyla, Rafa and I were
>>> talking about the alternative of specifying additional properties
>>> using a second context, rather than through additionalProperty.  The
>>> original discussion in November was at [1] but I don't think it was
>>> fully formalized (and the example links are now broken).  But under this
>>> approach, I think the above would instead be something like
>>>
>>> {
>>>   "@context": ["http://schema.org",
>>>                "http://bioschemas.org/samples"],
>>>   "@type": ["SampleDataRecord"],
>>>   "diagnosisAvailable": [
>>>     "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
>>>     "http://purl.bioontology.org/ontology/ICD10/D00-D09.9"
>>>   ]
>>> }
>>>
>>> with http://bioschemas.org/samples as
>>>
>>> {
>>>   "@context": {
>>>     "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>>>     "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
>>>   },
>>>   "@id": "http://bioschemas.org/samples",
>>>   "@graph": [
>>>     {
>>>       "@id": "http://bioschemas.org/samples/SampleDataRecord",
>>>       "@type": "rdfs:Class",
>>>       "rdfs:subClassOf": { "@id": "http://schema.org/DataRecord" }
>>>     },
>>>     {
>>>       "@id": "http://bioschemas.org/samples/diagnosisAvailable",
>>>       "@type": "rdf:Property",
>>>       "rdfs:label": "Diagnosis available",
>>>       "http://schema.org/domainIncludes": [
>>>         {
>>>           "@id": "http://bioschemas.org/samples/SampleDataRecord"
>>>         }
>>>       ],
>>>       "http://schema.org/rangeIncludes": [
>>>         {
>>>           "@id": "http://schema.org/URL"
>>>         }
>>>       ]
>>>     }
>>>   ]
>>> }
>>>
>>> See [2] for schema.org [1] [1]'s own type specification file.
>>>
>>> Pros:
>>> * Using existing validation tools should be easier, as this
>>> definition uses standard schema.org [1] [1] mechanisms to define
>>> additional properties, rather than the AdditionalProperty escape
>>> hatch.
>>> * Information such as name and label can go in the bioschemas.org
>>> [13] [7] file rather than be repeated in the data record text
>>>
>>> * Easier to put in different language translations to the
>>> bioschemas.org [13] [7] file
>>>
>>> Cons:
>>>
>>> * Applications may need to rely on the URL itself (purl.org [14] [8]
>>> above) to retrieve information such as the human-readable name for the
>>> categoryCode itself (e.g. "IN SITU NEOPLASMS").  This is good semantic
>>> web practice I believe, but may reduce reliability.  Possibly this
>>> information could also be served from http://bioschemas.org as a
>>> similar set of property definitions.
>>>
>>> * Perhaps not quite so easy to add arbitrary additional properties,
>>> though a data provider could always define and serve a third context
>>> themselves, or embed it inline.
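
To make the validation argument in the pros above a little more concrete, here is a rough sketch of checking a record against the second-context vocabulary. It assumes simple string concatenation in place of real JSON-LD context expansion, and `undeclared_properties` is a hypothetical helper, not an existing validation tool. The vocabulary mirrors the bioschemas.org/samples example above, with `rdf:Property` used as the property type.

```python
SCHEMA = "http://schema.org/"
SAMPLES = "http://bioschemas.org/samples/"  # namespace assumed from the example

# Vocabulary served as the second context, reduced to its @graph.
vocabulary = {
    "@graph": [
        {
            "@id": SAMPLES + "SampleDataRecord",
            "@type": "rdfs:Class",
            "rdfs:subClassOf": {"@id": SCHEMA + "DataRecord"},
        },
        {
            "@id": SAMPLES + "diagnosisAvailable",
            "@type": "rdf:Property",
            "rdfs:label": "Diagnosis available",
            SCHEMA + "domainIncludes": [{"@id": SAMPLES + "SampleDataRecord"}],
            SCHEMA + "rangeIncludes": [{"@id": SCHEMA + "URL"}],
        },
    ]
}

record = {
    "@context": ["http://schema.org", "http://bioschemas.org/samples"],
    "@type": ["SampleDataRecord"],
    "diagnosisAvailable": [
        "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
        "http://purl.bioontology.org/ontology/ICD10/D00-D09.9",
    ],
}


def undeclared_properties(record, vocabulary, base=SAMPLES):
    """Return record keys that the vocabulary does not declare, via
    domainIncludes, for any of the record's types.

    Assumes unprefixed names resolve against `base`; a real consumer
    would expand them with a JSON-LD processor instead.
    """
    domain_key = SCHEMA + "domainIncludes"
    domains = {
        node["@id"]: {d["@id"] for d in node[domain_key]}
        for node in vocabulary["@graph"]
        if domain_key in node
    }
    types = record.get("@type", [])
    if isinstance(types, str):
        types = [types]
    record_types = {base + t for t in types}

    problems = []
    for key in record:
        if key.startswith("@"):  # JSON-LD keywords are not properties
            continue
        allowed = domains.get(base + key, set())
        if not (allowed & record_types):
            problems.append(key)
    return problems


print(undeclared_properties(record, vocabulary))
```

Because only the samples vocabulary is consulted here, ordinary schema.org properties such as name would also be reported; a fuller check would merge in schema.org's own definition file.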
>>>
>>> Thoughts?  Would especially like Leyla (though I know she's on
>>> holiday), Rafa, Alasdair, Dan, etc. to weigh in.
>>>
>>> [1] https://lists.w3.org/Archives/Public/public-bioschemas/2017Nov/thread.html
>>> [2] https://schema.org/version/latest/schema.jsonld
>>>
>>> -- Justin Clark-Casey, http://justincc.org
>>>
>>> Research Software Engineer, Intermine, Cambridge
>>>
>>> ELIXIR UK Node technical co-ordinator
>>>
>>> On Mon, Mar 19, 2018 at 11:21 AM, Philippe <proccaserra@gmail.com>
>>> wrote:
>>>
>>> Hi Luca,
>>>
>>> I am including a snippet from the notes so people can get a feel
>>> for how things could look:
>>>
>>> {
>>>   "@context": "http://schema.org",
>>>   "@type": ["DataRecord"],
>>>   "additionalProperty": [
>>>     {
>>>       "@type": "PropertyValue",
>>>       "name": "diagnosis_available",
>>>       "value": "urn:miriam:icd:C00-C97",
>>>       "valueReference": [
>>>         {
>>>           "@type": "CategoryCode",
>>>           "name": "Malignant neoplasms",
>>>           "url": "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
>>>           "codeValue": "C00-C97.9"
>>>         }
>>>       ]
>>>     },
>>>     {
>>>       "@type": "PropertyValue",
>>>       "name": "diagnosis_available",
>>>       "value": "urn:miriam:icd:D00-D09",
>>>       "valueReference": [
>>>         {
>>>           "@type": "CategoryCode",
>>>           "name": "In situ neoplasms",
>>>           "url": "http://purl.bioontology.org/ontology/ICD10/D00-D09.9",
>>>           "codeValue": "D00-D09.9"
>>>         }
>>>       ]
>>>     }
>>>   ]
>>> }
>>>
>>> I also include the link to the schema.org [1] [1] CategoryCode type,
>>> https://pending.schema.org/CategoryCode [17] [4], and their JSON-LD
>>> snippet:
>>>
>>> {
>>>   "@context": "http://schema.org/",
>>>   "@type": "CategoryCode",
>>>   "codeValue": "Man",
>>>   "inCodeSet": "http://id.loc.gov/vocabulary/resourceTypes"
>>> }
>>>
>>> Question: Should the 'inCodeSet' attribute be used instead?
>>>
>>> Best
>>>
>>> Philippe
>>>
>>> On 19/03/2018 11:10, Luca Cherubin wrote:
>>>
>>> Hi everybody,
>>>
>>> During the Hackathon event last week with various Biobanks
>>> representatives we had the chance to use Bioschemas profiles and
>>> types to support BioBanks use cases for metadata sharing.
>>>
>>> As you may know, in the Sample profile we proposed a solution for
>>> linking ontology terms to a PropertyValue using CategoryCode as
>>> valid type for the valueReference field. Note that CategoryCode is
>>> already a proposed schema.org [1] [1] type but in the
>>> bioschemas/samples specification we propose that it should be an
>>> acceptable value for valueReference.
>>>
>>> To support BioBank use cases, we are using DataRecord and they
>>> need to use the same CategoryCode strategy to describe all the
>>> PropertyValue associated with a DataRecord.
>>>
>>> In our opinion this is a very strong use case for supporting the
>>> use of CategoryCode as valid type for valueReference for any
>>> PropertyValue in Bioschemas/schema.org [1] [1], not only for the
>>> Sample profile. We can see this being very useful in other areas
>>> where there is a need for a flexible linking of ontology terms to
>>> values.
>>>
>>> We would like to get your feedback on this.
>>>
>>> Best regards,
>>>
>>> Luca and Matt
>>
>> Links:
>> ------
>> [1] http://schema.org
>> [2] http://purl.bioontology.org/ontology/ICD10/C00-C97.9 [5]
>> [3] http://purl.bioontology.org/ontology/ICD10/D00-D09.9 [6]
>> [4] https://pending.schema.org/CategoryCode [17]
>> [5] http://schema.org/
>> [6] http://id.loc.gov/vocabulary/resourceTypes [18]
>> [7] http://bioschemas.org
>> [8] http://purl.org
>>
>>
>>
>> Links:
>> ------
>> [1] http://schema.org
>> [2] https://queue.acm.org/detail.cfm?id=2857276
>> [3]
>> https://github.com/BioSchemas/specifications/wiki/Adding-profile-specific-relations-to-BioChemEntity-and-DataRecord
>> [4] https://lists.w3.org/Archives/Public/public-bioschemas/2017Nov/0001.html
>> [5] http://purl.bioontology.org/ontology/ICD10/C00-C97.9
>> [6] http://purl.bioontology.org/ontology/ICD10/D00-D09.9
>> [7] http://www.w3.org/2000/01/rdf-schema#
>> [8] http://bioschemas.org/samples/SampleDataRecord
>> [9] http://bioschemas.org/samples/diagnosisAvailable
>> [10] http://schema.org/domainIncludes
>> [11] http://bioschemas.org/samples/SamplesDataRecord
>> [12] http://schema.org/rangeIncludes
>> [13] http://bioschemas.org
>> [14] http://purl.org
>> [15] https://lists.w3.org/Archives/Public/public-bioschemas/2017Nov/thread.html
>> [16] https://schema.org/version/latest/schema.jsonld
>> [17] https://pending.schema.org/CategoryCode
>> [18] http://id.loc.gov/vocabulary/resourceTypes
> 
Received on Thursday, 22 March 2018 17:51:18 UTC
