- From: Justin Clark-Casey <jc955@cam.ac.uk>
- Date: Wed, 21 Mar 2018 17:59:13 +0000
- To: public-bioschemas@w3.org
I think you're right, this could be 2 distinct things.

I recently read "Schema.org: Evolution of Structured Data on the Web" [1] and it was very illuminating as to the philosophy of schema.org. Namely that:

* Things should be much easier for the data publishers and harder for the consumers
* Developers chiefly implement by adapting examples (we knew this)
* Getting initial adoption is much more important than getting the structures optimal upfront. Once there is adoption, that's justification to improve structure if necessary.

So I agree with you - specifying sample relations through additionalProperty is easiest and specifying more universal per-profile relations (e.g. amino acid sequence on protein) could be done through direct additional relations to make validation easier.

To get additional relations (and the general BioChemEntity/DataRecord mechanisms) more straight in my head, I published a wiki page [2]. Apologies for any mistakes, please anybody feel free to edit/extend and I will do so as necessary. I ended up repeating quite a bit of what Alasdair originally wrote [3] and what is in examples, but I do find it useful to have this stuff in findable wiki form (Google docs aren't exposed to search engines afaik).

[1] https://queue.acm.org/detail.cfm?id=2857276
[2] https://github.com/BioSchemas/specifications/wiki/Adding-profile-specific-relations-to-BioChemEntity-and-DataRecord
[3] https://lists.w3.org/Archives/Public/public-bioschemas/2017Nov/0001.html

On 19/03/18 18:14, ljgarcia wrote:
> Hi all,
>
> I think we are talking about two different things here.
>
> For Samples, directly using additionalProperty seems the easiest option as this reduces requirements for small labs providing samples. They do not have to agree
> on any predefined terms or properties, just to provide key-value pairs via additionalProperty. Most likely, they will not include information regarding a
> CategoryCode; this would be added whenever possible by BioSamples.
> @Luca, @Matt, please correct me if I am wrong. For the Samples case, it is a +1 on my
> side for accepting CategoryCode as a possible range for the valueReference property on PropertyValue.
>
> For other groups/profiles, what Justin mentions makes sense and is useful. We use that way (or an approximation, I still need to tune a few things there) in
> the Protein profile.
>
> What do you think? Do we have two topics here? If so, let's separate them first. In any case, I will take a deeper look at Justin's examples later, I got a bit
> lost when I saw SampleDataRecord and also the schema:rangeIncludes.
>
> Regards,
>
>
> On 2018-03-19 17:47, Justin Clark-Casey wrote:
>> So, last Friday at the Samples event, Leyla, Rafa and myself were
>> talking about the alternative of specifying additional properties
>> using a second context, rather than through additionalProperty. The
>> original discussion in November was at [1] but I don't think was fully
>> formalized (and the example links are now broken).
>> But under this
>> approach, I think the above would instead be something like
>>
>> {
>>   "@context": ["http://schema.org",
>>                "http://bioschemas.org/samples"],
>>   "@type": ["SampleDataRecord"],
>>   "diagnosisAvailable": [
>>     "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
>>     "http://purl.bioontology.org/ontology/ICD10/D00-D09.9"
>>   ]
>> }
>>
>> with http://bioschemas.org/samples as
>>
>> {
>>   "@context": {
>>     "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
>>     "rdfs": "http://www.w3.org/2000/01/rdf-schema#"
>>   },
>>   "@id": "http://bioschemas.org/samples",
>>   "@graph": [
>>     {
>>       "@id": "http://bioschemas.org/samples/SampleDataRecord",
>>       "@type": "rdfs:Class",
>>       "rdfs:subClassOf": { "@id": "http://schema.org/DataRecord" }
>>     },
>>     {
>>       "@id": "http://bioschemas.org/samples/diagnosisAvailable",
>>       "@type": "rdf:Property",
>>       "rdfs:label": "Diagnosis available",
>>       "http://schema.org/domainIncludes": [
>>         { "@id": "http://bioschemas.org/samples/SampleDataRecord" }
>>       ],
>>       "http://schema.org/rangeIncludes": [
>>         { "@id": "http://schema.org/URL" }
>>       ]
>>     }
>>   ]
>> }
>>
>> See [2] for schema.org [1]'s own type specification file.
>>
>> Pros:
>>
>> * Using existing validation tools should be easier, as this
>> definition uses standard schema.org [1] mechanisms to define
>> additional properties, rather than the additionalProperty escape
>> hatch.
>>
>> * Information such as name and label can go in the bioschemas.org
>> [7] file rather than be repeated in the data record text
>>
>> * Easier to put in different language translations to the
>> bioschemas.org [7] file
>>
>> Cons:
>>
>> * Applications may need to rely on the URL itself (purl.org [8] above)
>> to retrieve information such as human-readable name for the
>> categoryCode itself (e.g. "IN SITU NEOPLASMS"). This is good semantic
>> web practice I believe, but may reduce reliability. Possibly this
>> information could also be served from http://bioschemas.org as a
>> similar set of property definitions.
>>
>> * Perhaps not quite so easy to add arbitrary additional properties,
>> though a data provider could always define and serve a third context
>> themselves, or embed it inline.
>>
>> Thoughts? Would especially like Leyla (though I know she's on
>> holiday), Rafa, Alasdair, Dan, etc. to weigh in.
>>
>> [1] https://lists.w3.org/Archives/Public/public-bioschemas/2017Nov/thread.html
>> [2] https://schema.org/version/latest/schema.jsonld
>>
>> --
>>
>> Justin Clark-Casey, http://justincc.org
>>
>> Research Software Engineer, Intermine, Cambridge
>>
>> ELIXIR UK Node technical co-ordinator
>>
>> On Mon, Mar 19, 2018 at 11:21 AM, Philippe <proccaserra@gmail.com>
>> wrote:
>>
>>> Hi Luca,
>>>
>>> I am including a snippet from the notes so people can have a feel
>>> for how things could look:
>>>
>>> {
>>>   "@context": "http://schema.org",
>>>   "@type": ["DataRecord"],
>>>   "additionalProperty": [
>>>     {
>>>       "@type": "PropertyValue",
>>>       "name": "diagnosis_available",
>>>       "value": "urn:miriam:icd:C00-C97",
>>>       "valueReference": [
>>>         {
>>>           "@type": "CategoryCode",
>>>           "name": "Malignant neoplasms",
>>>           "url": "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
>>>           "codeValue": "C00-C97.9"
>>>         }
>>>       ]
>>>     },
>>>     {
>>>       "@type": "PropertyValue",
>>>       "name": "diagnosis_available",
>>>       "value": "urn:miriam:icd:D00-D09",
>>>       "valueReference": [
>>>         {
>>>           "@type": "CategoryCode",
>>>           "name": "In situ neoplasms",
>>>           "url": "http://purl.bioontology.org/ontology/ICD10/D00-D09.9",
>>>           "codeValue": "D00-D09.9"
>>>         }
>>>       ]
>>>     }
>>>   ]
>>> }
>>>
>>> I also include the link to the schema.org [1] CategoryCode:
>>> https://pending.schema.org/CategoryCode [4] and their JSON-LD
>>> snippet
>>>
>>> {
>>>   "@context": "http://schema.org/",
>>>   "@type": "CategoryCode",
>>>   "codeValue": "Man",
>>>   "inCodeSet": "http://id.loc.gov/vocabulary/resourceTypes"
>>> }
>>>
>>> Question: Should the 'inCodeSet' attribute be used instead?
>>>
>>> Best
>>>
>>> Philippe
>>>
>>> On 19/03/2018 11:10, Luca Cherubin wrote:
>>>
>>>> Hi everybody,
>>>>
>>>> During the Hackathon event last week with various Biobank
>>>> representatives we had the chance to use Bioschemas profiles and
>>>> types to support BioBank use cases for metadata sharing.
>>>>
>>>> As you may know, in the Sample profile we proposed a solution for
>>>> linking ontology terms to a PropertyValue using CategoryCode as a
>>>> valid type for the valueReference field. Note that CategoryCode is
>>>> already a proposed schema.org [1] type but in the
>>>> bioschemas/samples specification we propose that it should be an
>>>> acceptable value for valueReference.
>>>>
>>>> To support BioBank use cases, we are using DataRecord and they
>>>> need to use the same CategoryCode strategy to describe all the
>>>> PropertyValue entries associated with a DataRecord.
>>>>
>>>> In our opinion this is a very strong use case for supporting the
>>>> use of CategoryCode as a valid type for valueReference for any
>>>> PropertyValue in Bioschemas/schema.org [1], not only for the
>>>> Sample profile. We can see this being very useful in other areas
>>>> where there is a need for a flexible linking of ontology terms to
>>>> values.
>>>>
>>>> We would like to get your feedback on this.
>>>>
>>>> Best regards,
>>>>
>>>> Luca and Matt
>>
>>
>>
>> Links:
>> ------
>> [1] http://schema.org
>> [2] http://purl.bioontology.org/ontology/ICD10/C00-C97.9
>> [3] http://purl.bioontology.org/ontology/ICD10/D00-D09.9
>> [4] https://pending.schema.org/CategoryCode
>> [5] http://schema.org/
>> [6] http://id.loc.gov/vocabulary/resourceTypes
>> [7] http://bioschemas.org
>> [8] http://purl.org
>
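
To make the relationship between the two forms discussed in this thread concrete, here is a minimal Python sketch that rewrites a DataRecord using additionalProperty/PropertyValue/CategoryCode into the "second context" form with direct properties. It is only an illustration under stated assumptions: the helper names (snake_to_camel, to_second_context_form), the snake_case to camelCase naming rule, and the choice to keep only the CategoryCode url are all hypothetical and are not part of any Bioschemas or schema.org specification.

```python
import json


def snake_to_camel(name):
    """Convert e.g. 'diagnosis_available' to 'diagnosisAvailable' (assumed naming rule)."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def to_second_context_form(record, context_url, record_type):
    """Hypothetical helper: turn additionalProperty entries into direct properties.

    Each PropertyValue name becomes a camelCase key, and every CategoryCode url
    found under valueReference becomes one of that key's values.
    """
    contexts = record.get("@context", "http://schema.org")
    if not isinstance(contexts, list):
        contexts = [contexts]
    result = {"@context": contexts + [context_url], "@type": [record_type]}
    for prop in record.get("additionalProperty", []):
        key = snake_to_camel(prop["name"])
        refs = prop.get("valueReference", [])
        if isinstance(refs, dict):
            refs = [refs]
        urls = [ref["url"] for ref in refs if "url" in ref]
        # Fall back to the literal value when no CategoryCode url is present.
        values = urls or ([prop["value"]] if "value" in prop else [])
        result.setdefault(key, []).extend(values)
    return result


# The record from the thread, in its additionalProperty form.
record = {
    "@context": "http://schema.org",
    "@type": ["DataRecord"],
    "additionalProperty": [
        {
            "@type": "PropertyValue",
            "name": "diagnosis_available",
            "value": "urn:miriam:icd:C00-C97",
            "valueReference": [{
                "@type": "CategoryCode",
                "name": "Malignant neoplasms",
                "url": "http://purl.bioontology.org/ontology/ICD10/C00-C97.9",
                "codeValue": "C00-C97.9",
            }],
        },
        {
            "@type": "PropertyValue",
            "name": "diagnosis_available",
            "value": "urn:miriam:icd:D00-D09",
            "valueReference": [{
                "@type": "CategoryCode",
                "name": "In situ neoplasms",
                "url": "http://purl.bioontology.org/ontology/ICD10/D00-D09.9",
                "codeValue": "D00-D09.9",
            }],
        },
    ],
}

converted = to_second_context_form(
    record, "http://bioschemas.org/samples", "SampleDataRecord"
)
print(json.dumps(converted, indent=2))
```

Note that the conversion drops the human-readable CategoryCode name, which is the trade-off listed under "Cons": a consumer of the second form would have to resolve the URL (or a bioschemas.org definition file) to recover it.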
Received on Wednesday, 21 March 2018 17:59:47 UTC