- From: Justin Clark-Casey <justinccdev@gmail.com>
- Date: Wed, 4 Jul 2018 19:29:24 +0100
- To: Franck Michel <fmichel@i3s.unice.fr>
- Cc: Leyla Garcia <ljgarcia@ebi.ac.uk>, Melanie Courtot <mcourtot@ebi.ac.uk>, "Gray, Alasdair J G" <A.J.G.Gray@hw.ac.uk>, public-bioschemas@w3.org
- Message-ID: <CAME9NR9ednRTxtSSF6bXnh5o+6c=zeN_C8Yky=Kg9D=+j0rv2g@mail.gmail.com>
On Fri, 29 Jun 2018 at 11:30, Franck Michel <franck.michel@cnrs.fr> wrote:

> *Second sub-thread: How to name a profile?* Three different options are
> being discussed.
> (a) the context defines the profile name to be the chosen type URI, e.g.
>     "Protein": { "@id": "http://purl.obolibrary.org/obo/PR_000000001" }
> (b) the context defines a type within namespace http://bioschemas.org
> like http://bioschemas.org/Protein. This is a hollow shell that just
> denotes we're talking about a Bioschemas profile.
> (c) We use the new schema.org concepts of defined term and defined term
> set, such as in the example provided by Mélanie:
>     "@type": "DefinedTerm",
>     "@id": "http://purl.obolibrary.org/obo/PR_000000001",
>     "name": "Protein",
>     "inDefinedTermSet": "http://bioschemas.org/terms",
>     "description": "An amino acid chain that is produced de novo
>     by ribosome-mediated translation of a genetically-encoded mRNA.",
>     "sameAs": "http://purl.obolibrary.org/obo/NCIT_C17021",
>     "sameAs": "http://semanticscience.org/resource/SIO_010043"
>
> Here are a few thoughts with respect to these options:
>
> My concern with (a) is that a JSON-LD context is just a handy way to write
> data: the string "Protein" is a sheer shorthand, it could be named anything
> else. A webpage may use it this way:
>     "@type": "Protein"
> But it would be perfectly equivalent to *not* use the context and write
> this instead:
>     "@type": "http://purl.obolibrary.org/obo/PR_000000001"
> My point is that a tool extracting Bioschemas markup should *not* rely on
> the use of any specific shorthand.
> Besides, doing so would force using Bioschemas with JSON-LD only, but what
> about webpages using other markup formats? Unless I'm missing something
> here?
>

This shouldn't be a concern: a JSON-LD parser would treat these two forms
as equivalent. The @context is just there to save people from writing out
full-form URLs or definitions each time.
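To make that equivalence concrete, here is a minimal sketch in Python. It
is a toy resolver, not a real JSON-LD processor: the names CONTEXT and
expand_type are made up for illustration, and the context holds only the
single option (a) mapping quoted above.

```python
# Hypothetical context following option (a): "Protein" is shorthand for the PR term.
CONTEXT = {"Protein": {"@id": "http://purl.obolibrary.org/obo/PR_000000001"}}


def expand_type(doc, context):
    """Return the full IRI for doc's @type, resolving shorthand via the context.

    Toy resolver: handles a single string @type and simple term definitions only.
    """
    value = doc["@type"]
    definition = context.get(value)
    if definition is None:
        return value  # already a full IRI, or a term the context does not define
    if isinstance(definition, dict):
        return definition["@id"]
    return definition  # the context maps the term directly to an IRI string


# The same statement written with and without the shorthand.
short_form = {"@type": "Protein"}
long_form = {"@type": "http://purl.obolibrary.org/obo/PR_000000001"}

# Both resolve to the same IRI, so a consumer need not care which was written.
same = expand_type(short_form, CONTEXT) == expand_type(long_form, {})
```

A real consumer would hand the markup to a full JSON-LD library and compare
the expanded documents rather than resolve terms by hand.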
As for JSON-LD, this is the only language supported by Bioschemas. However,
I believe some older event markup is written in RDFa. It shouldn't really
matter, as a parser can translate RDFa into the equivalent JSON-LD.

> Hence, I'm more inclined to go for (b) that defines a hollow shell for
> each profile such as http://bioschemas.org/Protein. The advantage is that
> it will always look the same whether a webpage uses the Bioschemas context
> or not. And this works the same across markup formats, JSON-LD, RDFa etc.
>
> (c) seems an interesting alternative. Instead of defining a JSON-LD
> context, we would define a Bioschemas vocabulary by means of DefinedTerms.
> For now, I don't quite understand how we would refer to the "Protein"
> defined term in a webpage markup. Any clues?
> Advantage: this solution avoids defining a Bioschemas profile as a type
> (option (b)), which makes the distinction between a type and a profile
> quite unclear.
> Still, I agree with Justin that there is a need for specific code to cope
> with such DefinedTerms. However, is this really an issue since, in any
> case, a Bioschemas extractor tool will have to know the profiles
> specifications to figure out what it looks for. Also, this is not much
> different from the additionalProperty case: there has to be some specific
> code to cope with it too. Right?
>

Yes, I think Bioschemas tools, such as validators, will need to recognize
certain fields and check them for cardinality, whether they are mandatory,
etc. How far this needs to go may depend on the application. A search
engine might largely not validate additionalType and just try to work with
whatever's there. I don't think any of the profiles specify particular
additionalProperties (?), so it might still be a free-for-all, with the
more difficult findability story that this implies.

I advocate (b) because it seems simpler than the alternatives, and I
believe the barrier to doing Bioschemas markup has to be as low as
possible.

> Franck.
>
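The kind of validator check mentioned above (mandatory fields,
cardinality) could be sketched roughly as follows. Everything here is
assumed for illustration: PROTEIN_PROFILE and validate are hypothetical
names, and the constraints are invented rather than taken from any
Bioschemas profile. The @type uses the option (b) style URI from the
thread.

```python
# Hypothetical profile: property name -> (mandatory?, maximum number of values).
# These constraints are invented for illustration only.
PROTEIN_PROFILE = {
    "identifier": (True, 1),
    "name": (True, 1),
    "sameAs": (False, None),  # None means no upper bound
}


def validate(markup, profile):
    """Return a list of human-readable problems found in the markup."""
    problems = []
    for prop, (mandatory, max_count) in profile.items():
        if prop not in markup:
            if mandatory:
                problems.append(f"missing mandatory property: {prop}")
            continue
        value = markup[prop]
        # JSON-LD allows a single value or a list of values for a property.
        values = value if isinstance(value, list) else [value]
        if max_count is not None and len(values) > max_count:
            problems.append(
                f"too many values for {prop}: {len(values)} > {max_count}"
            )
    return problems


markup = {
    "@type": "http://bioschemas.org/Protein",
    "name": ["Insulin", "INS"],
}
problems = validate(markup, PROTEIN_PROFILE)
```

A search engine could skip a check like this entirely and index whatever
it finds, while a dedicated validator would report each problem back to
the markup author.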
>
> On 28/06/2018 at 19:40, Justin Clark-Casey wrote:
>
> On Thu, 28 Jun 2018 at 16:42, ljgarcia <ljgarcia@ebi.ac.uk> wrote:
>
>> Hi,
>>
>> What Melanie suggests is useful to describe profiles, they would become
>> a DefinedTerm. That would help as well to avoid type/profile confusion.
>> We would talk then about DefinedTerms. If we find a way to also
>> describe the properties accepted with their restrictions, that would be
>> even better. That might be a good subject for a different discussion.
>>
>
> This means there will have to be special Bioschemas code that knows to
> look in a DefinedTerm somewhere for this information. I still think using
> a subtype to signify a profile will be simpler.
>
> I also disagree with Alasdair in that I think there should be a
> http://bioschemas.org/Protein type. This would be an empty type that just
> signifies we're talking about a Bioschemas defined protein, so it isn't
> treading on anybody's toes. This would have information saying it's
> defined by http://purl.obolibrary.org/obo/PR_000000001 and its sameAs
> terms. Without this, there's not much point having a bioschemas context,
> and requiring people to use this specific string every time is cumbersome,
> especially if every group chooses something from a different ontology.
> This makes writing and consuming markup harder.
>
>
>> The question remains. How do we choose a term over others to associate
>> it to a profile/DefinedTerm?
>>
>
> I suggest having members of each specification group propose which term
> they want and then come to consensus via discussion and/or vote.
>
>
>> Regards,
>>
>>
>> On 2018-06-28 15:45, Melanie Courtot wrote:
>> > Hi,
>> >
>> > We could consider using the defined terms,
>> > https://dataliberate.com/2018/06/18/schema-org-introduces-defined-terms/,
>> > to do that.
>> >
>> > So have a protein be defined as
>> >
>> >     "@type": "DefinedTerm",
>> >     "@id": "http://purl.obolibrary.org/obo/PR_000000001",
>> >     "name": "Protein",
>> >     "inDefinedTermSet": "http://bioschemas.org/terms",
>> >     "description": "An amino acid chain that is produced de
>> >     novo by ribosome-mediated translation of a genetically-encoded mRNA.",
>> >     "sameAs": "http://purl.obolibrary.org/obo/NCIT_C17021",
>> >     "sameAs": "http://semanticscience.org/resource/SIO_010043"
>> >
>> > (Using random examples of sameAs from
>> > https://www.ebi.ac.uk/ols/search?q=protein)
>> >
>> > Cheers,
>> > Melanie
>> >
>> > ---
>> > Melanie Courtot, PhD
>> > EMBL-EBI
>> > GA4GH/BioSamples project lead
>> >
>> >> On 28 Jun 2018, at 15:18, ljgarcia <ljgarcia@ebi.ac.uk> wrote:
>> >> Hi,
>> >>
>> >> I understood Franck's question in a different way.
>> >>
>> >> Alasdair says
>> >>
>> >>> I also agree that a context file should be provided which has the
>> >>> chosen types and terms in it, i.e. the context file would define
>> >>> Protein to be the URI http://purl.obolibrary.org/obo/PR_000000001.
>> >>
>> >> I think what Franck is asking is how to choose
>> >> http://purl.obolibrary.org/obo/PR_000000001 over other possible
>> >> terms to define a Protein. For the taxon case, same as it happens
>> >> with proteins, there are multiple possibilities. Franck, is this
>> >> your question? If it is, I do not think there is any agreement on
>> >> how to choose, other than going for well-known ontologies broadly
>> >> accepted by the community of interest, even better if the term is
>> >> mapped to other possible ones.
>> >>
>> >> Regards,
>> >>
>> >> On 2018-06-28 11:50, Gray, Alasdair J G wrote:
>> >> On 27 Jun 2018, at 19:19, Justin Clark-Casey <justinccdev@gmail.com>
>> >> wrote:
>> >> I think we should have mandatory known @types and properties. In
>> >> my view, Bioschemas should be as easy as possible to write and
>> >> consume.
>> >> Multiple options will increase cognitive load on writers
>> >> (which one do I choose? Why are these 2 examples using these
>> >> different terms?) and open the door to greater inconsistency.
>> >> Non-mandatory types will also raise the barriers for writing
>> >> Bioschemas software that will have to be aware of equivalent
>> >> mappings.
>> >>
>> >> I completely agree that we should have a single approved type for
>> >> each profile, and likewise for each property a single chosen term.
>> >> This is the whole point of having the profiles.
>> >>
>> >> I would go one step further and say that Bioschemas should provide
>> >> an http://bioschemas.org [1] context that will define types such
>> >> as Taxon, rather than blessing particular ontology terms.
>> >>
>> >> I also agree that a context file should be provided which has the
>> >> chosen types and terms in it, i.e. the context file would define
>> >> Protein to be the URI http://purl.obolibrary.org/obo/PR_000000001.
>> >> To be completely explicit, we would not be defining a type in the
>> >> bioschemas namespace, e.g. http://bioschemas.org/Protein.
>> >>
>> >> This context can also document equivalent terms in different
>> >> ontologies.
>> >>
>> >> I like the idea that this also contains mappings to the equivalent
>> >> terms in other ontologies.
>> >>
>> >> Alasdair
>> >>
>> >> Alasdair J G Gray
>> >> Fellow of the Higher Education Academy
>> >> Assistant Professor in Computer Science,
>> >> School of Mathematical and Computer Sciences
>> >> (Athena SWAN Bronze Award)
>> >> Heriot-Watt University, Edinburgh UK.
>> >> Email: A.J.G.Gray@hw.ac.uk
>> >> Web: http://www.macs.hw.ac.uk/~ajg33
>> >> ORCID: http://orcid.org/0000-0002-5711-4872
>> >> Office: Earl Mountbatten Building 1.39
>> >> Twitter: @gray_alasdair
>> >>
>> >> Links:
>> >> ------
>> >> [1] http://bioschemas.org/
Received on Wednesday, 4 July 2018 18:30:25 UTC