- From: Stephen Anyango <anyango@ebi.ac.uk>
- Date: Tue, 14 Nov 2017 11:56:31 +0000
- To: public-bioschemas@w3.org
- Message-ID: <1ec48755-cd19-801f-4ced-e2213b7fb4c5@ebi.ac.uk>
Hello,

As a data provider, we are not particular about the ontology/IRI. The important thing, I believe, is consistency, which is easy to document and can therefore serve as a reference.

Option 2 appears to put the overhead on tool developers/data consumers, and not necessarily on data providers.

Kind regards,
Stephen Anyango
PDBe
EMBL-EBI

On 14-Nov-17 11:03 AM, Leyla Garcia wrote:
> Hi all,
>
> Would it be right to summarize our current options as the following?
>
> 1. There will be a unique Bioschemas context which will define
> recommended aliases together with officially mandatory agreed IRIs for
> all different profiles. If data providers want to use alternative
> IRIs, they can do so via additionalType. Data consumers can go to any
> data provider and directly get the mark up from them as all data
> providers will use the agreed IRIs.
>
> 2. There will be a Bioschemas context template that will state the
> officially mandatory agreed aliases and the recommended canonical
> IRIs. Data providers can use alternative IRIs. Bioschemas will provide
> a translation service that will take any mark up to the canonical
> form. Data consumers would retrieve the mark up via Bioschemas
> translator (otherwise they will end up with all sorts of IRIs).
>
> I would say option 1 is what schema.org actually does. Rather than
> using myOntology:citation, if I want to comply with schema.org, I use
> schema:citation and so on.
>
> I still would like to hear some thoughts from schema.org people, as
> well as from other data providers. As a data provider, we are happy
> either way, we can accommodate.
>
> I would also suggest that the Governance group propose a date for
> voting on this matter as, at some point, we have to make a decision. No
> pressure, but it would be great if such a decision can be reached before
> the first week of December so it can be included in the poster we will
> have at SWAT4LS.
>
> Regards,
>
>
> On 14/11/2017 10:21, Franck Michel wrote:
>> Dear all,
>>
>> I'd like to bring a few elements into the discussion regarding aliases.
>>
>> In JSON-LD, aliases are just a handy short-cut notation with a local
>> scope: an alias just applies within the scope of the context where it
>> is defined. And more importantly, an alias should not bear any
>> meaning. The first thing a consumer app does with JSON-LD is to
>> expand all terms, which immediately removes all aliases.
>>
>> Hence, if I use the Bioschemas.org default context:
>>     "@context": { "Gene": { "@id": "http://purl.obolibrary.org/obo/SO_0000704" } ... }
>> I will typically write:
>>     "@type": [ "BioChemEntity", "Gene" ]
>>
>> But I may well write a document with a custom alias:
>>     "@context": { "GeneAlias": { "@id": "http://purl.obolibrary.org/obo/SO_0000704" } ... }
>> and write:
>>     "@type": [ "BioChemEntity", "GeneAlias" ]
>> With:
>>     "@context": { "obo": { "@id": "http://purl.obolibrary.org/obo/" } ... }
>> I would write:
>>     "@type": [ "BioChemEntity", "obo:SO_0000704" ]
>> Or I could even not use any alias:
>>     "@type": [ "BioChemEntity", "http://purl.obolibrary.org/obo/SO_0000704" ]
>>
>> These are all equivalent from the point of view of a data consumer.
>>
>> In my view, the default context should be a useful guide for those
>> annotating data with Bioschemas.org markup, but alias names should
>> not matter at all. What matters is the URIs to which aliases resolve.
>>
>> I feel like the solution of agreed pre-defined URIs, whatever the
>> aliases used, is more sustainable. After all, schema.org advocates
>> for the use of specific agreed-upon terms. If one uses them, their
>> pages are more likely to be discoverable. They can choose to use other
>> terms if this is convenient for them, but then there is no guarantee
>> that the pages will be discovered as easily.
>>
>> Franck.
>>
>>
>> On 13/11/2017 at 19:02, Leyla Garcia wrote:
>>> Hi all,
>>>
>>> Rather than relying on Bioschemas clients to do the hard work on
>>> mapping, I was thinking to leave this to Bioschemas itself. So, if a
>>> client wants to retrieve the, let's say, "canonical" Bioschemas
>>> markup (which will use the recommended ontology terms as defined by
>>> main providers for recommended and minimum properties) then this
>>> client will use a Bioschemas provided tool. If a client is happy
>>> with a customized Bioschemas mark up (using whichever preferred
>>> ontology terms but always the predefined aliases) then this client
>>> will go directly to the source. Any optional property with no alias
>>> will remain as provided. Whenever possible, data providers will
>>> prefer schema.org and Bioschemas named properties.
>>>
>>> In this way we support freedom of ontology terms choice, but also
>>> support collation of information from multiple sources (soft way to
>>> refer to data integration).
>>>
>>> How does it sound? How would that work for Bioschemas? A canonical
>>> transforming tool/web service should be provided as well as servers
>>> and maintenance. How would this work for schema.org/Google? Dan, via
>>> Alasdair, kind of proposed the use of third-party properties. How
>>> about this alias-based way?
>>>
>>> Regards,
>>>
>>> On 13/11/2017 16:00, Melanie Courtot wrote:
>>>> How does that currently work for schema.org, and could the same be
>>>> used with Bioschemas?
>>>>
>>>> Looking at Bioschemas as a markup language for existing data, we
>>>> should aim for the lowest adoption threshold possible, including
>>>> unconstrained ontology terms, keeping required properties minimal,
>>>> and not having an overly complicated structure with many new
>>>> properties; I worry that otherwise people will just not use it.
>>>>
>>>>
>>>>
>>>> On 10/11/2017 18:10, Justin Clark-Casey wrote:
>>>>> 'Data integration' is probably too strong a phrase for what I have
>>>>> in mind.
>>>>> I'm really thinking about discovery and how a search
>>>>> engine (for example) may know/integrate that 2 different data
>>>>> sources are talking about the same thing, so that the user gets
>>>>> a useful/linked set of search results.
>>>>>
>>>>> If a user wanted to find proteins transcribed by gene 'ABL1'
>>>>> (following the examples), then I think it would be a lot simpler
>>>>> if all the JSON-LD uses the term
>>>>> "http://semanticscience.org/resource/is-transcribed-from".
>>>>> Otherwise a search engine and maybe other applications would need
>>>>> to be aware of all the mappings to other terms (I know OLS
>>>>> can/will provide this but this will increase application complexity).
>>>>>
>>>>> I should be clear that this is thought programming on my part, I
>>>>> haven't actually tried to implement anything yet :) It could well
>>>>> be that there's a lot of value in sources using whatever terms are
>>>>> optimal for them, and that costs of trying to co-ordinate IRIs are
>>>>> too high. But I do want to debate the possible tradeoffs.
>>>>>
>>>>> On Fri, Nov 10, 2017 at 5:39 PM, Melanie Courtot
>>>>> <mcourtot@ebi.ac.uk> wrote:
>>>>>
>>>>> Is data integration really a use case for Bioschemas? The
>>>>> stated goal of Bioschemas is to extend schema.org
>>>>> to provide markup for pages, and IIRC the
>>>>> use cases discussed at the last meeting were about discovery
>>>>> and retrieval.
>>>>>
>>>>> Cheers,
>>>>> Melanie
>>>>>
>>>>>
>>>>>
>>>>> On 10/11/2017 16:30, Justin Clark-Casey wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 10/11/17 14:21, ljgarcia wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I thought we did not want to impose any IRI.
>>>>> Is there any reason why we should?
>>>>>
>>>>>
>>>>> But then we sacrifice the interoperability and
>>>>> understanding that we are striving for.
>>>>> If you look at the n-quads for the two examples
>>>>> (included at the end of this email) then you will
>>>>> see a different set of triples.
>>>>>
>>>>>
>>>>> If there are mappings between the terms, that
>>>>> interoperability we want to achieve could still be
>>>>> achieved, could it not? With mappings, we still can
>>>>> transform any n-quads to the, let's say, canonical
>>>>> Bioschemas defined form. Would this not be a way? If a
>>>>> mapping cannot be found, then validation fails.
>>>>> Bioschemas should then use mapping tools and clearly
>>>>> state which mapping tool is used.
>>>>>
>>>>>
>>>>> If consuming applications have to use term mappings then
>>>>> this will make them much harder to write, and in some
>>>>> cases might make it impossible to integrate some
>>>>> information. This might only be a problem for code that is
>>>>> trying to integrate data across websites, but this is an
>>>>> important use case.
>>>>>
>>>>> At least for mandatory properties and types, and major
>>>>> profiles (gene, protein, etc.), I would like to see
>>>>> pre-agreed IRIs, rather than free choice or emerging
>>>>> consensus. In some ways, I don't think this is so
>>>>> different from what we are doing with DataCatalog, Sample,
>>>>> TrainingMaterial, etc.
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> On 2017-11-10 14:07, Gray, Alasdair J G wrote:
>>>>>
>>>>> On 10 Nov 2017, at 13:28, Leyla Garcia
>>>>> <ljgarcia@ebi.ac.uk> wrote:
>>>>> I was under the same impression as Melanie.
>>>>> We agree on aliases but providers can decide what is their
>>>>> preferred IRI for any of them. A Bioschemas Protein context
>>>>> would just provide a default context that can also be used as
>>>>> a template where IRIs (but not aliases) can be modified. And
>>>>> of course, anyone could add more aliases, Bioschemas will just
>>>>> not parse those outside the default/template provided context.
>>>>>
>>>>> I thought we did not want to impose any IRI.
>>>>> Is there any reason why we should?
>>>>>
>>>>>
>>>>> But then we sacrifice the interoperability and
>>>>> understanding that we are striving for. If you look at the
>>>>> n-quads for the two examples (included at the end of this
>>>>> email) then you will see a different set of triples. Aliases
>>>>> are only defined within the document. When you interpret them
>>>>> they give you different meanings. If we go down this route, we
>>>>> would need to make our tooling with knowledge of either all
>>>>> possible terms that will be used or mapping aware.
>>>>>
>>>>> Alasdair
>>>>>
>>>>> http://tinyurl.com/y9mu423y
>>>>>
>>>>> <http://identifiers.org/ncbigene/25> <http://schema.org/name> "ABL1" .
>>>>> <http://identifiers.org/ncbigene/25> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000704> .
>>>>> <http://identifiers.org/ncbigene/25> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/BioChemEntity> .
>>>>> <http://identifiers.org/uniprot/P00519> <http://schema.org/alternateName> "ABL" .
>>>>> <http://identifiers.org/uniprot/P00519> <http://schema.org/alternateName> "JTK7" .
>>>>> <http://identifiers.org/uniprot/P00519> <http://schema.org/description> "Non-receptor tyrosine-protein kinase that plays a role..." .
>>>>> <http://identifiers.org/uniprot/P00519> <http://schema.org/name> "ABL1" .
>>>>> <http://identifiers.org/uniprot/P00519> <http://semanticscience.org/resource/SIO_000001> <http://pfam.xfam.org/clan/CL0001> .
>>>>> <http://identifiers.org/uniprot/P00519> <http://semanticscience.org/resource/SIO_010081> <http://identifiers.org/ncbigene/25> .
>>>>> <http://identifiers.org/uniprot/P00519> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/PR_000000001> .
>>>>> <http://identifiers.org/uniprot/P00519> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/BioChemEntity> .
>>>>>
>>>>> http://tinyurl.com/yd5snze2
>>>>>
>>>>> <http://identifiers.org/ncbigene/25> <http://schema.org/name> "ABL1" .
>>>>> <http://identifiers.org/ncbigene/25> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/OGI_0000004> .
>>>>> <http://identifiers.org/ncbigene/25> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/BioChemEntity> .
>>>>> <http://identifiers.org/uniprot/P00519> <http://purl.obolibrary.org/obo/RO_0002510> <http://identifiers.org/ncbigene/25> .
>>>>> <http://identifiers.org/uniprot/P00519> <http://schema.org/alternateName> "ABL" .
>>>>> <http://identifiers.org/uniprot/P00519> <http://schema.org/alternateName> "JTK7" .
>>>>> <http://identifiers.org/uniprot/P00519> <http://schema.org/description> "Non-receptor tyrosine-protein kinase that plays a role..." .
>>>>> <http://identifiers.org/uniprot/P00519> <http://schema.org/name> "ABL1" .
>>>>> <http://identifiers.org/uniprot/P00519> <http://semanticscience.org/resource/SIO_000001> <http://pfam.xfam.org/clan/CL0001> .
>>>>> <http://identifiers.org/uniprot/P00519> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/NCIT_C17021> .
>>>>> <http://identifiers.org/uniprot/P00519> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/BioChemEntity> .
>>>>>
>>>>> Alasdair J G Gray
>>>>>
>>>>> Fellow of the Higher Education Academy
>>>>> Assistant Professor in Computer Science,
>>>>> School of Mathematical and Computer Sciences
>>>>> (Athena SWAN Bronze Award)
>>>>> Heriot-Watt University, Edinburgh UK.
>>>>>
>>>>> Email: A.J.G.Gray@hw.ac.uk
>>>>> Web: http://www.macs.hw.ac.uk/~ajg33
>>>>> ORCID: http://orcid.org/0000-0002-5711-4872
>>>>> Office: Earl Mountbatten Building 1.39
>>>>> Twitter: @gray_alasdair
Received on Tuesday, 14 November 2017 11:56:59 UTC
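
Franck Michel's point in the thread above, that the four ways of writing the Gene type are equivalent once terms are expanded, can be illustrated with a minimal sketch. It assumes a heavily simplified term resolution, not the full JSON-LD expansion algorithm that a real consumer would get from a JSON-LD processor. The helper names expand_term and expand_types are hypothetical, and the "@vocab" entry that maps BioChemEntity to http://schema.org/ is an assumption based on the n-quads quoted in the thread.

```python
SO_GENE = "http://purl.obolibrary.org/obo/SO_0000704"
SCHEMA = "http://schema.org/"  # assumed default vocabulary


def _iri(definition):
    """A term definition is either a plain IRI string or {"@id": IRI}."""
    return definition["@id"] if isinstance(definition, dict) else definition


def expand_term(term, context):
    """Resolve one @type value to an absolute IRI (simplified sketch)."""
    if term in context:                       # alias defined in the context
        return _iri(context[term])
    if term.startswith(("http://", "https://")):
        return term                           # already an absolute IRI
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in context:             # compact IRI such as obo:SO_0000704
        return _iri(context[prefix]) + suffix
    if "@vocab" in context:                   # fall back to the default vocabulary
        return context["@vocab"] + term
    raise ValueError(f"cannot expand term: {term!r}")


def expand_types(doc):
    """Return the set of absolute type IRIs of a document."""
    return frozenset(expand_term(t, doc["@context"]) for t in doc["@type"])


# The four variants from the message above.
docs = [
    {"@context": {"@vocab": SCHEMA, "Gene": {"@id": SO_GENE}},
     "@type": ["BioChemEntity", "Gene"]},
    {"@context": {"@vocab": SCHEMA, "GeneAlias": {"@id": SO_GENE}},
     "@type": ["BioChemEntity", "GeneAlias"]},
    {"@context": {"@vocab": SCHEMA,
                  "obo": {"@id": "http://purl.obolibrary.org/obo/"}},
     "@type": ["BioChemEntity", "obo:SO_0000704"]},
    {"@context": {"@vocab": SCHEMA},
     "@type": ["BioChemEntity", SO_GENE]},
]

expanded = [expand_types(d) for d in docs]
all_equivalent = all(e == expanded[0] for e in expanded)
print(all_equivalent)
```

Under these assumptions all four documents expand to the same pair of type IRIs, so the alias names themselves carry no meaning for a consumer. The two n-quads examples quoted from Alasdair Gray differ for another reason: they use different IRIs (for example SO_0000704 versus OGI_0000004), which no amount of alias expansion can reconcile without a mapping.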