Re: Protein representation with a Bioschemas context

Hi all,

Rather than relying on Bioschemas clients to do the hard work of 
mapping, I was thinking of leaving this to Bioschemas itself. So, if a 
client wants to retrieve the, let's say, "canonical" Bioschemas markup 
(which will use the recommended ontology terms as defined by the main 
providers for recommended and minimum properties), then this client will 
use a Bioschemas-provided tool. If a client is happy with customized 
Bioschemas markup (using whichever ontology terms the provider prefers, 
but always the predefined aliases), then this client will go directly to 
the source. Any optional property with no alias will remain as provided. 
Whenever possible, data providers will prefer schema.org and Bioschemas 
named properties.
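
To make the "customized markup with predefined aliases" idea concrete, 
here is a minimal sketch in Python. It is not the JSON-LD expansion 
algorithm, only a simplified stand-in using plain dictionaries. The alias 
name "transcribedFrom" and the helper names are hypothetical; the IRIs are 
the ones that appear in the two examples quoted below.

```python
# Hypothetical default/template context: alias -> IRI.
# The alias names are illustrative; they are not an agreed Bioschemas context.
DEFAULT_CONTEXT = {
    "name": "http://schema.org/name",
    "transcribedFrom": "http://semanticscience.org/resource/SIO_010081",
}


def provider_context(overrides):
    """Return the default context with some IRIs replaced by the provider.

    Aliases are fixed: an override for an alias that is not in the
    default context is rejected.
    """
    unknown = set(overrides) - set(DEFAULT_CONTEXT)
    if unknown:
        raise KeyError(f"aliases not in the default context: {sorted(unknown)}")
    return {**DEFAULT_CONTEXT, **overrides}


def expand(doc, context):
    """Replace each aliased key with its IRI; keep un-aliased keys as provided."""
    return {context.get(key, key): value for key, value in doc.items()}


# The same markup, written with the predefined aliases plus one optional
# property that has no alias and is therefore given as a full IRI.
doc = {
    "name": "ABL1",
    "transcribedFrom": "http://identifiers.org/ncbigene/25",
    "http://semanticscience.org/resource/SIO_000001": "http://pfam.xfam.org/clan/CL0001",
}

# A provider that prefers an RO term keeps the alias and swaps only the IRI.
custom = provider_context(
    {"transcribedFrom": "http://purl.obolibrary.org/obo/RO_0002510"}
)

default_view = expand(doc, DEFAULT_CONTEXT)
custom_view = expand(doc, custom)
```

The two views share the schema.org property and the un-aliased property, 
but differ in the IRI behind "transcribedFrom", which is exactly the 
difference a client going directly to the source would have to cope with.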

In this way we support freedom in the choice of ontology terms, but also 
support collation of information from multiple sources (a soft way to 
refer to data integration).
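
A rough sketch of how a canonicalizing tool could support that collation 
follows. The mapping table is an assumption: it pairs IRIs only because 
they play the same role in the two n-quads examples quoted below, not 
because any agreed mapping exists, and the choice of which side is 
"canonical" is arbitrary. The function names are hypothetical too.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SIO = "http://semanticscience.org/resource/"
OBO = "http://purl.obolibrary.org/obo/"

PROTEIN = "http://identifiers.org/uniprot/P00519"
GENE = "http://identifiers.org/ncbigene/25"

# Assumed mapping: provider-chosen IRI -> canonical IRI.
TO_CANONICAL = {
    OBO + "RO_0002510": SIO + "SIO_010081",    # protein-to-gene property
    OBO + "NCIT_C17021": OBO + "PR_000000001",  # protein type
    OBO + "OGI_0000004": OBO + "SO_0000704",    # gene type
}


def canonicalize(triples):
    """Rewrite predicates and rdf:type objects to their canonical IRIs.

    Terms without a mapping are kept as provided.
    """
    result = set()
    for subject, predicate, obj in triples:
        predicate = TO_CANONICAL.get(predicate, predicate)
        if predicate == RDF_TYPE:
            obj = TO_CANONICAL.get(obj, obj)
        result.add((subject, predicate, obj))
    return result


def transcribed_from(triples, gene):
    """Subjects linked to `gene` through the canonical property only."""
    return {s for s, p, o in triples if p == SIO + "SIO_010081" and o == gene}


# A few triples from each of the two examples in this thread.
source_a = {
    (PROTEIN, SIO + "SIO_010081", GENE),
    (PROTEIN, RDF_TYPE, OBO + "PR_000000001"),
    (GENE, RDF_TYPE, OBO + "SO_0000704"),
}
source_b = {
    (PROTEIN, OBO + "RO_0002510", GENE),
    (PROTEIN, RDF_TYPE, OBO + "NCIT_C17021"),
    (GENE, RDF_TYPE, OBO + "OGI_0000004"),
}

# Without the tool a client asking for the canonical property misses source_b.
missed = transcribed_from(source_b, GENE)
# With the tool both sources answer the same query.
found = transcribed_from(canonicalize(source_a) | canonicalize(source_b), GENE)
```

Under these assumptions a client needs to know one property IRI only, 
and the mapping knowledge lives in the tool rather than in every client.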

How does it sound? How would that work for Bioschemas? A canonical 
transformation tool/web service would need to be provided, as well as 
servers and maintenance. How would this work for schema.org/Google? Dan, 
via Alasdair, kind of proposed the use of third-party properties. How 
would this alias-based approach fit with that?

Regards,

On 13/11/2017 16:00, Melanie Courtot wrote:
> How does that currently work for schema.org, and could the same be 
> used with Bioschemas?
>
> Looking at Bioschemas as a markup language for existing data, we 
> should aim for the lowest adoption threshold possible, including 
> unconstrained ontology terms, keeping required properties minimal, and 
> not having an overly complicated structure with many new properties; I 
> worry that otherwise people will just not use it.
>
>
>
> On 10/11/2017 18:10, Justin Clark-Casey wrote:
>> 'Data integration' is probably too strong a phrase for what I have in 
>> mind.  I'm really thinking about discovery and how a search engine 
>> (for example) may know/integrate that 2 different data sources are 
>> talking about the same thing, so that the user gets a 
>> useful/linked set of search results.
>>
>> If a user wanted to find proteins transcribed by gene 'ABL1' 
>> (following the examples), then I think it would be a lot simpler if 
>> all the JSON-LD uses the term
>> "http://semanticscience.org/resource/is-transcribed-from".  Otherwise 
>> a search engine and maybe other applications would need to be aware 
>> of all the mappings to other terms (I know OLS can/will provide this 
>> but this will increase application complexity).
>>
>> I should be clear that this is thought programming on my part, I 
>> haven't actually tried to implement anything yet :)  It could well be 
>> that there's a lot of value in sources using whatever terms are 
>> optimal for them, and that costs of trying to co-ordinate IRIs are 
>> too high.  But I do want to debate the possible tradeoffs.
>>
>> On Fri, Nov 10, 2017 at 5:39 PM, Melanie Courtot 
>> <mcourtot@ebi.ac.uk> wrote:
>>
>>     Is data integration really a use case for Bioschemas?  The stated
>>     goal of Bioschemas is to extend schema.org to provide markup for
>>     pages, and IIRC the use cases discussed at the last meeting were
>>     about discovery and retrieval.
>>
>>     Cheers,
>>     Melanie
>>
>>
>>
>>     On 10/11/2017 16:30, Justin Clark-Casey wrote:
>>
>>
>>
>>         On 10/11/17 14:21, ljgarcia wrote:
>>
>>             Hi,
>>
>>                     I thought we did not want to impose any IRI. Is
>>                     there any reason why
>>                     we should?
>>
>>
>>                 But then we sacrifice the interoperability and
>>                 understanding that we
>>                 are striving for. If you look at the n-quads for the
>>                 two examples
>>                 (included at the end of this email) then you will see
>>                 a different set
>>                 of triples.
>>
>>
>>             If there are mappings between the terms, the
>>             interoperability we want to achieve could still be
>>             achieved, could it not? With mappings, we can still
>>             transform any n-quads to the, let's say, canonical
>>             Bioschemas-defined form. Would this not be a way? If a
>>             mapping cannot be found, then validation fails.
>>             Bioschemas should then use mapping tools and clearly
>>             state which mapping tool is used.
>>
>>
>>         If consuming applications have to use term mappings then this
>>         will make them much harder to write, and in some cases might
>>         make it impossible to integrate some information.  This might
>>         only be a problem for code that is trying to integrate data
>>         across websites, but this is an important use case.
>>
>>         At least for mandatory properties and types, and major
>>         profiles (gene, protein, etc.), I would like to see
>>         pre-agreed IRIs, rather than free choice or emerging
>>         consensus.  In some ways, I don't think this is so different
>>         from what we are doing with DataCatalog, Sample,
>>         TrainingMaterial, etc.
>>
>>
>>             Regards,
>>
>>             On 2017-11-10 14:07, Gray, Alasdair J G wrote:
>>
>>                     On 10 Nov 2017, at 13:28, Leyla Garcia
>>                     <ljgarcia@ebi.ac.uk> wrote:
>>                     I was under the same impression as Melanie. We
>>                     agree on aliases
>>                     but providers can decide what is their preferred
>>                     IRI for any of
>>                     them. A Bioschemas Protein context would just
>>                     provide a default
>>                     context that can also be used as a template where
>>                     IRIs (but not
>>                     aliases) can be modified. And of course, anyone
>>                     could add more
>>                     aliases, Bioschemas will just not parse those
>>                     outside the
>>                     default/template provided context.
>>
>>                     I thought we did not want to impose any IRI. Is
>>                     there any reason why
>>                     we should?
>>
>>
>>                 But then we sacrifice the interoperability and
>>                 understanding that we
>>                 are striving for. If you look at the n-quads for the
>>                 two examples
>>                 (included at the end of this email) then you will see
>>                 a different set
>>                 of triples. Aliases are only defined within the
>>                 document. When you
>>                 interpret them they give you different meanings. If
>>                 we go down this
>>                 route, we would need to make our tooling with
>>                 knowledge of either all
>>                 possible terms that will be used or mapping aware.
>>
>>                 Alasdair
>>
>>                 http://tinyurl.com/y9mu423y
>>
>>                 <http://identifiers.org/ncbigene/25> <http://schema.org/name> "ABL1" .
>>                 <http://identifiers.org/ncbigene/25> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/SO_0000704> .
>>                 <http://identifiers.org/ncbigene/25> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/BioChemEntity> .
>>                 <http://identifiers.org/uniprot/P00519> <http://schema.org/alternateName> "ABL" .
>>                 <http://identifiers.org/uniprot/P00519> <http://schema.org/alternateName> "JTK7" .
>>                 <http://identifiers.org/uniprot/P00519> <http://schema.org/description> "Non-receptor tyrosine-protein kinase that plays a role..." .
>>                 <http://identifiers.org/uniprot/P00519> <http://schema.org/name> "ABL1" .
>>                 <http://identifiers.org/uniprot/P00519> <http://semanticscience.org/resource/SIO_000001> <http://pfam.xfam.org/clan/CL0001> .
>>                 <http://identifiers.org/uniprot/P00519> <http://semanticscience.org/resource/SIO_010081> <http://identifiers.org/ncbigene/25> .
>>                 <http://identifiers.org/uniprot/P00519> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/PR_000000001> .
>>                 <http://identifiers.org/uniprot/P00519> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/BioChemEntity> .
>>
>>                 http://tinyurl.com/yd5snze2
>>
>>                 <http://identifiers.org/ncbigene/25> <http://schema.org/name> "ABL1" .
>>                 <http://identifiers.org/ncbigene/25> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/OGI_0000004> .
>>                 <http://identifiers.org/ncbigene/25> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/BioChemEntity> .
>>                 <http://identifiers.org/uniprot/P00519> <http://purl.obolibrary.org/obo/RO_0002510> <http://identifiers.org/ncbigene/25> .
>>                 <http://identifiers.org/uniprot/P00519> <http://schema.org/alternateName> "ABL" .
>>                 <http://identifiers.org/uniprot/P00519> <http://schema.org/alternateName> "JTK7" .
>>                 <http://identifiers.org/uniprot/P00519> <http://schema.org/description> "Non-receptor tyrosine-protein kinase that plays a role..." .
>>                 <http://identifiers.org/uniprot/P00519> <http://schema.org/name> "ABL1" .
>>                 <http://identifiers.org/uniprot/P00519> <http://semanticscience.org/resource/SIO_000001> <http://pfam.xfam.org/clan/CL0001> .
>>                 <http://identifiers.org/uniprot/P00519> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://purl.obolibrary.org/obo/NCIT_C17021> .
>>                 <http://identifiers.org/uniprot/P00519> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://schema.org/BioChemEntity> .
>>
>>                 Alasdair J G Gray
>>
>>                  Fellow of the Higher Education Academy
>>                 Assistant Professor in Computer Science,
>>                 School of Mathematical and Computer Sciences
>>                 (Athena SWAN Bronze Award)
>>                 Heriot-Watt University, Edinburgh UK.
>>
>>                 Email: A.J.G.Gray@hw.ac.uk
>>                 Web: http://www.macs.hw.ac.uk/~ajg33
>>                 ORCID: http://orcid.org/0000-0002-5711-4872
>>                 Office: Earl Mountbatten Building 1.39
>>                 Twitter: @gray_alasdair
>>
>>
>

Received on Monday, 13 November 2017 18:02:33 UTC