Re: Identifiers (was Notes from today's meeting) from Jerven Bolleman on 2013-06-04 (public-semweb-lifesci@w3.org from June 2013)

From: Jerven Bolleman <me@jerven.eu>
Date: Tue, 4 Jun 2013 14:56:17 +0200
To: Michel Dumontier <michel.dumontier@gmail.com>
Cc: N Juty <juty@ebi.ac.uk>, Joachim Baran <joachim.baran@gmail.com>, Alasdair J G Gray <Alasdair.Gray@manchester.ac.uk>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Message-ID: <CAHM_hUOxvF_jEwQK=UedALT-QD6cK+Ec9OKNzdBJXiNJLusbug@mail.gmail.com>

On Tue, Jun 4, 2013 at 2:52 PM, Michel Dumontier <michel.dumontier@gmail.com> wrote:

>
>
>
> On Tue, Jun 4, 2013 at 2:47 PM, Jerven Bolleman <me@jerven.eu> wrote:
>
>> Hi All,
>>
>> The problem with the currently suggested solution is that it does not
>> account for a mix of identifiers in a dataset.
>> This is because in RDF we have resources, not data items.
>>
>> So in the case of uniprot.org we have lots of identifiers in a single
>> dataset, identifying annotations, ranges, GO terms, etc.
>> The modeling does not support the common use case of having multiple
>> identifier sources in a single dataset.
>>
>> So if I have a dataset like this:
>>
>> uniprot:P12345 a up:Protein ;
>>                        up:enzyme ec:1.2.3.4 .
>> ec:1.2.3.4 a up:Enzyme .
>>
>> How do I correctly describe the fact that a user should expect
>> identifiers of both the form 1.2.3.4 and the form P12345?
>>
>>
> Easy: provide two matching patterns.
>
What if my data is:

uniprot:1.2.3.4 a up:Protein ;
                       up:enzyme ec:P12345 .
ec:P12345 a up:Enzyme .

What if I don't have a regular expression for one of the sets? Or what if
two of the patterns are very similar, e.g. MGI and PubMed?
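
To make the ambiguity concrete, here is a minimal Python sketch. The
patterns are illustrative assumptions, not the regular expressions from any
registry, and candidate_sources is a hypothetical helper name. The "mgi"
entry assumes the numeric part of an accession with any prefix stripped.

```python
import re

# Illustrative patterns only (assumptions, not taken from any registry).
PATTERNS = {
    "uniprot": re.compile(r"[A-Z][0-9][A-Z0-9]{3}[0-9]"),
    "ec": re.compile(r"\d+\.\d+\.\d+\.\d+"),
    "pubmed": re.compile(r"\d+"),
    "mgi": re.compile(r"\d+"),  # assumed: numeric part only, prefix stripped
}


def candidate_sources(local_id):
    """Return every source whose pattern matches the whole identifier."""
    return sorted(
        name for name, pattern in PATTERNS.items() if pattern.fullmatch(local_id)
    )


for local_id in ("P12345", "1.2.3.4", "12345"):
    print(local_id, candidate_sources(local_id))
```

Under these assumed patterns, P12345 and 1.2.3.4 each match a single source,
but 12345 matches both "mgi" and "pubmed", so a pattern alone cannot say
which source it belongs to. A pattern also says nothing about which
namespace a dataset actually uses it under, which is the swapped case above.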



> m.
>
>
>> Also look beyond the boundaries of life science. What happens when you
>> add geo or physical data?
>>
>> Regards,
>> Jerven
>>
>>
>>
>> On Tue, Jun 4, 2013 at 1:45 PM, N Juty <juty@ebi.ac.uk> wrote:
>>
>>> Hi,
>>>
>>> The GO abbreviations and cross-referencing list is one of a few possible
>>> lists that could be used, but there would be overlap and inconsistencies in
>>> coverage and namespace assignments, especially when using more than one
>>> such list to bridge any gaps. A lot of these lists are also 'static', with
>>> no real way to add new information.
>>>
>>> If we went down the route of a global 'authority', I would hope
>>> Identifiers.org would be a good candidate; we have gone to a lot of effort
>>> in collating data from a variety of such cross-referencing lists. Right now
>>> we are working on incorporating namespace, resource, regex information,
>>> etc. from Michel's extensive list:
>>> https://docs.google.com/spreadsheet/ccc?key=0AmzqhEUDpIPvdFR0UFhDUTZJdnNYdnJwdHdvNVlJR1E#gid=0
>>>
>>> In addition, since Identifiers.org has a dedicated curation team, we
>>> regard ourselves as being quite responsive and proactive...
>>>
>>> cheers
>>>
>>> Nick
>>>
>>>
>>>
>>>
>>>
>>> On 04/06/13 12:12, Joachim Baran wrote:
>>>
>>>> Hello,
>>>>
>>>>
>>>> On 2013-06-04, at 5:27 AM, Alasdair J G Gray
>>>> <Alasdair.Gray@manchester.ac.uk> wrote:
>>>>
>>>>> Again, there is a scoping problem. Prefixes are locally scoped and
>>>>> must be defined.
>>>>>
>>>>   At least in life sciences, there are the Gene Ontology abbreviations
>>>> for cross-referenced databases:
>>>> http://www.geneontology.org/doc/GO.xrf_abbs
>>>>
>>>>   That document defines a wide range of prefixes, base URIs and URI
>>>> templates for resolving relevant identifiers, and provides regexps for
>>>> validating the syntax of identifiers.
>>>>
>>>>   I think that the GO xrefs are extremely useful, and I am on
>>>> Michel's side about including them.
>>>>
>>>> Best wishes,
>>>> Joachim
>>>>
>>>
>>>
>>> --
>>> --------------------------------------------------------
>>> Nick Juty
>>> Database Curator
>>> European Bioinformatics Institute
>>> Cambridge, United Kingdom
>>> --------------------------------------------------------
>>>
>>>
>>
>>
>> --
>> Jerven Bolleman
>> me@jerven.eu
>>
>
>
>
> --
> Michel Dumontier
> Associate Professor of Bioinformatics, Carleton University
> Chair, W3C Semantic Web for Health Care and the Life Sciences Interest
> Group
> http://dumontierlab.com
>



-- 
Jerven Bolleman
me@jerven.eu

Received on Tuesday, 4 June 2013 12:56:48 UTC