Re: proposal for standard NCBI database URI

From: Jun Zhao <zhaoj@cs.man.ac.uk>
Date: Wed, 10 May 2006 12:42:22 +0100
Message-ID: <4461D19E.5020406@cs.man.ac.uk>
To: Matthias Samwald <samwald@gmx.at>
CC: Phillip Lord <phillip.lord@newcastle.ac.uk>, public-semweb-lifesci@w3.org

Matthias Samwald wrote:

>> Also, it's not clear what is meant by "same thing".
>>
>> A GenBank record and an EMBL record identifying the same piece of DNA
>> are not the same thing; they are different records.
>
>> Or probably "different record, but same gene, according to some criteria"
>
>Clearly, there should be different URIs for different records. However, the problem discussed higher up in this thread was that there could be different URIs for EXACTLY the same record, simply because a different namespace is used, for instance. Just a single differing character is sufficient. If we do not try to make an effort to avoid even this simple problem, our ultimate goal (data/information integration) seems unreachable.  
>  
>
Yes, agreed. The real world is messy. Duplicate identities often exist 
for the same data object. In myGrid, we are interested in finding where and 
how the same data was produced in different experiments, similar to what 
BioDash does. We have to build sameAs relationships between 
duplicate identities to live with this messy world.
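
A minimal sketch of how such sameAs links could be grouped, assuming the
links are available as simple pairs of identifiers. The function name
same_as_groups and all URIs below are hypothetical and only for
illustration; they are not taken from myGrid or BioDash. The sketch treats
sameAs as symmetric and transitive and uses a small union-find structure
to collect the identifiers that end up naming the same data object.

```python
from collections import defaultdict


def same_as_groups(links):
    """Group identifiers connected by sameAs links.

    `links` is an iterable of (uri_a, uri_b) pairs. The relation is
    treated as symmetric and transitive.
    """
    parent = {}

    def find(x):
        # Return the representative of x, registering x on first sight.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in links:
        # Merge the two groups by pointing one representative at the other.
        parent[find(a)] = find(b)

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical identifiers: the first two differ only in namespace case.
links = [
    ("http://example.org/ncbi/X1", "http://example.org/NCBI/X1"),
    ("http://example.org/NCBI/X1", "urn:example:ncbi:X1"),
    ("http://example.org/ncbi/Y2", "urn:example:ncbi:Y2"),
]
groups = same_as_groups(links)
for group in groups:
    print(group)
```

Each printed group is one set of duplicate identities; anything recorded
against one member can then be looked up through any other member.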

>By the way, I think the notion of having 'records' that we are ascribing URIs to is something that is necessary in the transition period between current database systems and the biomedical Semantic Web, but this is not the best use we can make of the RDF and OWL standards. Ultimately, we should only have to talk about the biological things (and apply URIs to real world resources) and not the digital representations of them (and apply URIs to some database entries that have data about real world resources). The beauty of these standards is that they allow us to 'talk' and reason about the things we care about themselves. 
>
Should we have some equivalence relationships between these "records"? 
Can I say "I think the EMBL record is equivalent to the GenBank record 
because they contain/are about the same sequence, the real thing"? 
Different applications might define equivalence differently, depending on 
their needs. But ontologies do help here!
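
To illustrate application-defined equivalence, here is a rough sketch in
which each application supplies its own key function and records with the
same key are treated as equivalent. The record fields, identifiers, and
sequences are invented for the example, and equivalence_classes is a
hypothetical helper, not an existing API.

```python
from collections import defaultdict

# Hypothetical records; identifiers and sequences are made up.
records = [
    {"uri": "urn:example:embl:A1", "source": "embl", "sequence": "ATGGCC"},
    {"uri": "urn:example:genbank:B7", "source": "genbank", "sequence": "atggcc"},
    {"uri": "urn:example:genbank:C9", "source": "genbank", "sequence": "ATGTTT"},
]


def equivalence_classes(records, key):
    """Partition record URIs by an application-supplied key function."""
    classes = defaultdict(list)
    for record in records:
        classes[key(record)].append(record["uri"])
    return list(classes.values())


# One application: records are equivalent if they carry the same sequence
# (compared case-insensitively).
by_sequence = equivalence_classes(records, key=lambda r: r["sequence"].upper())

# Another application: every record is only equivalent to itself.
by_record = equivalence_classes(records, key=lambda r: r["uri"])

print(by_sequence)
print(by_record)
```

Under the first definition the EMBL and GenBank records fall into one
class because they are about the same sequence; under the second they
stay distinct records.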

kind regards,

Jun Zhao

>The question should be 'what is the URI of the class of human insulin molecules?', not 'what is the URI of the database entry about human insulin molecules in the Uniprot database?'. Down with the unnecessary abstractions!
>
>kind regards,
>Matthias Samwald
>
Received on Wednesday, 10 May 2006 11:42:59 GMT
