Re: proposal for standard NCBI database URI

On May 10, 2006, at 6:32 AM, Xiaoshu Wang wrote:

>
> --Phil,
>
>> Also, it's not clear what is meant by "same thing".
>>
>> A GenBank record and an EMBL record identifying the same piece
>> of DNA are not the same thing; they are different records.
>> Given that this is the semantic web, it might be nice to be
>> able to state "different records, but same gene". Or probably
>> "different record, but same gene, according to some criteria".
>
> Using a "record"'s URI to identify a gene is fundamentally wrong. The
> nature of a "record" is a text document, but the nature of a gene is a
> biological entity. Mixing the two will of course generate confusion.
> W3C's TAG group has been tackling this issue for quite a while, and
> they came up with a reasonably good resolution at the end of last year
> (search for issue httpRange-14).
>
> As I said before, what the URI looks like doesn't matter. What matters
> is what will be returned when the URI is dereferenced. The URI is just
> like a variable identifier in a programming language. Each variable has
> its own type. If you try to use a Foo as a Bar, of course you get a
> runtime error.
>
> In this particular case, a gene is a biological entity, whereas a
> GenBank record is an electronic text document. Of course, you should
> not use the latter to identify the former. The gene should have its own
> URI.

Genes should have their own URIs? That's some 10^16 or so URIs just  
for the volume of space that I'm occupying right now.

More useful would be a URI for gene types - e.g. a URI for the type
"Homo sapiens p53 gene" (or an allele thereof).

Of course, this gets back to Phil's point about not being able to
define gene, species, etc. You could counter that an instance of a
gene is akin to an instance of a species: some aggregation- or
population-like entity, so that there is only one Homo sapiens p53 gene.
But that leaves open the question of the relation between that entity
and the 10^12 or so DNA regions encoding p53 proteins in my cells.

I agree with Matthias that it is not as hopeless as Phil makes out -  
I don't think it's so hard to come up with commensurable definitions  
for these things.

> For example, suppose it is assigned the URI http://example.com/gene/123.
> Dereferencing this URI should eventually (i.e., perhaps after an HTTP
> 303, see httpRange-14) lead to an RDF document that says
>
> [http://example.com/gene/123] a rdfs:Resource; (Or a URI for Gene from
> an ontology ...)
>     rdfs:seeAlso [GenBank record ID];
>     rdfs:seeAlso [EMBL record ID].

You could use the Sequence Ontology here, but if we were to treat genes
as types (which is ontologically correct, I would argue), then the
relation between 123 and SO:gene would be subclass, not instantiation.
I'm not sure why you couldn't have an owl:sameAs between 123 and, say,
an NCBI Gene ID. Both URIs would dereference to representations of types.
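
To make that concrete, here is a rough sketch of the two statements I
have in mind, written out as N-Triples lines by a few lines of Python.
Only http://example.com/gene/123 comes from the example above; the
Sequence Ontology and NCBI Gene URIs below are made-up placeholders, not
real identifiers, and the helper names are mine.

```python
# Sketch only: the SO and NCBI URIs are hypothetical placeholders.
GENE_123 = "http://example.com/gene/123"        # URI from the quoted example
SO_GENE = "http://example.org/so/gene"          # stand-in for the SO "gene" type
NCBI_GENE = "http://example.org/ncbi/gene/456"  # stand-in for an NCBI Gene ID

# Standard RDFS and OWL property URIs.
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def triple(subject, predicate, obj):
    """Format one statement whose three terms are all URIs as N-Triples."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def gene_type_statements(gene, so_gene, other_id):
    """Treat the gene as a type: subclass of SO gene, same as another ID."""
    return [
        triple(gene, RDFS_SUBCLASS_OF, so_gene),  # subclass, not instance
        triple(gene, OWL_SAME_AS, other_id),      # both URIs name one type
    ]


if __name__ == "__main__":
    for line in gene_type_statements(GENE_123, SO_GENE, NCBI_GENE):
        print(line)
```

The point is only that 123 is related to SO:gene by a subclass statement
rather than by rdf:type, and that the sameAs link is between two URIs
that both name types.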

Cheers
Chris

>
> If you want, you can further write:
>
>     [GenBank record ID] owl:sameAs [EMBL record ID].
>
> or whatever you want to say about anything in the world.



>
> Xiaoshu
>
>

Received on Wednesday, 10 May 2006 20:53:31 UTC