RE: proposal for standard NCBI database URI

--Phil,

> Also, it's not clear what is meant by "same thing".
> 
> A GenBank record and an EMBL record identifying the same piece
> of DNA are not the same thing; they are different records. 
> Given that this is the semantic web, it might be nice to be 
> able to state "different records, but same gene". Or probably 
> "different record, but same gene, according to some criteria". 

Using a "record"'s URI to identify a gene is fundamentally wrong. A "record"
is by nature a text document, whereas a gene is a biological entity.  Mixing
the two will of course generate confusion.  W3C's TAG has been tackling this
issue for quite a while, and they came up with a reasonably good resolution
last year (search for issue httpRange-14).

As I said before, what the URI looks like doesn't matter. What matters is
what is returned when the URI is dereferenced.  A URI is just like a
variable identifier in a programming language. Each variable has its own
type.  If you try to use a Foo as a Bar, of course you get a runtime error.
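
To make the analogy concrete, here is a minimal Python sketch. The Gene and
Record classes and the "geneX" symbol are hypothetical, invented only for
illustration; the point is simply that handing a Record to code that expects
a Gene fails at run time.

```python
class Gene:
    """A biological entity (hypothetical class, for illustration only)."""

    def __init__(self, symbol):
        self.symbol = symbol

    def transcribe(self):
        return f"transcript of {self.symbol}"


class Record:
    """A text document that describes something (hypothetical class)."""

    def __init__(self, text):
        self.text = text


def use_as_gene(thing):
    # Works only if `thing` really is a Gene; a Record has no transcribe().
    return thing.transcribe()


def try_use(thing):
    # Report the failure instead of crashing, so both cases can be compared.
    try:
        return use_as_gene(thing)
    except AttributeError as err:
        return f"runtime error: {err}"


if __name__ == "__main__":
    print(try_use(Gene("geneX")))
    print(try_use(Record("LOCUS ...")))
```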

In this particular case, a gene is a biological entity, whereas a GenBank
record is an electronic text document.  Of course, you should not use the
latter to identify the former.  The gene should have its own URI, for
example http://example.com/gene/123.  Dereferencing this URI should
eventually (i.e., perhaps after an HTTP 303, see httpRange-14) lead to an RDF
document, which says

[http://example.com/gene/123] a rdfs:Resource; (Or a URI for Gene from an
ontology ...)
    rdfs:seeAlso [GenBank record ID];
    rdfs:seeAlso [EMBL record ID].

If you want, you can further write:

    [GenBank record ID] owl:sameAs [EMBL record ID].

or whatever you want to say about anything in the world.
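
The dereferencing flow described above can be sketched roughly as follows.
This is only a toy simulation, not a real HTTP client or RDF library: the
WEB dictionary stands in for the network, and every URI other than
http://example.com/gene/123 (the document URI and the two record URIs) is a
placeholder I made up for the example.

```python
# Hypothetical URIs; only GENE appears in the discussion above.
GENE = "http://example.com/gene/123"
GENE_DOC = "http://example.com/doc/gene/123"         # assumed RDF document
GENBANK_REC = "http://example.com/genbank/record/1"  # stands in for [GenBank record ID]
EMBL_REC = "http://example.com/embl/record/1"        # stands in for [EMBL record ID]

RDF_TYPE = "rdf:type"
SEE_ALSO = "rdfs:seeAlso"

# A toy "web": each URI maps to (HTTP status, payload).
# For 303 the payload is the Location to follow; for 200 it is a set of triples.
WEB = {
    GENE: (303, GENE_DOC),
    GENE_DOC: (200, {
        (GENE, RDF_TYPE, "rdfs:Resource"),
        (GENE, SEE_ALSO, GENBANK_REC),
        (GENE, SEE_ALSO, EMBL_REC),
    }),
}


def dereference(uri, max_hops=5):
    """Follow 303 redirects; return (final_uri, triples, was_redirected)."""
    redirected = False
    for _ in range(max_hops):
        status, payload = WEB[uri]
        if status == 303:
            # The URI names a non-document thing; go to a document about it.
            uri, redirected = payload, True
            continue
        return uri, payload, redirected
    raise RuntimeError("too many redirects")


def records_for(gene_uri):
    """List the record URIs that the RDF document links from the gene."""
    _, triples, _ = dereference(gene_uri)
    return sorted(o for s, p, o in triples if s == gene_uri and p == SEE_ALSO)


if __name__ == "__main__":
    final_uri, triples, redirected = dereference(GENE)
    print(final_uri, redirected)
    print(records_for(GENE))
```

The gene URI itself never returns a document directly here; the 303 hop is
what keeps "the gene" and "a document about the gene" apart.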

Xiaoshu

Received on Wednesday, 10 May 2006 13:33:05 UTC