RE: making statements on the semantic web from Eric Neumann on 2007-08-21 (public-semweb-lifesci@w3.org from August 2007)

From: Eric Neumann <eneumann@teranode.com>
Date: Tue, 21 Aug 2007 13:41:24 -0400
To: "Eric Jain" <Eric.Jain@isb-sib.ch>, "Michel_Dumontier" <Michel_Dumontier@carleton.ca>
cc: lotus@ieee.org, gregtyrelle@phalanxbiotech.com, "public-semweb-lifesci hcls" <public-semweb-lifesci@w3.org>
Message-ID: <E7092F10DB73FA43AE0E59F2DCDAA6360A91FD@MI8NYCMAIL04.Mi8.com>

What Michel is describing is also known as a 'reverse-indexer' (the kind of index at the heart of Google's fast retrievals). It stores all the references made to any URI X, so that asking "what is known about X?" retrieves a list of all links and annotations pointing to it.
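
As a rough sketch of the reverse look-up idea (not a description of any existing indexer), the Python below files each statement under the URIs it mentions, so that one look-up returns everything recorded about X and which sites said it. The class name, the example.org store and vocabulary URIs, and the two sample statements are all made up for illustration; only the two purl.uniprot.org URIs come from this thread.

```python
from collections import defaultdict


class ReverseIndex:
    """Maps each URI to the statements that mention it, and where they were found."""

    def __init__(self):
        self._refs = defaultdict(list)

    def add(self, source, subject, predicate, obj):
        statement = (source, subject, predicate, obj)
        # File the statement under both ends, so a look-up on either URI finds it.
        for uri in {subject, obj}:
            self._refs[uri].append(statement)

    def about(self, uri):
        """Return every recorded statement that mentions the URI."""
        return list(self._refs.get(uri, []))

    def sources(self, uri):
        """Return the sorted list of sites whose statements mention the URI."""
        return sorted({statement[0] for statement in self._refs.get(uri, [])})


P26838 = "http://purl.uniprot.org/uniprot/P26838"
IPR001451 = "http://purl.uniprot.org/interpro/IPR001451"

index = ReverseIndex()
# Hypothetical statements from two hypothetical public triple stores.
index.add("http://example.org/store-a", P26838,
          "http://example.org/vocab/hasDomain", IPR001451)
index.add("http://example.org/store-b", "http://example.org/store-b/annotation/1",
          "http://example.org/vocab/annotates", P26838)

# "What is known about P26838, and who is saying it?"
for statement in index.about(P26838):
    print(statement)
print(index.sources(P26838))
```

A real indexer would also need persistence and a policy for refreshing stale entries, which this sketch leaves out.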

Some may recall that during one of our calls last year I proposed that a URI indexer could have powerful benefits for a life science community applying Semantic Web standards: the indexer would find and manage all references (e.g., annotations and cross-links) to any bioentity via URIs, and provide a fast reverse look-up.

If I may make a (personal) suggestion: once the community has a stable set of URIs for bio-entities (incl. records), a set of auto-indexers that traverse "enlisted" research sites for URI references to any bioentities would be of high value to everyone. This way anyone could find what has been said about anything, anywhere...

But indexers of this nature require stable URIs...

Eric

-----Original Message-----
From: public-semweb-lifesci-request@w3.org on behalf of Eric Jain
Sent: Tue 8/21/2007 1:08 PM
To: Michel_Dumontier
Cc: lotus@ieee.org; gregtyrelle@phalanxbiotech.com; public-semweb-lifesci hcls
Subject: Re: making statements on the semantic web

Michel_Dumontier wrote:
> The bigger problem, is how do we discover all the places that are making
> statements about these non-web resources?  While Bio2RDF lists a few
> equivalent resources, will it maintain this list manually? Perhaps more
> valuable is whether we entice Google to index our public triple stores,
> telling us where triples containing "uniprot:p26838" exist, thereby
> enabling distributed queries. I've brought this up a few times now, and
> I'd very much like to hear what people think...

+1

Don't know if we can rely on Google for this, but I believe there are 
already several semantic web crawler projects out there, perhaps someone 
who is more familiar with one of these projects can comment?

Bio2RDF can already be useful even without such a search engine, e.g. if
you are looking at http://purl.uniprot.org/uniprot/P26838 and from there
follow the InterPro link like so:

wget --header='Accept: application/rdf+xml;q=0.9,text/html;q=0.1' \
   'http://purl.uniprot.org/interpro/IPR001451'

...you'll end up with some data provided by Bio2RDF! (If instead you prefer 
text/html, you'll end up on some page on the official InterPro web site.)
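
To illustrate why the same URL can lead to two different places, here is a minimal Python sketch of how a resolver might choose a target from the Accept header. It is an assumption about the general mechanism (HTTP content negotiation by q-value), not the actual purl.uniprot.org implementation; the function names and the target labels are hypothetical, and wildcards such as */* are ignored.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            if param.startswith("q="):
                q = float(param[2:])
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def choose_target(header, targets):
    """Pick the target for the most preferred media type the resolver offers."""
    for media_type, q in parse_accept(header):
        if q > 0 and media_type in targets:
            return targets[media_type]
    return None


# Hypothetical labels standing in for the real redirect targets.
TARGETS = {
    "application/rdf+xml": "RDF data",
    "text/html": "HTML page",
}

print(choose_target("application/rdf+xml;q=0.9,text/html;q=0.1", TARGETS))
print(choose_target("text/html", TARGETS))
```

With the header from the wget call above, RDF wins because its q-value is higher; a client that asks only for text/html gets the HTML target instead.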

Received on Tuesday, 21 August 2007 17:42:49 UTC