Re: Managing Co-reference (Was: A Semantic Elephant?) from Renato golin on 2008-05-14 (semantic-web@w3.org from May 2008)

From: Renato golin <renato@ebi.ac.uk>
Date: Wed, 14 May 2008 21:48:57 +0100
To: Kendall Grant Clark <kendall@clarkparsia.com>
CC: Michael F Uschold <uschold@gmail.com>, Tim Berners-Lee <timbl@w3.org>, Sören Auer <auer@informatik.uni-leipzig.de>, Chris Bizer <chris@bizer.de>, Frank van Harmelen <frank.van.harmelen@cs.vu.nl>, Kingsley Idehen <kidehen@openlinksw.com>, Semantic Web Interest Group <semantic-web@w3.org>, "Fabian M. Suchanek" <f.m.suchanek@gmail.com>, Tim Berners-Lee <timbl@csail.mit.edu>, jim hendler <hendler@cs.rpi.edu>, Mark Greaves <markg@vulcan.com>, "georgi.kobilarov" <georgi.kobilarov@gmx.de>, Jens Lehmann <lehmann@informatik.uni-leipzig.de>, Richard Cyganiak <richard@cyganiak.de>, Frederick Giasson <fred@fgiasson.com>, Michael Bergman <mike@mkbergman.com>, Conor Shankey <cshankey@reinvent.com>, Kira Oujonkova <koujonkova@reinvent.com>, Aldo Gangemi <aldo.gangemi@istc.cnr.it>
Message-ID: <482B5039.3040000@ebi.ac.uk>

Kendall Grant Clark wrote:
> You don't have to  do it at query time. Owlgres  does owl:sameAs processing at
> load time and so the *query time*  cost is negligible. The usual caveats about
> tradeoffs and use cases apply, of course.

Kendall,

You're assuming you have ALL the triples in your store. I think the 
discussion is broader and extends to the whole Web. As Jim said: "We 
need to go beyond just triple stores and get some fast inferencing at 
Web scales."

I was saying that months ago on this list, but no one seemed to care 
too much... We need an index that takes other stores into account, 
much as routing algorithms do for networks today.

All the technology we have today for storing RDF assumes all the data 
is in the same database, or at least in the same engine, so you can 
rely on fast local indexes. But when you start looking at remote web 
pages (personal websites included), it's obvious that you can't control 
what's in there or how to access it.

Instead, if we had a way to state the probability of information X 
about Y lying in some particular direction (as in connections to other 
datasets that, in turn, connect to other datasets), and updated those 
probabilities whenever a link is found, we could then infer where to go 
searching for it.
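
A minimal sketch of that idea, in Python. Everything in it is an
assumption made for illustration: the RoutingIndex class, the neighbour
names and the update rule (a smoothed count of links found per
direction) are hypothetical and do not describe any existing store or
protocol.

```python
from collections import defaultdict


class RoutingIndex:
    """Hypothetical per-topic estimate of which neighbouring dataset
    is most likely to hold information about that topic."""

    def __init__(self, neighbours, prior=1.0):
        self.neighbours = list(neighbours)
        # The prior keeps unexplored neighbours from dropping to zero.
        self.prior = prior
        # hits[topic][neighbour] = links about `topic` found via `neighbour`
        self.hits = defaultdict(lambda: defaultdict(float))

    def record_link(self, topic, neighbour):
        """Update the estimate whenever a link about `topic` is found."""
        self.hits[topic][neighbour] += 1.0

    def probabilities(self, topic):
        """Smoothed share of hits per neighbour; the values sum to 1."""
        counts = {n: self.hits[topic][n] + self.prior for n in self.neighbours}
        total = sum(counts.values())
        return {n: c / total for n, c in counts.items()}

    def best_direction(self, topic):
        """Neighbour with the highest estimated probability for `topic`."""
        probs = self.probabilities(topic)
        return max(probs, key=probs.get)
```

Each store would keep such a table only for its direct neighbours, the
way a router keeps next hops rather than full paths.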

Of course, that would involve manual curation, when you say whether you 
liked the result or not (network feedback), and smart algorithms that 
randomly choose new directions when no result is acceptable (Markov 
chains, Monte Carlo optimizations). But both were summarized by 
Michael, so I understand it's pretty much accepted that this will 
happen in the near future anyway.
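
To make the feedback and random-exploration part concrete, here is a
rough Python sketch. It swaps in a much simpler technique than Markov
chains or Monte Carlo optimization: an epsilon-greedy choice plus a
running score per direction. The function names, the 0.5 default score
and the step size are all hypothetical.

```python
import random


def choose_direction(scores, epsilon=0.1, rng=random):
    """Mostly follow the best-scored neighbour; with probability
    `epsilon`, pick a random one to explore a new direction."""
    if not scores:
        raise ValueError("no neighbours to choose from")
    if rng.random() < epsilon:
        return rng.choice(sorted(scores))
    return max(scores, key=scores.get)


def apply_feedback(scores, neighbour, liked, step=0.2):
    """Move a neighbour's score towards 1 if the user liked the result,
    towards 0 otherwise. Unknown neighbours start at 0.5 (an assumption)."""
    target = 1.0 if liked else 0.0
    old = scores.get(neighbour, 0.5)
    scores[neighbour] = old + step * (target - old)
    return scores[neighbour]
```

Raising epsilon when no result is acceptable would push the search
towards directions it has not tried yet.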

cheers,
--renato

Received on Wednesday, 14 May 2008 20:49:35 UTC