Re: Managing Co-reference (Was: A Semantic Elephant?) from Kingsley Idehen on 2008-05-15 (semantic-web@w3.org from May 2008)

From: Kingsley Idehen <kidehen@openlinksw.com>
Date: Wed, 14 May 2008 21:25:41 -0400
To: renato@ebi.ac.uk
CC: Kendall Grant Clark <kendall@clarkparsia.com>, Michael F Uschold <uschold@gmail.com>, Tim Berners-Lee <timbl@w3.org>, Sören Auer <auer@informatik.uni-leipzig.de>, Chris Bizer <chris@bizer.de>, Frank van Harmelen <frank.van.harmelen@cs.vu.nl>, Semantic Web Interest Group <semantic-web@w3.org>, "Fabian M. Suchanek" <f.m.suchanek@gmail.com>, Tim Berners-Lee <timbl@csail.mit.edu>, jim hendler <hendler@cs.rpi.edu>, Mark Greaves <markg@vulcan.com>, "georgi.kobilarov" <georgi.kobilarov@gmx.de>, Jens Lehmann <lehmann@informatik.uni-leipzig.de>, Richard Cyganiak <richard@cyganiak.de>, Frederick Giasson <fred@fgiasson.com>, Michael Bergman <mike@mkbergman.com>, Conor Shankey <cshankey@reinvent.com>, Kira Oujonkova <koujonkova@reinvent.com>, Aldo Gangemi <aldo.gangemi@istc.cnr.it>
Message-ID: <482B9115.3050509@openlinksw.com>

All,

A little diversion from the main thread.

When a private discussion becomes public, we should put some clear 
indicator in the "Subject Line" to tell (or remind) participants that 
the private conversation has gone public. It isn't always obvious that 
mailing lists are part of the conversation unless you meticulously scour 
a lengthy cc. list. Maybe something like [Public discussion] Semantic 
Elephant .... will do.

"Reply All" is a thing I am very guilty of when dealing with my daily 
mail-based information streams.

For those who picked up my comments to Kendall re. Owlgres and Virtuoso, 
please note that I assumed a small cc. list, not the entire semweb 
mailing list :-(

Kingsley
> Kendall Grant Clark wrote:
>> You don't have to do it at query time. Owlgres does owl:sameAs 
>> processing at load time, so the *query time* cost is negligible. The 
>> usual caveats about tradeoffs and use cases apply, of course.
>
> Kendall,
>
> You're assuming you have ALL triples in your store. I think the 
> discussion is broader and extends to the whole web. As Jim said: "We 
> need to go beyond just triple stores and get some fast inferencing at 
> Web scales."
>
> I was saying that months ago on this list but no one seemed to care 
> too much... We need an index that takes other stores into account, 
> much as routing algorithms do today.
>
> All the technology we have today for storing RDF assumes all data is 
> in the same database, or at least in the same engine, so you can rely 
> on fast local indexes. But when you start looking at remote webpages 
> (personal websites included), it's obvious that you can't control 
> what's in there or how to access it.
>
> Instead, suppose we had a way to state the probability of 
> information X about Y lying in some particular direction (as in 
> connections to other datasets that, in turn, connect to other 
> datasets) and updated those probabilities whenever a link is found. 
> We could then infer where to go searching for that information.
>
> Of course, that would involve manual curation, where you say whether 
> you liked the result or not (network feedback), and smart algorithms 
> that randomly choose to go in new directions when no result is 
> acceptable (Markov chains, Monte Carlo optimizations). But both were 
> summarized by Michael, so I understand that's pretty much accepted to 
> happen in the near future anyway.
>
> cheers,
> --renato
>
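
The load-time owl:sameAs processing mentioned in the quoted text can be 
illustrated with a minimal sketch. This is not Owlgres's actual 
implementation; it only assumes that identity links are merged with a 
union-find structure while loading, so that every stored triple uses one 
canonical URI and a query needs a single lookup. The names SameAsIndex 
and load, and the ex: identifiers, are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


class SameAsIndex:
    """Union-find over terms linked by owl:sameAs (illustrative only)."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def find(self, term: str) -> str:
        """Return the canonical representative of a term."""
        self._parent.setdefault(term, term)
        root = term
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node directly at the root.
        while self._parent[term] != root:
            self._parent[term], term = root, self._parent[term]
        return root

    def union(self, a: str, b: str) -> None:
        """Merge the equivalence classes of a and b."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            # Keep the lexicographically smaller root so results are stable.
            keep, drop = sorted((root_a, root_b))
            self._parent[drop] = keep


def load(triples: Iterable[Triple]) -> Tuple[SameAsIndex, Set[Triple]]:
    """Resolve owl:sameAs once at load time and store canonical triples."""
    triples = list(triples)
    index = SameAsIndex()
    # Pass 1: merge every pair of terms declared identical.
    for s, p, o in triples:
        if p == SAME_AS:
            index.union(s, o)
    # Pass 2: rewrite subjects and objects to their canonical form.
    store = {
        (index.find(s), p, index.find(o))
        for s, p, o in triples
        if p != SAME_AS
    }
    return index, store


index, store = load([
    ("ex:a", SAME_AS, "ex:b"),
    ("ex:b", SAME_AS, "ex:c"),
    ("ex:c", "ex:name", "Alice"),
    ("ex:a", "ex:age", "30"),
])
# A query about ex:c is canonicalized once, then answered by plain lookup.
subject = index.find("ex:c")
facts = {(p, o) for s, p, o in store if s == subject}
```

The sketch also shows the limitation raised in the reply: it only 
works when all the sameAs links are present in the local store at load 
time.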


-- 


Regards,

Kingsley Idehen	      Weblog: http://www.openlinksw.com/blog/~kidehen
President & CEO 
OpenLink Software     Web: http://www.openlinksw.com

Received on Thursday, 15 May 2008 01:26:22 UTC