- From: Peter Ansell <ansell.peter@gmail.com>
- Date: Thu, 15 May 2008 07:59:56 +1000
- To: renato@ebi.ac.uk
- Cc: "Kendall Grant Clark" <kendall@clarkparsia.com>, "Michael F Uschold" <uschold@gmail.com>, "Tim Berners-Lee" <timbl@w3.org>, "Sören Auer" <auer@informatik.uni-leipzig.de>, "Chris Bizer" <chris@bizer.de>, "Frank van Harmelen" <frank.van.harmelen@cs.vu.nl>, "Kingsley Idehen" <kidehen@openlinksw.com>, "Semantic Web Interest Group" <semantic-web@w3.org>, "Fabian M. Suchanek" <f.m.suchanek@gmail.com>, "Tim Berners-Lee" <timbl@csail.mit.edu>, "jim hendler" <hendler@cs.rpi.edu>, "Mark Greaves" <markg@vulcan.com>, georgi.kobilarov <georgi.kobilarov@gmx.de>, "Jens Lehmann" <lehmann@informatik.uni-leipzig.de>, "Richard Cyganiak" <richard@cyganiak.de>, "Frederick Giasson" <fred@fgiasson.com>, "Michael Bergman" <mike@mkbergman.com>, "Conor Shankey" <cshankey@reinvent.com>, "Kira Oujonkova" <koujonkova@reinvent.com>, "Aldo Gangemi" <aldo.gangemi@istc.cnr.it>
2008/5/15 Renato golin <renato@ebi.ac.uk>:
>
> Kendall Grant Clark wrote:
>>
>> You don't have to do it at query time. Owlgres does owl:sameAs processing at
>> load time and so the *query time* cost is negligible. The usual caveats about
>> tradeoffs and use cases apply, of course.
>
> Kendal,
>
> You're assuming you have ALL triplets in your store. I think the discussion
> is broader and goes to all web. As Jim said: "We need to go beyond just
> triple stores and get some fast inferencing at Web scales."

Latency on the web prohibits "fast inferencing" in any literal sense of
the term. Even if everyone published in forms that were accessible via
SPARQL endpoints, resolving non-trivial queries would more than likely
require multiple hits to each SPARQL endpoint as you move further into
the inference resolution for a given query. On top of that, RDF forms an
open world, and hence consistent inferencing at Web scales is a myth.
Don't take that the wrong way and think that nothing can be done; just
don't expect the earth if you are only going to get a small city.

> I was saying that months ago in this list but no one seemed to care too
> much... We need an index that takes into account other stores, pretty much
> as we have today for routing algorithms.

Distributed SPARQL is already practical within Quad Stores using Named
Graphs, if you accept that the Named Graph can be resolved to an actual
non-database entity. Currently the names of graphs have been assumed to
be arbitrary, non-meaningful URIs, but what if they were not, and they
could actually be utilised? It would be kind of like the jump from the
URI literal structure to resolvable URIs within RDF. If they were
resolvable in their own right then you wouldn't need an index, which
would inevitably cause more hassles to get integrated and useful for
everyone.

> All the technology we have today for storing RDF assumes all data is in the
> same database or at least in the same engine, so you can rely on local fast
> indexes, but when you start looking on remote webpages (personal websites
> included) it's obvious that you can't control what's in there nor how to
> access it.

You shouldn't either, as you aren't an authoritative source for that
data. If you want to mirror the information and perform trivial
cleansing procedures then it may be suitable, but you are changing the
data, so it is always dangerous.

> Instead, if we had a way to say what's the probability of the information X
> about Y being in some particular direction (as in connections to other
> datasets that, in turn, connect to other datasets) and make those
> probabilities be updated whenever you find a link, we can then infer from
> where you'll go searching for that.

Look up the Distributed SPARQL literature for some more ideas on this
that have already been put forth. The whole area doesn't have to be
redeveloped.

> Of course, that would involve manual curation when you say you liked the
> result or not (network feedback) and smart algorithms to randomly choose to
> go in new directions when no result is acceptable (markov chains, monte
> carlo optimizations), but both were summarized by Michael, so I understand
> that's pretty much accepted to happen in the near future anyway.

If the Named Graph in the QuadStore didn't resolve to an applicable
source, you could fall back to resolving the URI in question.

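A rough sketch of that fallback, in Python, might look like the following.
It is only an illustration under assumptions: the function name `lookup`
and the injected `fetch` callable are hypothetical, `fetch` stands in for
whatever actually dereferences a URI and parses the RDF it finds, and
triples are simplified to plain string tuples.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Triple = Tuple[str, str, str]
# Hypothetical dereferencer: returns the triples found at a URI, or None
# if the URI does not resolve to anything usable.
Fetcher = Callable[[str], Optional[List[Triple]]]


def lookup(subject: str, graph_names: Iterable[str], fetch: Fetcher) -> List[Triple]:
    """Find triples about `subject`, treating graph names as resolvable sources."""
    # First, try each Named Graph URI as a real, resolvable source.
    for graph in graph_names:
        triples = fetch(graph)
        if triples:
            matches = [t for t in triples if t[0] == subject]
            if matches:
                return matches
    # No graph resolved to an applicable source: fall back to
    # resolving the subject URI itself.
    fallback = fetch(subject) or []
    return [t for t in fallback if t[0] == subject]
```

Passing the fetcher in as a parameter keeps the resolution order separate
from how a given store or HTTP client actually retrieves the data.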
Received on Wednesday, 14 May 2008 22:00:32 UTC