- From: Michael Lang(Jr.) <michaelallenlang@gmail.com>
- Date: Tue, 5 May 2009 11:38:00 -0400
- To: Paul Gearon <gearon@ieee.org>
- Cc: Geoff Chappell <geoff@sover.net>, Nicolas Raoul <nicolas.raoul.lists@gmail.com>, semantic-web@w3.org, Alex Hall <alexhall@revelytix.com>, michaelalang@gmail.com
- Message-ID: <59c1f5620905050838h1e1325f5if6ec719beda7a3a@mail.gmail.com>
Bringing a co-worker in on the thread.

Mike Lang
Revelytix, Inc.
phone: 410-584-0009 (office)
443-928-3782 (cell)
skype: michael.allen.lang.jr
aim: MikeJrRevelytix

On Tue, May 5, 2009 at 11:32 AM, Paul Gearon <gearon@ieee.org> wrote:
> On Tue, May 5, 2009 at 10:08 AM, Geoff Chappell <geoff@sover.net> wrote:
> > <snip/>
> >
> > Here's an example of what I'm talking about using our Semantics.SDK for
> > .NET[1]:
> >
> > prefix owl: <...>
> > prefix ex: <...>
> > prefix foaf: <...>
> >
> > #sparql extension to support rules
> > rulebase (
> >   construct {?s ?p ?o} from {?x owl:sameAs ?s. ?x ?p ?o}
> > )
> >
> > select ?f
> > from <http://www.someplace.org/data>
> > from <http://www.someotherplace.org/data>
> > where { ex:Anthony foaf:knows ?f }
> >
> >
> > To do this efficiently, the query processor will need statistics for the
> > data sources used. For remote graphs (e.g. sparql endpoints) this means that
> > they either need to publish stats in a reasonable form, or the query
> > processor would have to generate and cache its own based upon queries
> > against the graph.
>
> This is exactly what I want to do in Mulgara. Unfortunately, I've
> wanted to do this for a couple of years now, and there are always
> other priorities. :-(
>
> The idea is to send the basic graph patterns out to each endpoint, and
> ask how large the binding will be. The place with the largest binding
> gets the entire query, and it sends out the rest of the query to the
> other endpoints. Any endpoints with an empty result aren't sent
> anything (for those BGPs) after their initial response. Each endpoint
> can whittle down the query, sending the remaining query on to their
> peers, and joining whatever result they get to their own local BGP
> resolutions. The idea is that only the smallest bindings are
> transferred across the network, and after joining a small binding to a
> large one you *usually* get another small binding (with more
> variables). This is overly simplistic (really! there's a lot more to
> do!), but it illustrates the point. I believe that Aduna are working
> on something similar for Sesame.
>
> Until these things are available though, we have to rely on systems
> that transfer entire graphs, or complete bindings all the time.
> Mulgara can send individual bindings to various servers (either the
> full list from the FROM clauses, or individually via GRAPH), which
> works pretty well, but it doesn't minimize the network traffic.
>
> Regards,
> Paul Gearon
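
For anyone following along, the size-probing idea Paul describes above can be sketched roughly as follows. This is only an illustration under loud assumptions, not Mulgara's (or anyone's) implementation: the `Endpoint` class is a hypothetical in-memory stand-in for a remote SPARQL endpoint, `plan`, `run` and `join` are made-up names, and the peer-to-peer forwarding of the remaining query is simplified to "the endpoint with the largest bindings coordinates, and the others ship their smaller bindings to it".

```python
from typing import Dict, List, Optional, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]


def match(pattern: Triple, triple: Triple) -> Optional[Binding]:
    """Return the variable bindings if the triple matches the pattern, else None."""
    binding: Binding = {}
    for p, t in zip(pattern, triple):
        if p.startswith("?"):          # terms starting with '?' are variables
            if binding.get(p, t) != t:  # same variable must bind the same value
                return None
            binding[p] = t
        elif p != t:
            return None
    return binding


class Endpoint:
    """Hypothetical in-memory stand-in for a remote SPARQL endpoint."""

    def __init__(self, name: str, triples):
        self.name = name
        self.triples = set(triples)

    def resolve(self, pattern: Triple) -> List[Binding]:
        return [b for t in self.triples if (b := match(pattern, t)) is not None]

    def count(self, pattern: Triple) -> int:
        # A real endpoint would answer this from statistics, not by resolving.
        return len(self.resolve(pattern))


def join(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Nested-loop join on whatever variables the two sides share."""
    return [{**l, **r} for l in left for r in right
            if all(l[k] == r[k] for k in l.keys() & r.keys())]


def plan(endpoints, patterns):
    """Probe every endpoint for binding sizes; the largest one coordinates."""
    sizes = {ep.name: {p: ep.count(p) for p in patterns} for ep in endpoints}
    coordinator = max(endpoints, key=lambda ep: sum(sizes[ep.name].values()))
    return coordinator, sizes


def run(endpoints, patterns):
    coordinator, sizes = plan(endpoints, patterns)
    transferred = 0                     # bindings shipped over the "network"
    per_pattern = []
    for p in patterns:
        rows: List[Binding] = []
        for ep in endpoints:
            if sizes[ep.name][p] == 0:
                continue                # empty probe: nothing more is sent here
            found = ep.resolve(p)
            if ep is not coordinator:
                transferred += len(found)
            rows.extend(b for b in found if b not in rows)
        per_pattern.append(rows)
    result: List[Binding] = [{}]
    for rows in sorted(per_pattern, key=len):  # join smallest bindings first
        result = join(result, rows)
    return coordinator.name, transferred, result


a = Endpoint("a", [("ex:Anthony", "foaf:knows", "ex:Bob"),
                   ("ex:Anthony", "foaf:knows", "ex:Carol"),
                   ("ex:Bob", "foaf:name", "Bob")])
b = Endpoint("b", [("ex:Carol", "foaf:name", "Carol")])
query = [("ex:Anthony", "foaf:knows", "?f"), ("?f", "foaf:name", "?n")]
name, shipped, rows = run([a, b], query)
print(name, shipped, sorted(r["?n"] for r in rows))
```

In this toy setup the endpoint holding most of the matching data does the joining, and only the one small binding from the other endpoint crosses the wire. A real system would need cached or published statistics for the probe step, as Geoff notes, rather than resolving each pattern twice.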
Received on Tuesday, 5 May 2009 15:38:38 UTC