Re: SPARQL WG Soliciting Early Reviews of Working Drafts

* Peter Ansell <ansell.peter@gmail.com> [2010-07-02 12:05+1000]
> On 2 July 2010 11:30, Lee Feigenbaum <lee@thefigtrees.net> wrote:
> > On 7/1/2010 6:44 PM, Peter Ansell wrote:
> >>
> >> The June 1 SPARQL Federation draft [1] doesn't make it clear how
> >> GRAPHS and FROM/FROM NAMED etc map to, or are omitted from, Federated
> >> queries. It does say "with a query Q and no default-graph-uri or
> >> named-graph-uri" in section 4.1, but it doesn't make it clear in the
> >> examples. Is the idea that you can't use GRAPH/FROM/FROM NAMED when
> >> you are using SERVICE?
> >>
> >> Personally, I would find it much more useful if Federation wasn't
> >> restricted to the default graph, as any number of endpoints may not
> >> have any data at all in the default graph which would make them immune
> >> to federated queries. I wouldn't like to see Federation introduced at
> >> the expense of graphs.
> >
> > Hi Peter,
> >
> > The SERVICE keyword is a way to effectively embed an invocation of the
> > SPARQL protocol within a query. The text in 4.1 specifies that the remote
> > service should be invoked without any default-graph-uri or named-graph-uri
> > parameters. The effect of this is that the remote endpoint will use its
> > default RDF dataset -- this default dataset consists of a default graph
> > (potentially empty) and zero or more named graphs.
> >
> > You can indeed use a GRAPH clause within SERVICE, and the graph pattern
> > within the GRAPH clause will be matched against the remote endpoint's named
> > graphs.
> >
> > Does this explain the situation? If so, does it address your concerns?
> 
> That does explain the note about default-graph-uri etc..
> 
> Is it also allowable to put the GRAPH outside the SERVICE pattern? The
> current syntax seems to put them on the same level as both are part of
> [49] GraphPatternNotTriples in the syntax, so the following may be
> legal?
> 
> GRAPH <http://example.com/mygraph>
> {
>   {
>     SERVICE <http://example.com/sparql>
>     {
>       ?s ?p ?o .
>     }
>   }
>   UNION
>   {
>     SERVICE <http://example.com/sparql2>
>     {
>       ?s ?p2 ?o2 .
>     }
>   }
> }
> 
> Adding an example to show how GRAPH's and FROM relate to the new
> SERVICE pattern would be useful. It may be useful to change the syntax
> to make sure that SERVICE will always be a top level pattern, or never
> be inside a GRAPH pattern, if that is the intention.

Two factors make this tricky:
  1. Does the implicit query have a FROM, or just a GRAPH constraint?
  2. Is GRAPH <G1> { GRAPH <G2> { ?s ?p ?o } } == GRAPH <G2> { ?s ?p ?o }?

For 1, my temptation is to say that FROM <G1> is not implied by
  GRAPH <G1> { SERVICE <S1> { … } }
and that the federated query should instead just be
  SELECT … { GRAPH <G1> { … } } # no FROM <G1>
leaving query crafters to add the dataset restriction to the service
IRI themselves, à la:
  GRAPH <G1> { SERVICE <S1?named-graph-uri=G1> { … } }
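
A minimal sketch of that reading, in Python: the pattern is sent to the
service wrapped in a plain GRAPH constraint with no FROM, and a dataset
restriction shows up in the request only when the query author asks for
it. The helper names (remote_query, service_url) and the SELECT *
projection are assumptions for illustration only; named-graph-uri is the
SPARQL protocol parameter mentioned in section 4.1.

```python
from urllib.parse import urlencode

def remote_query(graph_iri, pattern):
    """Query text sent to the service for GRAPH <G> { SERVICE <S> { pattern } }.

    Hypothetical helper: wraps the pattern in a GRAPH constraint, adds no FROM.
    """
    return "SELECT * { GRAPH <%s> { %s } }" % (graph_iri, pattern)

def service_url(endpoint, query, named_graphs=()):
    """Build a protocol GET URL; named-graph-uri is added only on request."""
    params = [("query", query)]
    params += [("named-graph-uri", g) for g in named_graphs]
    return endpoint + "?" + urlencode(params)

q = remote_query("http://example.com/mygraph", "?s ?p ?o .")
# Default dataset of the remote endpoint, GRAPH constraint only:
print(service_url("http://example.com/sparql", q))
# Same query, with the dataset restricted explicitly by the query author:
print(service_url("http://example.com/sparql", q, ["http://example.com/mygraph"]))
```

The second call corresponds to the <S1?named-graph-uri=G1> form: the
restriction is something the author opts into, not something the
enclosing GRAPH implies.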

For 2, it's perhaps acceptable to add a transformation rule for
SERVICE queries nested inside a single GRAPH and say that in general,
doubly-nested GRAPH constraints are not defined.

GraphGraphPattern URIorVar { ... ServiceGraphPattern } => 
Service(IRI, Transform(GraphGraphPattern(URIorVar, GroupGraphPattern)))
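
The rule can be sketched over a toy algebra. The tuple encoding and the
function name below are assumptions made for illustration, not the
draft's algebra, and the sketch only covers a GRAPH whose body is
exactly one SERVICE; doubly-nested GRAPH is rejected to mirror "not
defined".

```python
def push_graph_into_service(node):
    """Rewrite ("graph", G, ("service", S, P)) to ("service", S, ("graph", G, P)).

    Nodes are tuples whose first element names the operator (toy encoding).
    """
    if node[0] != "graph":
        return node  # nothing to rewrite
    _, graph, inner = node
    if inner[0] == "graph":
        raise ValueError("doubly-nested GRAPH constraints are not defined")
    if inner[0] != "service":
        return node  # ordinary local GRAPH pattern, left alone
    _, endpoint, pattern = inner
    if pattern[0] == "graph":
        # Pushing G inside would produce GRAPH inside GRAPH at the service.
        raise ValueError("doubly-nested GRAPH constraints are not defined")
    return ("service", endpoint, ("graph", graph, pattern))

before = ("graph", "G1", ("service", "S1", ("bgp", "?s ?p ?o")))
print(push_graph_into_service(before))
```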


> >> In the BINDINGS syntax, is it 'UNDEF' or 'UNBOUND'? Currently both are
> >> used but they seem to have the same meaning.
> 
> Just out of curiosity, which keyword is currently preferred here?

I've implemented UNDEF. I vaguely recall that I originally lobbied for
UNBOUND (to match some terminology in the SPARQL specification), but
that folks preferred UNDEF. I'm pretty ambivalent; what's your pref?


> >> In the section 3 example there is the variable ?human in the bindings
> >> section, and ?species in the main part of the query. Is ?human
> >> supposed to provide values for ?species, as it is not used apart from
> >> that. Seems like a typo where ?human needs to be changed to ?species.
> >
> > I'll leave these two to be fixed up by the editor in due course.

I believe the editor's draft has this fixed; the change log
  http://www.w3.org/2009/sparql/docs/fed/service#sec-cvsLog
indicates it was fixed at R 1.10.

> > thanks for the review,

indeed.

> > Lee
> 
> Another query... How does one control the paging of results from
> particular SERVICE calls? A use case would be if you know that results
> from an endpoint need to be retrieved in offsets of a particular
> number, but that restriction does not apply to other endpoints.
> 
> The reasoning for this is that some sparql endpoints, notably dbpedia,
> may legitimately reject queries if the results are going to be too
> large, so you can't rely on all results coming down in one call, or
> even coming down at all if you set the default too high. The following
> example roughly demonstrates the idea:
> 
>     SERVICE <http://dbpedia.org/sparql>
>     {
>       ?s ?p ?o .
>     } ORDER BY ?s ?p ?o PAGEVALUE 500
>     SERVICE <http://ontologies.localnetwork/sparql>
>     {
>       ?p ?p2 ?o2 .
>     } ORDER BY ?p ?p2 ?o2 PAGEVALUE 10000
> 
> If it is easier, it may be useful to extend the Service Description
> specification to include details about preferred OFFSET values and the
> maximum results ever returned by the endpoint even if you repeatedly
> page through OFFSET and LIMIT.

Hmm, do-able. 
[52] ServiceGraphPattern ::= 'SERVICE' VarOrIRIref GroupGraphPattern SolutionModifier

I guess the OrderClause in the SolutionModifier could be practical for
streaming query engines (working with the beginning of the result set
returned from SERVICE while the rest of the results are still coming
back from the service).

I'll try to implement it in the next few days to see if anything
surprises me.
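
Independently of any grammar change, the per-endpoint paging Peter
describes can be approximated on the client side with the existing
ORDER BY, LIMIT and OFFSET modifiers. A rough Python sketch follows; the
function name, the SELECT * projection and the page sizes are
assumptions, and the proposed PAGEVALUE keyword is deliberately not used.

```python
def paged_queries(pattern, order_by, page_size, max_pages):
    """Yield SELECT queries that walk a result set page_size rows at a time."""
    for page in range(max_pages):
        yield "SELECT * { %s } ORDER BY %s LIMIT %d OFFSET %d" % (
            pattern, order_by, page_size, page * page_size)

# One page size per endpoint, as in the example above (500 for the first one).
for q in paged_queries("?s ?p ?o .", "?s ?p ?o", 500, 3):
    print(q)
```

A real client would stop as soon as a page comes back with fewer than
page_size rows instead of relying on a fixed max_pages.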


> Thanks,

and thank you,

> Peter
> 

-- 
-ericP

Received on Friday, 2 July 2010 04:56:46 UTC