Re: some more comments on Fed query doc... (was: Re: Review of Query document (basic federated query)) from Axel Polleres on 2010-10-07 (public-rdf-dawg@w3.org from October to December 2010)

From: Axel Polleres <axel.polleres@deri.org>
Date: Wed, 6 Oct 2010 22:07:01 -0400
To: SPARQL Working Group <public-rdf-dawg@w3.org>, Eric Prud'hommeaux <eric@w3.org>
Cc: Axel Polleres <axel.polleres@deri.org>
Message-Id: <E8754D97-1C10-4717-8864-5A16EB5248C6@deri.org>
And one more issue on Federation:

4)

============================================================================
"Definition: Evaluation of a Service Pattern
if IRI is a service name and vars is the set of variables in pattern P
eval(Service(IRI,P)) = Invocation( IRI, Project(P, vars) )
where Invocation(Q, vars) is an implementation of the SPARQL protocol against 
endpoint IRI, with a query Q and no default-graph-uri or named-graph-uri (see 
SPARQL Protocol [SPROT] section 2.1.1.1). if IRI is not a service name, or 
if the service returns an error, evaluation fails."
============================================================================

This is not entirely clear to me... what, in particular, is the last sentence meant to say?
Does it say that whenever one service invocation fails, the whole query fails?

This worries me particularly in the context of comment 3) below, i.e. in case of
variables in SERVICE clauses. 

I think we would rather want an error in a service invocation simply to leave
the variables of the service pattern unbound. It would still be good to be able
to track SERVICE errors, though; maybe we could have SERVICE and SERVICE SILENT versions.
I am not sure, but I think this still needs discussion.
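
To make the two behaviours concrete, here is a minimal Python sketch. It is not taken from the draft: the names eval_service, ServiceError and join, and the "silent" flag, are assumptions made up for illustration, and solutions are modelled as plain dicts from variable names to values. In the strict version an endpoint error propagates and the whole query fails; in the silent version the service contributes one solution with no bindings, so its variables simply stay unbound after the join.

```python
from typing import Callable, Dict, List

Solution = Dict[str, str]  # variable name -> bound value


class ServiceError(Exception):
    """Raised when the (hypothetical) remote endpoint call fails."""


def eval_service(invoke: Callable[[], List[Solution]],
                 silent: bool = False) -> List[Solution]:
    """Evaluate a service pattern; 'invoke' stands in for the protocol call."""
    try:
        return invoke()
    except ServiceError:
        if silent:
            # One solution with no bindings: joining with it leaves the
            # variables of the service pattern unbound.
            return [{}]
        raise  # strict version: evaluation of the whole query fails


def join(left: List[Solution], right: List[Solution]) -> List[Solution]:
    """Naive join: merge every pair of solutions that agree on shared variables."""
    out = []
    for l in left:
        for r in right:
            if all(l[k] == r[k] for k in l.keys() & r.keys()):
                out.append({**l, **r})
    return out


def failing_endpoint() -> List[Solution]:
    raise ServiceError("endpoint unreachable")


local = [{"s": "ex:a", "v1": "1"}]
print(join(local, eval_service(failing_endpoint, silent=True)))
```

With silent=False the same call raises ServiceError instead, which corresponds to the "evaluation fails" reading of the definition quoted above.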

I think we should probably devote some time to cleaning up the Fed issues, or at least
collect them all in one place, in one of the next TCs.

Axel





On 5 Oct 2010, at 13:31, Axel Polleres wrote:

> Those comments are likely too late for this pub cycle since we resolved to publish, but
> just not to forget about them:
> 
> I have 2 minor ( 1) and 2) ) and one major comment 3) ...
> 
> 
> 1) In the example in section 4.1 of [1]
> 
> Join( Join( BGP(?s :p1 ?v1),
>             Service(?s :p2 ?v2) ),
>       BGP (?s :p3 ?v2) )
> 
> should be
> 
> 
> Join( Join( BGP(?s :p1 ?v1),
>             Service(<srvc>, (?s :p2 ?v2)) ),
>       BGP (?s :p3 ?v2) )
> 
> right?
> 
> 2) Another minor comment (editorial): the href http://www.w3.org/TR/sparql11-federated-query/#update
>   is a bit strange, it should be
>    http://www.w3.org/TR/sparql11-federated-query/#def-bindings
>   or alike.
> 
> 3) As for the editorial note
> "Editorial note This notion of "already bound" (note the related constraint in the grammar) is still an issue for the SPARQL Working Group, as it the question of having variables in SERVICE calls at all. Feedback from the community is encouraged."
> 
>  and further in   http://www.w3.org/TR/sparql11-federated-query/#pre-bound
> 
>  "if that variable is not bound (at least optionally)"
> 
>  I think we should refer to the definition of "potentially bound" as already noted by Andy in [2].
> The draft definition is here [3]. Shall we mention [3] at least in an editor's note in Query?
> 
> However, further regarding this point 3), I think we have various options for *evaluating*
> variables in SERVICE clauses, let me try to summarise them:
> 
> 
>   a) FILTER-style: SERVICE patterns are always assumed to be evaluated *last* in a group, depending on the bindings provided within the
>      group; one SERVICE pattern can't provide bindings for another SERVICE pattern within the same group
> 
>      i.e.
>       - { ... ?X ... SERVICE ?X {...} } is the same as { SERVICE ?X {...} ... ?X ...  }
>       - { ... SERVICE <a> {... ?X ...}  ... SERVICE ?X {...} } would not work unless ?X also appears outside a SERVICE pattern
>  
>   b) order-dependent a la OPTIONAL
> 
>       - { ... ?X ... SERVICE ?X {...} } is different from { SERVICE ?X {...} ... ?X ...  }
>       - { ... SERVICE <a> {... ?X ...}  ... SERVICE ?X {...} } would mean that the bindings obtained from service <a> serve as "input"
>         for the SERVICE ?X call...
> 
>      however, it is then fairly arguable why
> 
>      - { ... SERVICE <a> {... ?X ...}  ... SERVICE <b> {... ?X ... } }
> 
>      shouldn't behave in the same, order-dependent way, or, respectively, why we shouldn't allow variables in BINDINGS, following the same
>      rationale of order dependence.
> 
>   c) a third alternative seems to be - analogous to our notion of Dataset - to define the notion of "service set" meaning that variables
>      in a SERVICE position can only bind to values from that service set ... that would be analogous to the treatment of GRAPH patterns,
>      which syntactically already resemble SERVICE a lot.
> 
>      With c) it seems we can drop any restrictions on "potentially boundness",
>      however - small side note - it seems that we'd need analogous to FROM NAMED, new keywords FROM SERVICE or alike
>      to define the service set.
> 
> 
> At this moment, c) seems to me the most straightforward to define and in line with the treatment of GRAPH, whereas a) seems to be very restrictive and for b) I am afraid it raises questions ... Opinions?
> 
> Axel
> 
> 1. http://www.w3.org/2009/sparql/docs/fed/service.xml
> 2. http://lists.w3.org/Archives/Public/public-rdf-dawg/2010JulSep/0433.html
> 3. http://www.w3.org/2009/sparql/wiki/Potentially_bound
> 
> 
> On 22 Sep 2010, at 10:31, Andy Seaborne wrote:
> 
> > >> > > 13 Basic Federated Query
> > >> > > See basic federated query doc
> > Much of the Federation document is written in a very casual and
> > narrative fashion (very different than the Query doc; I suspect this
> > will be very obvious if the federation text is just merged with the
> > query document).
> >
> > The document never discusses the "UNDEF" token that is introduced in the
> > grammar.
> >
> > "Solution Mapping (corresponds to the Concepts and Abstract Syntax term
> > "RDF URI reference")" -- seems like a copy-paste typo.
> >
> > "For instance, an edpoint" -- sp. "endpoint".
> >
> > The examples are hard to follow because they are so domain-specific.
> >
> > "The mechanics of executing a query over a graph" -- is this meant to be
> > referring to "executing a query over a *named* graph"?
> >
> > "Typically, a GRAPH constraint is matched against an RDF graph which is
> > in the querying system, perhaps as the result of parsing the response to
> > an HTTP GET on the named graph." -- This is needless detail. A GRAPH
> > pattern is matched against named RDF graphs contained within the dataset
> > being used for the query.
> >
> > "GRAPH-constrained pattern" -- I don't know what this means.
> >
> > "Note that WSDL defines the behavior with respect to constructing HTTP
> > URLs from an endpoint and a set of query parameters, in particular
> > appending '?' or '&' to an endpoint URL which may already have them." --
> > I'm not totally sure what this means, but I'd like to suggest that there
> > should be a way to query over a custom dataset at the remote endpoint
> > using the standard SPARQL Protocol conventions (SERVICE
> > <http://example/endpoint?default-graph-uri=foo> {...}).
> >
> > "application/sparql-results" -- should be "application/sparql-results+xml"
> >
> > "For any other response, the query fails." -- Should this fail or just
> > return an empty result set? I can think of arguments for both, but
> > SERVICE blocks within OPTIONALS and UNIONS would be more useful if they
> > didn't cause the entire query to fail.
> >
> > "queryier" ??
> >
> > In the example for section 3 BINDINGS, the ?id variable is bound to
> > plain literals, but the example data from earlier in the document uses
> > xsd:integer typed literals.
> >
> > >> > > [FED]4.2 Definition of BINDINGS
> > "If a WhereClause has a BindingsClause" -- WhereClause doesn't 'have' a
> > BindingsClause. The grammar associates these two through SelectQuery
> > (with an intervening SolutionModifier).
> >
> > Section 4.2 doesn't seem to follow the same conventions as the query
> > doc. For example, "eval(BindingsSolutionSequence(P,V,St)) = Join(Rbc, P)"
> > -- isn't P (a GGP) an AST, not an algebra, concept?
> >
> > >> > > [FED]5 SPARQL Federation Extensions Grammar
> > """
> > It is a syntax error if to use a variable as the first argument to a
> > ServiceGraphPattern if that variable is not bound (at least optionally)
> > in the left hand side of a join with the ServiceGraphPattern on the right.
> > """
> >
> > "if to use" -- should be "to use"
> > This text should align with Axel's(?) proposed "potentially bound"
> > concept, but in general it seems like it's trying to talk about a syntax
> > error defined in terms of the algebra which is going to be confusing for
> > people who otherwise don't need to ever think about the algebra. Also,
> > join ordering doesn't have to use the lexical ordering, so "left hand
> > side" here isn't particularly useful.
> >
> >
> 
> 
>
Received on Thursday, 7 October 2010 02:07:42 UTC