Re: can subqueries be executed first in SPARQL? from Peter F. Patel-Schneider on 2016-06-17 (public-sparql-dev@w3.org from April to June 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Fri, 17 Jun 2016 14:25:55 -0700
To: james anderson <james@dydra.com>, public-sparql-dev@w3.org
Message-ID: <4f5e1dc3-0484-1d6c-cc44-f72056c24708@gmail.com>

On 06/17/2016 12:55 PM, james anderson wrote:
> good evening;
[...]

>>> we understand the recommendation differently.
>>> in particular, from the document context, it should be a “graph pattern”.
>>
>> You appear to be saying that definition of substitute should be
>>
>> *************
>> Definition: Substitute
>>
>> Let μ be a solution mapping
>>  substitute(pattern, μ) = the pattern formed by replacing every occurrence
>>    of a variable v in pattern by μ(v) for each v in dom(μ), for pattern a
>>    SPARQL algebra construct that is a graph pattern
>>  substitute(xyxy, μ) = xyxy, otherwise
>> *************
>
> that would not be adequate.
> it would be necessary to define this specific to the lexical boundary which
> applies to each form.

OK, so you don't think that the simplest fix is correct.

[As above but also requiring that substitutions only happen when the
variable is in scope according to SPARQL.]

>> This at least adds the scope rules from SPARQL into the picture.
>> Unfortunately the scope rules in SPARQL don't help here, for multiple
>> reasons, including that they do not distinguish between what might be
>> considered different variables with the same name.
>
> please elaborate.

SPARQL doesn't have the concept of different variables with the same name.
Variables in SPARQL are just tokens (of the form ?v or $v with v having a
particular syntactic form).  Whether the variable v is in-scope or not only
depends on this token, v, and not at all on which occurrence of the token is
being considered.  In many programming languages you can't (just) talk about
a particular variable token being in scope, you have to talk about a
particular variable definition being in scope.
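
To illustrate the contrast, here is a minimal Python sketch (Python is only
an example language here; nothing in it is SPARQL).  The token x names two
different variables, and which one is meant depends on the definition that
is in scope at each occurrence:

```python
x = "outer"          # one variable named x

def f():
    x = "inner"      # a different variable that shares the token "x"
    return x

inner_value = f()    # refers to the definition inside f
outer_value = x      # refers to the module-level definition
print(inner_value, outer_value)
```

SPARQL's in-scope notion has no counterpart to this distinction: it asks
only whether the token is in scope.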

>> Or maybe substitute should drop variables from the solution mapping as soon
>> as the variable goes out of scope.  That's sounding better, but needs
>> careful consideration of just when to drop variables and what to do with,
>> for example, Projects whose variable has been substituted.
>
> we agree on the necessity to examine each type of form.

Well I agree that the effects of EXISTS *should* depend on the form of the
algebra construct that is its argument.  However, I do not agree that it is
*necessary* to do so in the current definition of SPARQL.  On the contrary, I
argue that the current definition of SPARQL mandates that EXISTS be
evaluated only by replacing variable tokens in its argument, without any
consideration at all of where they occur.
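
A rough sketch of that reading, in Python.  The nested-tuple encoding of
algebra constructs and the names used are assumptions made for illustration
only; they are not taken from the specification.  The point is that the
replacement is purely token-based, so it also reaches the variable list of
a Project:

```python
# Toy algebra: nested tuples; variables are strings starting with "?".
# This representation is hypothetical, not the specification's.

def substitute(pattern, mu):
    """Replace every occurrence of a variable in dom(mu) by mu(v),
    with no regard for where in the pattern the token occurs."""
    if isinstance(pattern, tuple):
        return tuple(substitute(part, mu) for part in pattern)
    if isinstance(pattern, str) and pattern.startswith("?"):
        return mu.get(pattern, pattern)
    return pattern

# A hypothetical sub-select as the argument of EXISTS:
# project ?x out of the basic graph pattern { ?x :p ?y }.
inner = ("project", ("?x",), ("bgp", ("?x", ":p", "?y")))
mu = {"?x": ":a"}

result = substitute(inner, mu)
print(result)
```

In this sketch the Project ends up with the term :a where its variable
used to be, which is one of the technical problems with substitute.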

>>  As well, there
>> is no notion of scope in the SPARQL algebra so this has to be either
>> defined there or scope dragged into the algebra during translation or some
>> other method for determining when to drop variables devised that matches
>> scoping.
>
> while i would agree, were the claim to have been that the definitions for the
> scope and extent of variable bindings in sparql are incomplete, i do not agree
> that there are none.
> were there none, it would be difficult for a processor to translate sparql to
> lisp, then compile that to native code and produce execution results which,
> with limited exceptions - some of which have exactly to do with rules for
> scope and extent, conform to those stipulated by the test suite.

Not at all.  The SPARQL algebra, which is what is important here, has no
notion of scope at all.  When considering expressions in Filter and Extend
there is only one multiset of solution mappings to consider when determining
how to interpret a variable.  Variables elsewhere either generate multisets
from information in a graph or are in multisets that are combined together
or generate solution sequences from multisets or solution sequences, again
without any notion of scope.  Many constructs in the SPARQL algebra
manipulate these multisets or solution sequences, but this is again without
any consideration of variable scope.

This is actually a variation on how some early LISP implementations worked.
They had a stack of variable bindings, which certain constructs augmented
and which was used to find the current value of a variable in a process
that considered the variable only as a token, with no consideration of scope.
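
A minimal sketch of that kind of lookup, in Python.  All names are
hypothetical and this is not a description of any particular LISP
implementation; it only shows a token being resolved against a stack of
bindings without reference to where the token is written:

```python
bindings = []  # stack of (name, value) pairs, most recent last

def lookup(name):
    """Return the most recent binding for the token, wherever it was made."""
    for bound_name, value in reversed(bindings):
        if bound_name == name:
            return value
    raise NameError(name)

def with_binding(name, value, thunk):
    """Push a binding, run thunk, then pop the binding again."""
    bindings.append((name, value))
    try:
        return thunk()
    finally:
        bindings.pop()

def show_x():
    # Not written inside any binder of "x"; sees whatever is on the stack.
    return lookup("x")

outer = with_binding("x", 1, show_x)
nested = with_binding("x", 1, lambda: with_binding("x", 2, show_x))
print(outer, nested)
```

The same function show_x yields different values depending only on what
has been pushed on the stack when it runs.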

[...]

> is it possible to interpret a variable absent a definition of its scope and
> extent?

Yes indeed, as I have tried to explain above.

Of course, it is always possible to infer some scope and extent properties
of a language from a definition that doesn't use scope.

> 18.2.1 does not suggest that the recommendation believes that.
> their nature is one of the things which defines a language.

18.2.1 talks about variable scope.  However, all that it ends up doing is
determining when certain syntactic constructs are syntactically illegal and
determining how to translate SELECT *.  It doesn't do many of the things
that one would think of as needed for variable scoping in a programming
language, like determining whether a variable mention is a local or a global
variable.

One could say that this notion of scoping is so important for SPARQL that it
should affect the meaning of EXISTS.  For example, that only substitutions
that are in-scope for SELECT ... should be carried through into its
pattern.  I don't see any support for anything like this in the SPARQL 1.1
Query specification.

> that sparql’s conception does not agree with that of basic - or any other
> “programming language”, does not, in itself diminish the conception.
> yes, the language would be well served, should someone have the opportunity to
> work through it with a clear notion of those.
> that does not mean, they do not exist.
>
> where, with exists, you took the half full glass, filled it, and then complain
> that you do not like the blend, here you take the half full glass, empty it
> and then complain, that you were not served.

Not at all.  I took this clear part of the definition of EXISTS in the
SPARQL 1.1 Query specification, did nothing to it, and haven't complained
that it is incomplete.  (There are indeed technical problems with substitute
and I have indeed complained that the result of this part of the definition
is counter to informal wording elsewhere in the specification and also
counter to intuitions.  This is different from complaining that there is
something missing from the definition of EXISTS.)  Similarly I have not
complained at all that there is something lacking in SPARQL because it is
lacking a notion of scope.

>>> my approach, as indicated earlier, has been to employ dynamic bindings, with
>>> the intent follow the incompletely defined intent of substitution, to the
>>> extent is can be reconciled with lexical contours, as established by binding
>>> forms.
>>
>> Well I suppose that that is a possible way to implement something, but I
>> don't think that that ends up conforming to the definition of SPARQL.
>
> that depends on your definition of sparql.
> you propose one which extends beyond mine.
> the described implementation conforms to the language definition with respect
> to exists in so far as the test suite validates it.

If that is the bar that you think is sufficient to pass then sure, you pass
the test suite.  Just remember that a test suite is a form of
debugging.  It can show that problems exist but not show that no problems exist.

> early on in this exchange, i suggested the step to formulate the tests which
> demonstrate the significant issues and add them to the w3c [SPARQL] test
> suite in order to set milestones.
> once they exist, one can respond to your supposition.

How can one add tests to the W3C SPARQL test suite?

peter
Received on Friday, 17 June 2016 21:26:26 UTC