Re: Missing LET (Assignment) in SPARQL 1.1 from Lee Feigenbaum on 2009-10-27 (public-rdf-dawg-comments@w3.org from October 2009)

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Mon, 26 Oct 2009 21:31:05 -0400
To: Holger Knublauch <yahoo@knublauch.com>, SPARQL Working Group Comments <public-rdf-dawg-comments@w3.org>
Message-ID: <4AE64D59.3020603@thefigtrees.net>

Holger Knublauch wrote:
> Thanks, Lee. I appreciate you taking the time to assemble all this 
> information.
> 
> I have made some experiments with the proposal to use sub-selects plus 
> project expressions in a random sample of some of my typical queries. 
> You can see three cases below. Without understanding all implications 
> from a SPARQL engine and algebra point of view, my impression is that 
> the mapping appears to be straightforward, but that it leads to very 
> verbose code. And I did not even try to find the really bad cases.
> 
> I am therefore wondering whether LET can be introduced as syntactic 
> sugar similar to some of the new OWL 2 extensions that do not change the 
> semantics but only provide additional mappings from syntax to semantics 
> - this is hopefully easier to manage for the WG?

Hi Holger,

In short, the group is free under its charter to consider this as a 
purely syntactic sugar extension, but given the process we've been 
through to this point, I don't expect it to happen.

That said, I have some comments / questions below in the interests of 
discussion :-)

> 
> Thanks,
> Holger
> 
> 
> 
> ----
> 
>  From the currency conversion example on my blog
> 
> http://composing-the-semantic-web.blogspot.com/2009/09/currency-conversion-with-units-ontology.html
> 
> The original current query is
> 
> *SELECT* ?newLiteral
> *WHERE* {
>     *LET* (?fromCurrency := datatype(?arg1)) .
>     *LET* (?rate := currencies:getRateByCurrencies(?fromCurrency, ?arg2)) .
>     *LET* (?fromValue := xsd:double(?arg1)) .
>     *LET* (?newValue := (?fromValue * ?rate)) .
>     *LET* (?newLiteral := smf:cast(?newValue, ?arg2)) .
> }
> 
> Using nested queries with well-meaning formatting would create something 
> like
> 
> *SELECT* ?newLiteral
> *WHERE* {
> {
> *SELECT* (datatype(?arg1) *AS* ?fromCurrency) 
> (xsd:double(?arg1) *AS* ?fromValue) *WHERE* {} 
> }
> {
>      
> *SELECT* (currencies:getRateByCurrencies(?fromCurrency, ?arg2) *AS* ?rate) *WHERE* {}
> }
> {
> *SELECT* ((?fromValue * ?rate) *AS* ?newValue) *WHERE* {}
> }
> {
> *SELECT* (smf:cast(?newValue, ?arg2) *AS* ?newLiteral) *WHERE* {}
> }
> }

This example actually is a good example of one thing that concerns me 
about assignment (and remember that my implementation is one that does 
support LET expressions): I'm concerned whenever a new SPARQL construct 
has an order-dependence. SPARQL is already order-dependent in cases 
involving OPTIONAL, but I prefer to keep as much of SPARQL 
order-independent as is possible. The above collection of assignments 
reads OK because of the order they're presented in, but if you switch 
the order around it's not at all clear to me what the proper algebraic 
expectations would be.
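
To make the concern concrete, here is a sketch of the same query with one 
of the assignments moved up (LET is extension syntax as in ARQ, and the 
prefixes are omitted as in your original; I am not describing what any 
particular engine does with it):

```sparql
# Same five assignments as in the original query, reordered.
# ?arg1 and ?arg2 are assumed to be pre-bound externally, as before.
SELECT ?newLiteral
WHERE {
    LET (?fromCurrency := datatype(?arg1)) .
    # ?fromValue and ?rate are only assigned further down
    LET (?newValue := (?fromValue * ?rate)) .
    LET (?rate := currencies:getRateByCurrencies(?fromCurrency, ?arg2)) .
    LET (?fromValue := xsd:double(?arg1)) .
    LET (?newLiteral := smf:cast(?newValue, ?arg2)) .
}
```

Read top to bottom, ?newValue is computed before ?rate and ?fromValue 
have been assigned; read as an unordered set of constraints, one might 
expect the same answer as before. Which of the two is intended is what 
the algebra would have to pin down.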

> 
> Using a single expression would be
> 
> *SELECT* (smf:cast((xsd:double(?arg1) * 
> currencies:getRateByCurrencies(datatype(?arg1), ?arg2)), ?arg2) *AS* ?newLiteral)
> *WHERE* {
> }
> 
> The example is a bit atypical because it exclusively uses LETs, and not 
> even a triple match. It also uses externally pre-bound variables. But 
> still it gives some insights.
> 
> 
> ---
> 
> In the following function body, project expressions actually work ok, 
> but keep fingers crossed that you do not have to return several such 
> computed values in the SELECT:

You can project multiple expressions from a single subquery, so I'm not 
sure that's a concern?
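
As a rough sketch of what I mean (the ex: vocabulary and the variable 
names here are made up for illustration, not taken from your queries):

```sparql
PREFIX ex: <http://example.org/>

# One subquery projecting two computed values side by side.
SELECT ?item ?net ?gross
WHERE {
    {
        SELECT ?item
               ((?price * ?quantity) AS ?net)
               ((?price * ?quantity * 1.2) AS ?gross)
        WHERE {
            ?item ex:price ?price .
            ?item ex:quantity ?quantity .
        }
    }
}
```

Both ?net and ?gross come out of the same subquery, so there is no need 
for one nested SELECT per computed value.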

> *SELECT* ?value
> *WHERE* {
>     *?arg2* qud:conversionMultiplier ?M1 .
>     *?arg2* qud:conversionOffset ?O1 .
>     *?arg3* qud:conversionMultiplier ?M2 .
>     *?arg3* qud:conversionOffset ?O2 .
>     *LET* (?value := ((((*?arg1* * ?M1) + ?O1) - ?O2) / ?M2)) .
> }
> 
> would become
> 
> *SELECT* (((((*?arg1* * ?M1) + ?O1) - ?O2) / ?M2) *AS* ?value)
> *WHERE* {
>     *?arg2* qud:conversionMultiplier ?M1 .
>     *?arg2* qud:conversionOffset ?O1 .
>     *?arg3* qud:conversionMultiplier ?M2 .
>     *?arg3* qud:conversionOffset ?O2 .
> }
> 
> ---
> 
> Here is an example from the SPIN box computer game, using LET in SPARQL 
> rules. This is a very typical use case actually:
> 
> # Rule1: Collect and replace diamond if possible
> *CONSTRUCT* {
>     ?diamond spinbox:replaceWith spinbox:Space .
>     ?world boulders:diamondsCollected ?newDiamondsCount .
> }
> *WHERE* {
>     ?world spinbox:field *?this* .
>     ?world spinbox:keyDirection ?direction .
>     *LET* (?diamond := spinbox:getNeighbor(*?this*, ?direction)) .
>     ?diamond a boulders:Diamond .
>     ?world spinbox:field *?this* .
>     ?world boulders:diamondsCollected ?oldDiamondsCount .
>     *LET* (?newDiamondsCount := (?oldDiamondsCount + 1)) .
> }
> 
> This would become
> 
> # Rule1: Collect and replace diamond if possible
> *CONSTRUCT* {
>     ?diamond spinbox:replaceWith spinbox:Space .
>     ?world boulders:diamondsCollected ?newDiamondsCount .
> }
> *WHERE* {
>     ?world spinbox:field *?this* .
>     ?world spinbox:keyDirection ?direction .
> {
> *SELECT* (spinbox:getNeighbor(*?this*, ?direction) *AS* ?diamond)
> *WHERE* {
> }
> }
>     ?diamond a boulders:Diamond .
>     ?world spinbox:field *?this* .
>     ?world boulders:diamondsCollected ?oldDiamondsCount .
> {
>      *SELECT* ((?oldDiamondsCount + 1) *AS* ?newDiamondsCount)
> *WHERE* {
> }
> }
> }
> 

Again, why not put both calculations in a single subquery? I don't know 
what "*?this*" is, but I'd expect this query to be much easier to read as:

CONSTRUCT {
     ?diamond spinbox:replaceWith spinbox:Space .
     ?world boulders:diamondsCollected ?newDiamondsCount .
}
WHERE {
   SELECT ?world (?oldDiamondsCount + 1 AS ?newDiamondsCount)
          (spinbox:getNeighbor(?this, ?direction) AS ?diamond) {
     ?world spinbox:field ?this .
     ?world spinbox:keyDirection ?direction .
     ?diamond a boulders:Diamond .
     ?world spinbox:field ?this .
     ?world boulders:diamondsCollected ?oldDiamondsCount .
   }
}

Lee

> On Oct 25, 2009, at 8:34 PM, Lee Feigenbaum wrote:
> 
>> Hi Holger,
>>
>> Thanks for the feedback. Unfortunately, assignment is not on the 
>> current Working Group's road map for standardization at this time. 
>> Here's how we got to this point:
>>
>> From roughly March through May, the WG considered around 40 potential 
>> new features[1] for the SPARQL landscape, including assignment[2]. At 
>> the time, we documented two implementations (ARQ and Open Anzo) and 
>> the support that you expressed for the feature back in March[3].
>>
>> In going through the features, the WG discussed Assignment in our 
>> March 31 teleconference. You can see the discussion at the time at 
>> [4], the result of which was a straw poll of 7/6/3, indicating 
>> some support and several notes of concern.
>>
>> Later in the process, we took a survey of WG members' prioritized 
>> preferences of the proposed features. Steve Harris whipped up the 
>> Condorcet results of the survey which you can see at [5]. Assignment 
>> was in the middle of the pack.
>>
>> At the group's first face-to-face meeting in May, assignment was 
>> discussed once more[6], with significant concerns expressed from 
>> Garlik and OpenLink Software, strong support from Clark & Parsia, and 
>> expressions ranging from indifference to mild support from other WG 
>> members (as I read the minutes & recollect the conversation). In the 
>> end, the group resolved to accept the list of feature proposals at 
>> [7], and Kendall's concerns about the relationship between assignment 
>> and projected expressions were addressed by Steve H at [8].
>>
>> Since then, the group has been rechartered with a specific mandate to 
>> work on the features decided during the first phase of the group's 
>> lifetime[9]. It's my hope & belief that if projected expressions do 
>> not end up fulfilling most users' needs, implementors will extend 
>> their SPARQL implementations with assignment or a similar capability, 
>> and we will then revisit this in the next round of SPARQL standardization.
>>
>> hope this is helpful,
>> Lee
>>
>>
>> [1] http://www.w3.org/2009/sparql/wiki/Category:Features
>> [2] http://www.w3.org/2009/sparql/wiki/Feature:Assignment
>> [3] 
>> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2009Mar/0009.html
>> [4] http://www.w3.org/2009/sparql/meeting/2009-03-31#assignment
>> [5] http://plugin.org.uk/misc/votes2.svg
>> [6] http://www.w3.org/2009/sparql/meeting/2009-05-06
>> [7] 
>> http://www.w3.org/2009/sparql/wiki/index.php?title=FeatureProposal&oldid=744 
>>
>> [8] 
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2009AprJun/0231.html
>> [9] http://www.w3.org/2009/05/sparql-phase-II-charter
>>
>> Holger Knublauch wrote:
>>> Dear WG,
>>>
>>> reading through the drafts (great to have them already!) I am 
>>> confused about the future of Assignments (LET keyword in Jena) which 
>>> has proven to be absolutely essential for many of our customers' 
>>> projects. The SPARQL 1.1 working group seems to have converged in 
>>> favor of supporting Project expressions and subqueries only, but 
>>> these IMHO fail to address the requirements below.
>>>
>>> Problem 1: How to create new values for CONSTRUCT queries
>>>
>>> Project expressions solve some problems for SELECT queries, but the 
>>> major use cases of LET have been in CONSTRUCT queries. I only see 
>>> subqueries as a (poor) way of creating new values for use in the 
>>> CONSTRUCT clause. Creating a subquery for every LET looks like a very 
>>> user-unfriendly mechanism to me.
>>>
>>> Problem 2: Verbosity
>>>
>>> We often work with complex transformations such as string operations 
>>> that are best split into multiple steps. Project expressions do not 
>>> allow using intermediate variables, such as below, and would force 
>>> users to chain together very long spaghetti expressions such as 
>>> SELECT ?x (ex:function3(ex:function2(ex:function1(?y))) AS ?r). Imagine 
>>> this with some more complex expressions and it quickly becomes 
>>> completely unreadable. Also, consider that you might want to reuse 
>>> intermediate values in multiple places, to avoid duplicate processing.
>>>
>>> *SELECT* ?x ?r
>>> *WHERE* {
>>> ?x ex:property ?y .
>>> *LET* (?helper1 := ex:function1(?y)) .
>>> *LET* (?helper2 := ex:function2(?helper1)) .
>>> *LET* (?r := ex:function3(?helper2)) .
>>> }
>>>
>>> The LET keyword has solved both problems nicely and in the most 
>>> general way, and would make project expressions superfluous.
>>>
>>> I would appreciate pointers to the discussions that led to the 
>>> decision to not support Assignments at this stage.
>>>
>>> Thanks
>>> Holger
>>>
>>> PS: For a parallel thread on jena-dev (with Andy's response), see
>>> http://tech.groups.yahoo.com/group/jena-dev/message/41903
>
Received on Tuesday, 27 October 2009 01:31:45 UTC