Re: Passing distinct subquery solutions to aggregate outer query from Paul Tyson on 2013-01-24 (public-sparql-dev@w3.org from January to March 2013)

From: Paul Tyson <phtyson@sbcglobal.net>
Date: Thu, 24 Jan 2013 10:54:30 -0600
To: Paul Tyson <phtyson@sbcglobal.net>
Cc: Andy Seaborne <andy.seaborne@epimorphics.com>, "public-sparql-dev@w3.org" <public-sparql-dev@w3.org>
Message-Id: <AF9AC940-CCE8-419B-8522-5F8F00F6743A@sbcglobal.net>

Ok, just to close the loop on this:

In the subquery were clauses like:

filter not exists {?s :p "str"}

Replacing these with a different negation form eliminated the problem.

I don't know if this is per spec or a quirk of the Jena query library.
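
For illustration only, here is a sketch of the overall query shape with the negation written as MINUS instead of FILTER NOT EXISTS. The message does not say which negation form was actually used, so MINUS is an assumption, as are the prefix IRI and the predicates :a, :b and :c; the single group of triple patterns stands in for the complicated union query. MINUS and FILTER NOT EXISTS are not equivalent in every case (they differ when the two sides share no variables), so a rewrite like this should be checked against the data.

```sparql
# Hypothetical prefix; the original message does not give one.
PREFIX : <http://example.org/ns#>

SELECT ?var1 ?var2 (SUM(?var3) AS ?var3_total)
WHERE {
  # No extra outer graph pattern: the subquery alone supplies the solutions.
  {
    SELECT DISTINCT ?var1 ?var2 ?var3
    WHERE {
      # Placeholder for the complicated union query (hypothetical predicates).
      ?s :a ?var1 ;
         :b ?var2 ;
         :c ?var3 .
      # Assumed replacement for: FILTER NOT EXISTS { ?s :p "str" }
      MINUS { ?s :p "str" }
    }
  }
}
GROUP BY ?var1 ?var2
```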

Regards,
--Paul


On Jan 24, 2013, at 9:21, Paul Tyson <phtyson@sbcglobal.net> wrote:

>
> On Jan 24, 2013, at 8:45, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:
> 
>> Joining the evaluation of {} with X yields X, not a single solution of null bindings.
> 
> Yes, I just learned that reading further in the spec. Now it is more of a puzzle. I will try to work up a simple example to replicate the problem.
> 
>> 
>> Is it a typo that:
>> 
>> [[
>> select distinct ?var1 ?var ?var3
>> ]]
> 
> Yes, typo. The select variables are identical in inner and outer queries.
> 
> Thanks,
> --Paul
>> 
>> has ?var not ?var2
>> 
>>   Andy
>> 
>> 
>> On 24/01/13 14:31, Paul Tyson wrote:
>>> 
>>> Lee,
>>> 
>>> 
>>> On Jan 24, 2013, at 0:36, Lee Feigenbaum <lee@thefigtrees.net> wrote:
>>> 
>>>> Hi Paul,
>>>> 
>>>> Why would the outer query need any graph patterns other than the
>>>> subquery? You ought to be able to do exactly what you have below
>>>> without anything in the "what goes here" spot.
>>>> 
>>> That's what I thought at first, but it returns a single solution with no
>>> bindings. After studying the spec (SPARQL 1.1 section 12) I see this is
>>> probably as specified, because it joins the solution projected from the
>>> inner query to the solution from the outer query. The empty outer graph
>>> pattern returns a single solution of null bindings (per spec 5.2.1).
>>> 
>>> Regards,
>>> --Paul
>>> 
>>>> Lee
>>>> 
>>>> On 1/23/2013 3:56 PM, Paul Tyson wrote:
>>>>> Hi all,
>>>>> 
>>>>> I'm wondering if there is a simple solution to this problem.
>>>>> 
>>>>> I have a rather complicated query, consisting of several union
>>>>> clauses, which by its nature will return duplicates. I need to get a
>>>>> unique solution set so I can group them and sum a couple of fields.
>>>>> 
>>>>> Simply wrapping the union query in a nested SELECT DISTINCT doesn't
>>>>> work, because the outer query has no graph pattern to match the
>>>>> variables projected from the subquery.
>>>>> 
>>>>> I tried adding a series of BIND statements to simply rename the
>>>>> subquery variables for use by the aggregate outer query, but that
>>>>> didn't work (with Jena, at least).
>>>>> 
>>>>> The source dataset is nearly 500M triples. I'm using Jena 2.7.3. The
>>>>> subquery will return anywhere from a few dozen to a few hundred
>>>>> solutions, and by itself runs very quickly.
>>>>> 
>>>>> Here's a skeleton view of the query. Is there something to fill "what
>>>>> goes here" that will pass the subquery results up to the grouping
>>>>> function?
>>>>> 
>>>>> select ?var1 ?var2 (sum(?var3) as ?var3_total)
>>>>> where {
>>>>> { ??? what goes here ??? }
>>>>> {select distinct ?var1 ?var ?var3
>>>>> where { ... complicated union query ... }}
>>>>> }
>>>>> group by ?var1 ?var2
>>>>> 
>>>>> Or any other suggestions on how to tackle this problem?
>>>>> 
>>>>> Thanks,
>>>>> --Paul
>>>>> 
>>>>> 
>>>> 
>> 
>

Received on Thursday, 24 January 2013 17:03:07 UTC