Re: Testing aggregate value

On 3 July 2013 16:37, Olivier Austina <olivier.austina@gmail.com> wrote:
> Hi,
> I vwant to write a query to find countries that langues more the Italy and
> Germany together. I try this query:
>
> PREFIX res: <http://dbpedia.org/resource/>
> SELECT  ?o
> WHERE{
> ?o    rdf:type <http://dbpedia.org/ontology/Country>;
>    dbpedia-owl:language  ?l1.
>   res:Germany    rdf:type <http://dbpedia.org/ontology/Country>;
>    dbpedia-owl:language  ?l2.
> res:Italy    rdf:type <http://dbpedia.org/ontology/Country>;
>    dbpedia-owl:language  ?l3.
> }
> GROUP BY ?o
> HAVING(count(?l1)> count(?l2)+count(?l3))

Assume your data has countries Germany (abbreviated DE), Italy (IT),
and Switzerland (CH) and that in DE they speak only German (G), in IT
they speak Italian (I) and German (G) and CH they speak G, I, and
French (F).

For your query the BGP is evaluated and only over the resulting
solutions the grouping and counting will happen. For the BGP you get a
lot of matches:
o    |  l1  |  l2  |  l3
---------------------------
DE |  G  |  G  |  I (country is DE where they speak G plus G and I for
DE and IT)
DE |  G  |  G  |  G (as above, but in IT they also speak G)
----------------------------
IT  |   I  |  G  |  I (county is IT where they speak I, plus G and I
for DE and IT)
IT  |   I  |  G  |  G (as above, but in IT they also speak G)
IT  |   G  |  G  |  I  ...
IT  |   G  |  G  |  G
----------------------------
CH | G   | G   |  I
CH | G   | G   |  G
CH |  I   | G   |  I
CH |  I   | G   |  G
CH | F   | G   |  I
CH | F   | G   |  G

The dashed lines already indicate the groups that are formed after BGP
matching according to your group by clause.

Now you also wanted to filter *within* the groups. Here you count how
many bindings you have for l1, l2 and l3 within the first, the second
and the third group. In the first group you have 2 bindings for l1, l2
and l3. In the second it's 4, in the third 6. The counting and the
comparison happens however per group. Hence you compare 2 > 2+2 for
the first group, 4 > 4+4 for the second and 6 > 6+6 for the third.
Each comparison evaluates to false and all results do not pass the
filter, hence no answer.

A solution that I see is to use subqueries:

SELECT ?c
WHERE {
  {
    SELECT (COUNT(?l_de) AS ?lang_count_de)
    WHERE {
        res:Germany a dbpedia-ont:Country; dbpedia-owl:language ?l_de.
    }
  {
    SELECT (COUNT(?l_it) AS ?lang_count_it)
    WHERE {
        res:Italy a dbpedia-ont:Country; dbpedia-owl:language ?l_it.
    }
  }
  ?c a dbpedia-ont:Country; dbpedia-owl:language  ?l.
}
GROUP BY ?c
HAVING(count(?l) > SAMPLE(?lang_count_de) + SAMPLE(?lang_count_it))

I didn't test the query though and maybe there are even better solutions.
You could also group by ?c, ?lang_count_de, ?lang_count_it (since the
values for the language counts are always the same), then you can
avoid using SAMPLE().

Birte

> But it doesn't work.Any suggestion is welcome. Thanks.
>
> Regards
> Olivier
>



-- 
Jun. Prof. Dr. Birte Glimm            Tel.:    +49 731 50 24125
Inst. of Artificial Intelligence         Secr:  +49 731 50 24258
University of Ulm                         Fax:   +49 731 50 24188
D-89069 Ulm                               birte.glimm@uni-ulm.de
Germany

Received on Wednesday, 3 July 2013 16:59:01 UTC