Property Path Cardinality Mixing from Andy Seaborne on 2012-03-18 (public-rdf-dawg@w3.org from January to March 2012)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sun, 18 Mar 2012 16:46:35 +0000
To: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4F66116B.8050404@epimorphics.com>

If some forms are distinct and some are counting, then we need an
account of what happens when they are mixed.

The obvious (to me) way is that it simply follows the combination of
operators.  Distinct operators yield only distinct results, even when
used over a path expression which may have duplicates; counting
operators that combine distinct sub-operators may cause duplicates but
don't change the sub-operators.

To support this with examples, assume | is not distinct and * is
distinct (and likewise +, which the examples use):

Example 1:  (:a|:b)+ is distinct.
Example 2:  :a+|:b+  is not distinct.

Data:
:x1 :a :x2 .
:x1 :b :x2 .


{ :x1 :a+ ?X }
    => ?X = :x2

{ :x1 :b+ ?X }
    => ?X = :x2

Then { :x1 (:a|:b)+ ?X }
    => ?X = :x2

But  { :x1 :a+|:b+ ?X }
      => { :x1 :a+ ?X } UNION { :x1 :b+ ?X }
      =>    ?X = :x2 and ?X = :x2
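
The two examples can be reproduced with a small sketch. This is only an illustration under stated assumptions, not how any particular engine works: the function names (link, alt, one_or_more) and the in-memory TRIPLES list are hypothetical, alternation is modelled as list concatenation (counting), and + is modelled as a reachability walk that reports each node once (distinct).

```python
# Toy data: the two triples from the message.
TRIPLES = [(":x1", ":a", ":x2"), (":x1", ":b", ":x2")]

def link(pred):
    """Single predicate step: one result per matching triple."""
    def step(node):
        return [o for (s, p, o) in TRIPLES if s == node and p == pred]
    return step

def alt(left, right):
    """Counting alternation: concatenate both sides, like UNION, keeping duplicates."""
    def step(node):
        return left(node) + right(node)
    return step

def one_or_more(path):
    """Distinct '+': each reachable node is reported once,
    even if the sub-path yields it several times."""
    def step(node):
        seen = []                      # ordered list of distinct end nodes
        frontier = list(path(node))    # copy so the sub-path's result is not mutated
        while frontier:
            n = frontier.pop()
            if n not in seen:
                seen.append(n)
                frontier.extend(path(n))
        return seen
    return step

# Example 1: (:a|:b)+ evaluated from :x1
example1 = one_or_more(alt(link(":a"), link(":b")))(":x1")
# Example 2: :a+|:b+ evaluated from :x1
example2 = alt(one_or_more(link(":a")), one_or_more(link(":b")))(":x1")

print(example1)  # [':x2']
print(example2)  # [':x2', ':x2']
```

In Example 1 the inner alternation does produce :x2 twice, but the enclosing + reports it once. In Example 2 each + is distinct on its own, and the outer alternation adds the two results together.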

In implementation terms, it just works out by combination of subparts.
(One optimization is a one-bit flag that says whether a form is being
evaluated inside a context that will be distinct; if so, the form only
needs to produce the right distinct results and can otherwise be loose
about extra cardinality, for efficiency.)

 Andy

Received on Sunday, 18 March 2012 16:47:07 UTC