Re: Action 369 -- Look at property path tests from Matthew Perry on 2011-01-31 (public-rdf-dawg@w3.org from January to March 2011)

From: Matthew Perry <matthew.perry@oracle.com>
Date: Mon, 31 Jan 2011 10:50:57 -0500
To: Andy Seaborne <andy.seaborne@epimorphics.com>
CC: W3C SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4D46DA61.6010309@oracle.com>

Hi Andy,

I follow your examples and agree with the results. Here is an excerpt from pp16 that I don't understand.

Data:
:d foaf:knows :e .
:e foaf:knows :f .
:f foaf:knows :e .
:f foaf:name "test" .

query pattern:
{ ?x foaf:knows+ ?y }

Result Fragment:
?x=:d, ?y=:d
?x=:d, ?y=:e
*?x=:d, ?y=:e*
?x=:d, ?y=:f

I don't understand where the second ?x=:d, ?y=:e comes from. I assume it's from the path
:d -- foaf:knows -> :e -- foaf:knows -> :f --  foaf:knows -> :e
but it seems like this path should be excluded because of a cycle.

Thanks,
Matt

On 1/29/2011 11:59 AM, Andy Seaborne wrote:
>
>
> On 25/01/11 17:56, Matthew Perry wrote:
>> Hi Andy,
>>
>> Thanks for the explanation. I completely understand and agree with the
>> cycles in fixed-length property paths.
>>
>> My concern is with the arbitrary-length paths. I would strongly prefer
>> to have a purely node-marking algorithm instead of triple marking. The
>> pure node marking seems to better match the resolution we passed back in
>> June: http://www.w3.org/2009/sparql/meeting/2010-06-01#resolution_5.
>> (The two solutions with cycles both repeat nodes).
>>
>> PROPOSED: The cardinality of solutions to fixed-length paths
>> is the same as the cardinality of solutions to the path expanded into
>> triple patterns (with all variables projected);*the cardinality of
>> solutions to variable-length paths is the cardinality of solutions
>> via paths that do not repeat nodes;*  the cardinality of solutions to
>> paths combining fixed and variable length (elt{n,} ) is a combination
>> of the fixed definition plus the variable definition for paths longer
>> than the fixed length.
>>
>> Thanks,
>> Matt
>
> Matt,
>
> :p* is defined to be :p{0} UNION :p+
>
> The duplicate comes from the two parts
>
> { :a :p* ?x }
> =
> { :a :p{0} ?x } UNION { :a :p+ ?x }
>
> Data:
> :a :p :b .
> :b :p :c .
> :c :p :a .
>
>
> So the composition of patterns means that
>
> { :a :p{0} ?x } => ?x=:a
> { :a :p+ ?x } => ?x = :b, :c, :a
>
> ?x=:a occurs because of the two parts.  I don't think it directly violates the part you quote.  The algorithm is to put the endpoint of triple in the results, as the edge is traversed.
>
> This could be changed in the spec: define an operator specifically for :p*, not making it the multi-set-union of other forms.
>
> An effect of this is that same results come from:
>
> :a :p :b .
> :b :p :c .
>
> i.e. cycle and a non-cycle with removal of a triple for the last hop, are the same.
>
> I have no strong opinion either way - I'd like to see some more use cases.
>
> - - - - - - - -
>
> A separate issue is when there are multiple paths/loops in the graph.
> If there are two loops, then :p+ leads to duplicates.
>
> { :a :p+ ?X }
>
> :a :p :b .
> :b :p :c .
> :c :p :a .
>
> :a :p :x .
> :x :p :y .
> :y :p :a .
>
> ------
> | X  |
> ======
> | :x |
> | :y |
> | :a |
> | :b |
> | :c |
> | :a |
> ------
>
> As with the comment from Jorge, we have to be careful about use with aggregates and literals.  This is not illustrated by this example unless literals can be subjects.  Ideally, I'd like to not design-out literals-as-subjects as RDF-MT already allows it, even if the syntaxes do not.
>
>     Andy
>
>

Received on Monday, 31 January 2011 15:52:26 UTC