W3C home > Mailing lists > Public > public-rdf-dawg@w3.org > January to March 2011

Re: Action 369 -- Look at property path tests

From: Matthew Perry <matthew.perry@oracle.com>
Date: Tue, 01 Feb 2011 09:54:08 -0500
Message-ID: <4D481E90.4070308@oracle.com>
To: Andy Seaborne <andy.seaborne@epimorphics.com>
CC: W3C SPARQL Working Group <public-rdf-dawg@w3.org>
Hi Andy,

Thanks for the explanation. I would strongly prefer to modify the algorithm to use node marking instead of edge marking. I'll have to think about the other issue you brought up.

- Matt

On 2/1/2011 7:00 AM, Andy Seaborne wrote:
>
>
> On 31/01/11 15:50, Matthew Perry wrote:
>> Hi Andy,
>>
>> I follow your examples and agree with the results. Here is an excerpt
>> from pp16 that I don't understand.
>>
>> Data:
>> :d foaf:knows :e .
>> :e foaf:knows :f .
>> :f foaf:knows :e .
>> :f foaf:name "test" .
>>
>> query pattern:
>> { ?x foaf:knows+ ?y }
>>
>> Result Fragment:
>> ?x=:d, ?y=:d
>> ?x=:d, ?y=:e
>> *?x=:d, ?y=:e*
>> ?x=:d, ?y=:f
>>
>> I don't understand where the second ?x=:d, ?y=:e comes from. I assume
>> it's from the path
>> :d -- foaf:knows -> :e -- foaf:knows -> :f -- foaf:knows -> :e
>> but it seems like this path should be excluded because of a cycle.
>
>
>
>
> As it is currently defined, we get the "?x=:d, ?y=:e" because it records triples traversed (as per Bite's suggestion in the telecon, we don't have good minutes though).  So, yes, it gets "?x=:d, ?y=:e" from :d--foaf:knows-->:e then one from the cycle.
>
> This is related to the two-loop example I gave, which I think does need duplicates - here it's no cycles and one cycle.
>
> I'm not sure whether this will work but if the definitional algorithm records points-along-a-path then it can check for duplicates such as this one, but not exclude the 2-cycle example.  Needs checking - it depends on whether the path starts inside the cycle or enters it (as in your example) as part of processing.
>
> Also, there are some potential intuitive invariants we may wish to consider in refining the spec:
>
> We already have:
>
> { ?x foaf:knows* ?y }
> ==
> { ?x foaf:knows{0} ?y } UNION {?x foaf:knows+ ?y }
>
> but what about:
>
> { ?x foaf:knows+ ?y }
> =?=
> { ?x foaf:knows/foaf:knows* ?y }
>
>     Andy
>
>>
>> Thanks,
>> Matt
>>
>> On 1/29/2011 11:59 AM, Andy Seaborne wrote:
>>>
>>>
>>> On 25/01/11 17:56, Matthew Perry wrote:
>>>> Hi Andy,
>>>>
>>>> Thanks for the explanation. I completely understand and agree with the
>>>> cycles in fixed-length property paths.
>>>>
>>>> My concern is with the arbitrary-length paths. I would strongly prefer
>>>> to have a purely node-marking algorithm instead of triple marking. The
>>>> pure node marking seems to better match the resolution we passed back in
>>>> June: http://www.w3.org/2009/sparql/meeting/2010-06-01#resolution_5.
>>>> (The two solutions with cycles both repeat nodes).
>>>>
>>>> PROPOSED: The cardinality of solutions to fixed-length paths
>>>> is the same as the cardinality of solutions to the path expanded into
>>>> triple patterns (with all variables projected);*the cardinality of
>>>> solutions to variable-length paths is the cardinality of solutions
>>>> via paths that do not repeat nodes;* the cardinality of solutions to
>>>> paths combining fixed and variable length (elt{n,} ) is a combination
>>>> of the fixed definition plus the variable definition for paths longer
>>>> than the fixed length.
>>>>
>>>> Thanks,
>>>> Matt
>>>
>>> Matt,
>>>
>>> :p* is defined to be :p{0} UNION :p+
>>>
>>> The duplicate comes from the two parts
>>>
>>> { :a :p* ?x }
>>> =
>>> { :a :p{0} ?x } UNION { :a :p+ ?x }
>>>
>>> Data:
>>> :a :p :b .
>>> :b :p :c .
>>> :c :p :a .
>>>
>>>
>>> So the composition of patterns means that
>>>
>>> { :a :p{0} ?x } => ?x=:a
>>> { :a :p+ ?x } => ?x = :b, :c, :a
>>>
>>> ?x=:a occurs because of the two parts. I don't think it directly
>>> violates the part you quote. The algorithm is to put the endpoint of
>>> triple in the results, as the edge is traversed.
>>>
>>> This could be changed in the spec: define an operator specifically for
>>> :p*, not making it the multi-set-union of other forms.
>>>
>>> An effect of this is that same results come from:
>>>
>>> :a :p :b .
>>> :b :p :c .
>>>
>>> i.e. cycle and a non-cycle with removal of a triple for the last hop,
>>> are the same.
>>>
>>> I have no strong opinion either way - I'd like to see some more use
>>> cases.
>>>
>>> - - - - - - - -
>>>
>>> A separate issue is when there are multiple paths/loops in the graph.
>>> If there are two loops, then :p+ leads to duplicates.
>>>
>>> { :a :p+ ?X }
>>>
>>> :a :p :b .
>>> :b :p :c .
>>> :c :p :a .
>>>
>>> :a :p :x .
>>> :x :p :y .
>>> :y :p :a .
>>>
>>> ------
>>> | X |
>>> ======
>>> | :x |
>>> | :y |
>>> | :a |
>>> | :b |
>>> | :c |
>>> | :a |
>>> ------
>>>
>>> As with the comment from Jorge, we have to be careful about use with
>>> aggregates and literals. This is not illustrated by this example
>>> unless literals can be subjects. Ideally, I'd like to not design-out
>>> literals-as-subjects as RDF-MT already allows it, even if the syntaxes
>>> do not.
>>>
>>> Andy
>>>
>>>
>>
>
Received on Tuesday, 1 February 2011 14:55:21 GMT

This archive was generated by hypermail 2.3.1 : Tuesday, 26 March 2013 16:15:45 GMT