- From: Matthew Perry <matthew.perry@oracle.com>
- Date: Tue, 01 Feb 2011 09:54:08 -0500
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- CC: W3C SPARQL Working Group <public-rdf-dawg@w3.org>
Hi Andy, Thanks for the explanation. I would strongly prefer to modify the algorithm to use node marking instead of edge marking. I'll have to think about the other issue you brought up. - Matt On 2/1/2011 7:00 AM, Andy Seaborne wrote: > > > On 31/01/11 15:50, Matthew Perry wrote: >> Hi Andy, >> >> I follow your examples and agree with the results. Here is an excerpt >> from pp16 that I don't understand. >> >> Data: >> :d foaf:knows :e . >> :e foaf:knows :f . >> :f foaf:knows :e . >> :f foaf:name "test" . >> >> query pattern: >> { ?x foaf:knows+ ?y } >> >> Result Fragment: >> ?x=:d, ?y=:d >> ?x=:d, ?y=:e >> *?x=:d, ?y=:e* >> ?x=:d, ?y=:f >> >> I don't understand where the second ?x=:d, ?y=:e comes from. I assume >> it's from the path >> :d -- foaf:knows -> :e -- foaf:knows -> :f -- foaf:knows -> :e >> but it seems like this path should be excluded because of a cycle. > > > > > As it is currently defined, we get the "?x=:d, ?y=:e" because it records triples traversed (as per Bite's suggestion in the telecon, we don't have good minutes though). So, yes, it gets "?x=:d, ?y=:e" from :d--foaf:knows-->:e then one from the cycle. > > This is related to the two-loop example I gave, which I think does need duplicates - here it's no cycles and one cycle. > > I'm not sure whether this will work but if the definitional algorithm records points-along-a-path then it can check for duplicates such as this one, but not exclude the 2-cycle example. Needs checking - it depends on whether the path starts inside the cycle or enters it (as in your example) as part of processing. > > Also, there are some potential intuitive invariants we may wish to consider in refining the spec: > > We already have: > > { ?x foaf:knows* ?y } > == > { ?x foaf:knows{0} ?y } UNION {?x foaf:knows+ ?y } > > but what about: > > { ?x foaf:knows+ ?y } > =?= > { ?x foaf:knows/foaf:knows* ?y } > > Andy > >> >> Thanks, >> Matt >> >> On 1/29/2011 11:59 AM, Andy Seaborne wrote: >>> >>> >>> On 25/01/11 17:56, Matthew Perry wrote: >>>> Hi Andy, >>>> >>>> Thanks for the explanation. I completely understand and agree with the >>>> cycles in fixed-length property paths. >>>> >>>> My concern is with the arbitrary-length paths. I would strongly prefer >>>> to have a purely node-marking algorithm instead of triple marking. The >>>> pure node marking seems to better match the resolution we passed back in >>>> June: http://www.w3.org/2009/sparql/meeting/2010-06-01#resolution_5. >>>> (The two solutions with cycles both repeat nodes). >>>> >>>> PROPOSED: The cardinality of solutions to fixed-length paths >>>> is the same as the cardinality of solutions to the path expanded into >>>> triple patterns (with all variables projected);*the cardinality of >>>> solutions to variable-length paths is the cardinality of solutions >>>> via paths that do not repeat nodes;* the cardinality of solutions to >>>> paths combining fixed and variable length (elt{n,} ) is a combination >>>> of the fixed definition plus the variable definition for paths longer >>>> than the fixed length. >>>> >>>> Thanks, >>>> Matt >>> >>> Matt, >>> >>> :p* is defined to be :p{0} UNION :p+ >>> >>> The duplicate comes from the two parts >>> >>> { :a :p* ?x } >>> = >>> { :a :p{0} ?x } UNION { :a :p+ ?x } >>> >>> Data: >>> :a :p :b . >>> :b :p :c . >>> :c :p :a . >>> >>> >>> So the composition of patterns means that >>> >>> { :a :p{0} ?x } => ?x=:a >>> { :a :p+ ?x } => ?x = :b, :c, :a >>> >>> ?x=:a occurs because of the two parts. I don't think it directly >>> violates the part you quote. The algorithm is to put the endpoint of >>> triple in the results, as the edge is traversed. >>> >>> This could be changed in the spec: define an operator specifically for >>> :p*, not making it the multi-set-union of other forms. >>> >>> An effect of this is that same results come from: >>> >>> :a :p :b . >>> :b :p :c . >>> >>> i.e. cycle and a non-cycle with removal of a triple for the last hop, >>> are the same. >>> >>> I have no strong opinion either way - I'd like to see some more use >>> cases. >>> >>> - - - - - - - - >>> >>> A separate issue is when there are multiple paths/loops in the graph. >>> If there are two loops, then :p+ leads to duplicates. >>> >>> { :a :p+ ?X } >>> >>> :a :p :b . >>> :b :p :c . >>> :c :p :a . >>> >>> :a :p :x . >>> :x :p :y . >>> :y :p :a . >>> >>> ------ >>> | X | >>> ====== >>> | :x | >>> | :y | >>> | :a | >>> | :b | >>> | :c | >>> | :a | >>> ------ >>> >>> As with the comment from Jorge, we have to be careful about use with >>> aggregates and literals. This is not illustrated by this example >>> unless literals can be subjects. Ideally, I'd like to not design-out >>> literals-as-subjects as RDF-MT already allows it, even if the syntaxes >>> do not. >>> >>> Andy >>> >>> >> >
Received on Tuesday, 1 February 2011 14:55:21 UTC