From: Andy Seaborne <andy.seaborne@epimorphics.com>

Date: Mon, 27 Feb 2012 22:51:19 +0000

Message-ID: <4F4C08E7.9080809@epimorphics.com>

To: Axel Polleres <axel.polleres@deri.org>

CC: SPARQL Working Group <public-rdf-dawg@w3.org>

Date: Mon, 27 Feb 2012 22:51:19 +0000

Message-ID: <4F4C08E7.9080809@epimorphics.com>

To: Axel Polleres <axel.polleres@deri.org>

CC: SPARQL Working Group <public-rdf-dawg@w3.org>

On 27/02/12 22:16, Axel Polleres wrote: > Hi Andy, > > Two questions for clarification: > >> A minimal change is a pair of new operators {*} and {+} which are the >> analogous to * and +. >> >> {*} and {+} are counting >> * and + are not (a change to 2LC). > > > 1) So, would e.g. :p{*}/:q* mix non-counting and counting? Yes. (as would :p*/DISTINCT(:q*) and what is, for all proposals, DISTINCT(path) can be still lead to apparent duplicates because of projection elsewhere in the graph pattern. > 2) Is this meant on top or alternatively to DISTINCT() ? An alternative to DISTINCT(path) (but doing both at possible.) A practical issue in just DISTINCT() is need define evaluation for each operator, even ones that the transformation would otherwise have handled. I am no convinced simply saying "make distinct" is helpful as people seem to rely on the algorithms in the spec when they really only define correct behaviour. Andy > best, > Axel > > On 27 Feb 2012, at 22:49, Andy Seaborne wrote: > >> 1/ There are two use cases for arbitrary length paths >> UC1: Finding or testing for connectivity of nodes: e.g. rdfs:subClassOf* >> UC2: Coping with variable structures and aggregation >> >> >> 2/ Property paths can be expensive, notably in UC2. >> >> 3/ We want to leave open the possibility of path lengths >> >> (foaf:knows* is not a clear cut example - your LinkedIn graph is not >> simple connectivity) >> >> 4/ PP transforms are used to define property path evaluation >> leaving only ZeroLengthPath, ZeroOrMorePath, OneOrMorePath, and >> NegatedPropertySet to be defined in the evaluation semantics. >> >> ----- >> >> DISTINCT is one way forward. >> >> ?x DISTINCT(path) ?y . >> >> is the pairs of (?x,?y) connected by path. >> >> ?x DISTINCT(path*) ?y is connectivity. >> ?x path* ?y is counting as per current spec. >> >> ----- >> >> A minimal change is a pair of new operators {*} and {+} which are the >> analogous to * and +. >> >> {*} and {+} are counting >> * and + are not (a change to 2LC). >> >> (This is based on guessing {?length} is a future path length syntax - >> see also discussion of {N,M} >> >> This is less than ?x DISTINCT(path) ?y >> It can be "as well as". >> >> Roughly: the distinct * and + are achieved by changing ALP to remove the >> last line "remove x from V". We need to check the edge cases still. >> >> It's this state management operation that controls whether the test for >> termination is along the current path (counting) or whether the function >> has visited the node at all. >> >> ----- >> For {*}: >> >> {N,M} is untouched. >> {0,*} is defined to be equivalent to {*} (i.e. still counting) >> >> {N,M} can be very expensive (think {0,N} in a clique of N nodes) but it >> is syntax for ways to write the pattern as it would be in SPARQL 1.0 >> >> ?X:p{1,M} ?Y is >> >> ?X :p ?Y . >> UNION >> ?x :p ?a1 . ?a1 :p Y >> UNION >> ?x :p ?a1 . ?a1 :p ?a2 . ?a2 :p ?Y >> ... >> ?x :p ?a1 . ?a1 :p ?a2 . ?a(M-1) :p ?Y >> >> Andy >> >> >Received on Monday, 27 February 2012 22:51:44 GMT

*
This archive was generated by hypermail 2.3.1
: Tuesday, 26 March 2013 16:15:47 GMT
*