
Re: [TF-PP] Document now in CVS

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Tue, 24 Nov 2009 09:24:00 -0500
Message-ID: <4B0BEC80.1010505@thefigtrees.net>
To: Ivan Herman <ivan@w3.org>
CC: Andy Seaborne <andy.seaborne@talis.com>, SPARQL Working Group <public-rdf-dawg@w3.org>
Ivan Herman wrote:
> 
> Andy Seaborne wrote:
>> Maybe it is - that's one of the things that needs working out.  I hope
>> I'm wrong but it looks to me like there is nothing different about this
>> subcase but I haven't had the time to investigate in depth (hint, hint).
>>
>> For example, what happens about multiple identical matches?  A BGP
>> doesn't generate duplicates - if
>>
>>   rdf:rest*/rdf:first
>>
>> and
>>
>>   rdf:rest*{?len}/rdf:first
>>
>> are the same, except that the first is a projection of the second that
>> removes ?len, then there are duplicates.  Do we care?
>>
>> I don't have any experience with a matcher that does not terminate early
>> when looking for, say:
>>
>>     ?list rdf:rest*/rdf:first "SomeFixedItem" .
>>
>> but "SomeFixedItem" may be in the list twice.
>>
>>> I am not too much worried about the multiple paths case: we can simply
>>> declare that this case is undetermined, or we can say that it is
>>> minimum/maximum of the paths. I believe for the use cases when there is
>>> a need for the length (like the authors' list I referred to) the path is
>>> unique.
>> What are the technical characteristics for this special case?  What is
>> the defining feature that makes it a special situation and not an
>> application of the general one? If we can articulate that, we may be
>> able to define a solution for it.
> 
> A possible digression to justify what I write below.
> 
> If I have
> 
> SELECT ?a ?b
> WHERE {
>   { ?a ex:q ?b }
>   UNION
>   { ?a ex:p ?b }
> }
> 
> with the data
> 
> :x ex:q :y.
> :x ex:p :y.
> 
> then I will receive the pair :x and :y twice (unless I use DISTINCT),
> right?
> 
> This means that, by analogy, it would not shock me if we had
> similar duplications if there are multiple possible paths from ?a to ?b
> with different lengths, and that length is explicitly asked for... So
> talking about minimal or maximal length is unnecessary. If we have two
> possible ways to get from ?a to ?b and ?len is asked for, then we will
> have two possible solutions that differ in the value assigned to ?len.
> If no such value is required, then we get only the match of ?a ?b.

Actually, I think this complexity (returning all paths between two 
nodes) is exactly why much of the group is reluctant to pursue variable 
paths and path lengths. At least, it's definitely why I don't want to 
pursue that at this point.

Lee
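
To make the concern concrete, here is a minimal Python sketch (not part of
the original thread) of the reading discussed in the quoted text, where every
distinct path from one node to another contributes its own solution for the
length. The function name path_length_solutions, the edge-pair encoding of the
graph, and the restriction to simple paths (no node visited twice, so that
cycles cannot produce infinitely many answers) are all assumptions made for
illustration; none of them comes from a SPARQL draft.

```python
from collections import defaultdict


def path_length_solutions(edges, start, end):
    """Yield one length per simple path from start to end.

    Duplicates are kept on purpose: two different paths of the same
    length yield the same number twice, mirroring bag semantics.
    Assumption: only simple paths are counted, which guarantees
    termination on cyclic graphs.
    """
    graph = defaultdict(list)
    for subject, obj in edges:
        graph[subject].append(obj)

    def walk(node, visited, length):
        if node == end:
            yield length
            return
        for nxt in graph[node]:
            if nxt not in visited:
                yield from walk(nxt, visited | {nxt}, length + 1)

    yield from walk(start, {start}, 0)


# Two routes of different lengths: x -> y directly, and x -> m -> y.
two_routes = [("x", "m"), ("m", "y"), ("x", "y")]
different_lengths = sorted(path_length_solutions(two_routes, "x", "y"))

# A diamond: two routes of the same length, so the length repeats.
diamond = [("x", "p"), ("x", "q"), ("p", "y"), ("q", "y")]
same_lengths = list(path_length_solutions(diamond, "x", "y"))
```

On the first graph the pair (x, y) comes back with two different lengths; on
the diamond it comes back twice with the same length. Chaining k such diamonds
gives 2^k simple paths between the end nodes, which is one way to see how
enumerating every path can become expensive.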

> I have the feeling looking at it this way simplifies things. So we can have
> 
> ?a path*(?len) ?b.
> ?a path+(?len) ?b.
> ?a path{n,m}(?len) ?b.
> 
> where ?len is assigned the length of the path in case of match; if there
> are different ways of getting from ?a to ?b and there is an explicit
> length in the query request, then there are several solutions. (I have
> changed the syntax to use '()' instead of '{}' to avoid a clash with the
> {n,m} case)
> 
> You also had a slightly pathological case (if my understanding is
> correct) on
> 
> WHERE {
>   ?a path* ?b
>   ?a path*(?len) ?b
> }
> 
> I am not 100% sure of what to do there; my instinct would be that the
> match with ?a and ?b without length qualifier is simply treated as a
> separate solution (with ?len not assigned)...
> 
> Does this help?
> 
> ivan
> 
Received on Tuesday, 24 November 2009 14:24:33 GMT
