
Re: [TF-PP] Document now in CVS

From: Ivan Herman <ivan@w3.org>
Date: Tue, 24 Nov 2009 12:21:15 +0100
Message-ID: <4B0BC1AB.40907@w3.org>
To: Andy Seaborne <andy.seaborne@talis.com>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>


Andy Seaborne wrote:
> Maybe it is - that's one of the things that needs working out.  I hope
> I'm wrong but it looks to me like there is nothing different about this
> subcase but I haven't had the time to investigate in depth (hint, hint).
> 
> For example, what happens about multiple identical matches?  A BGP
> doesn't generate duplicates - if
> 
>   rdf:rest*/rdf:first
> 
> and
> 
>   rdf:rest*{?len}/rdf:first
> 
> are the same, just the first is a projection of the second to remove
> ?len, then there are duplicates.  Do we care?
> 
> I don't have any experience with a matcher that does not terminate early
> when looking for, say:
> 
>     ?list rdf:rest*/rdf:first "SomeFixedItem" .
> 
> but "SomeFixedItem" may be in the list twice.
> 
>> I am not too much worried about the multiple paths case: we can simply
>> declare that this case is undetermined, or we can say that it is
>> minimum/maximum of the paths. I believe for the use cases when there is
>> a need for the length (like the authors' list I referred to) the path is
>> unique.
> 
> What are the technical characteristics for this special case?  What is
> the defining feature that makes it a special situation and not an
> application of the general one? If we can articulate that, we may be
> able to define a solution for it.

A possible digression to justify what I write below.

If I have

SELECT ?a ?b
WHERE {
  { ?a ex:q ?b }
  UNION
  { ?a ex:p ?b }
}

with the data

:x ex:q :y.
:x ex:p :y.

then I will receive the pair :x, :y twice (unless I use DISTINCT),
right?
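
A minimal sketch of that duplication, simulating the UNION in Python
rather than running a real SPARQL engine. The helper names (match,
union) are made up for illustration; only the two triples come from the
example above.

```python
# Data from the example above, as (subject, predicate, object) triples.
DATA = [
    (":x", "ex:q", ":y"),
    (":x", "ex:p", ":y"),
]


def match(predicate, data):
    """Solutions for the pattern `?a <predicate> ?b`, as (a, b) pairs."""
    return [(s, o) for (s, p, o) in data if p == predicate]


def union(left, right):
    """UNION keeps every solution of both branches (a bag, not a set)."""
    return left + right


solutions = union(match("ex:q", DATA), match("ex:p", DATA))

# DISTINCT modelled as order-preserving de-duplication.
distinct = list(dict.fromkeys(solutions))

print(solutions)
print(distinct)
```

Each branch contributes its own copy of the pair, and only the DISTINCT
step collapses them.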

By analogy, it would not shock me if we had similar duplicates when
there are multiple possible paths from ?a to ?b with different lengths
and the length is explicitly asked for. So talking about minimal or
maximal length is unnecessary. If there are two possible ways to get
from ?a to ?b and ?len is asked for, then we will have two solutions
that differ only in the value assigned to ?len. If no such value is
requested, then we get only the match of ?a ?b.

I have the feeling that looking at it this way simplifies things. So we
can have

?a path*(?len) ?b.
?a path+(?len) ?b.
?a path{n,m}(?len) ?b.

where ?len is assigned the length of the path in case of a match; if
there are different ways of getting from ?a to ?b and the query
explicitly asks for the length, then there are several solutions. (I
have changed the syntax to use '()' instead of '{}' to avoid a clash
with the {n,m} case.)
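
A rough sketch of how this reading could behave, written in Python
because the '()' syntax is only a proposal. The graph, the node names
and both helper functions are hypothetical. Two further assumptions are
mine and not part of the proposal: a path never revisits a node (so the
search terminates), and paths of equal length count as one solution.

```python
# Hypothetical data: two routes from :a to :b along one predicate,
# a direct edge (length 1) and a detour through :m (length 2).
EDGES = {
    ":a": [":b", ":m"],
    ":m": [":b"],
    ":b": [],
}


def path_lengths(start, edges):
    """Yield (end_node, length) for every simple path leaving start.

    The zero-length path (start, 0) is included, as for `path*`.
    """
    stack = [(start, 0, (start,))]
    while stack:
        node, length, visited = stack.pop()
        yield node, length
        for nxt in edges.get(node, []):
            if nxt not in visited:  # assumption: no node is revisited
                stack.append((nxt, length + 1, visited + (nxt,)))


def solutions_with_len(start, end, edges):
    """Length variable present: one solution per distinct length."""
    return sorted({n for node, n in path_lengths(start, edges) if node == end})


def solutions_without_len(start, end, edges):
    """No length variable: the pair either matches or it does not."""
    return any(node == end for node, _ in path_lengths(start, edges))


print(solutions_with_len(":a", ":b", EDGES))
print(solutions_without_len(":a", ":b", EDGES))
```

With the length requested, the pair (:a, :b) appears once per length;
without it, the same pair is a single match.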

You also had a slightly pathological case (if my understanding is
correct) on

WHERE {
  ?a path* ?b .
  ?a path*(?len) ?b
}

I am not 100% sure of what to do there; my instinct would be that the
match with ?a and ?b without length qualifier is simply treated as a
separate solution (with ?len not assigned)...
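
For comparison only, here is a sketch of what the ordinary join of
compatible solutions (the way two patterns in one group are normally
combined) would give for this query. The solution values are
hypothetical. Under that reading no solution with ?len unassigned
survives, so the "separate solution" behaviour would need to be defined
explicitly.

```python
def compatible(m1, m2):
    """Two solutions are compatible if shared variables agree."""
    return all(m2[k] == v for k, v in m1.items() if k in m2)


def join(left, right):
    """Merge every compatible pair of solutions."""
    return [{**m1, **m2} for m1 in left for m2 in right if compatible(m1, m2)]


# Hypothetical results of the two patterns taken separately.
plain = [{"a": ":x", "b": ":y"}]                 # ?a path* ?b
with_len = [
    {"a": ":x", "b": ":y", "len": 1},            # ?a path*(?len) ?b
    {"a": ":x", "b": ":y", "len": 2},
]

joined = join(plain, with_len)
print(joined)
```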

Does this help?

ivan

-- 

Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf

Received on Tuesday, 24 November 2009 11:21:44 GMT
