Re: SPARQL WG action on property paths from Lee Feigenbaum on 2012-04-06 (www-archive@w3.org from April 2012)

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Fri, 06 Apr 2012 06:35:33 -0400
To: jorge perez <jorge.perez.rojas@gmail.com>
CC: W Martens <martens.wim@gmail.com>, jeen.broekstra@gmail.com, Marcelo Arenas <marcelo.arenas1@gmail.com>, Sebastián Conca <sconca87@gmail.com>, www-archive@w3.org, Axel Polleres <axel@polleres.net>
Message-ID: <4F7EC6F5.3030104@thefigtrees.net>
Jorge,

Can you please let me know if you believe that the current consensus 
design addresses your concerns about the evaluation performance 
challenges of the Last Call design that you raised in your original 
message to the -comments list?

While I understand that there are a variety of considerations that 
relate to the particular design and that you may not be in favor of that 
design, I'm not in a position currently to have that discussion. If the 
group proceeds with this design direction, there will be--as 
always--opportunity to comment on it, and if you disagree with the 
outcome--to object to the specification's advancement towards W3C 
Recommendation. For the purposes of this email, I'm simply trying to 
understand if this design addresses the original concerns with 
performance evaluation of property paths.

thank you,
Lee

On 4/6/2012 1:30 AM, jorge perez wrote:
> On Tue, Apr 3, 2012 at 11:19 AM, Lee Feigenbaum <lee@thefigtrees.net> wrote:
>> Hi Wim, Jorge, Jeen, Marcelo, and Sebastian,
>>
>> (Please note that this is not an official working group response to your
>> respective comments on property paths in the current SPARQL 1.1 Query last
>> call working draft.)
>>
>> I want to thank you all again for your research, experiences, suggestions,
>> and comments on SPARQL 1.1 property paths. They've been very valuable to the
>> working group.
>>
>> The group has spent some time in the past few weeks considering various
>> options in an attempt to address the implementation and evaluation
>> challenges that you have all raised while still respecting our group's
>> schedule, implementers' burdens, and the use cases we've identified for
>> property paths.
>>
>> Today, we reached consensus within the group on an approach that we feel
>> addresses your concerns while still leaving room for implementation
>> experience going forward to inform additional design decisions in the
>> future.
>>
>> We haven't yet worked this design into the query document, which is why this
>> isn't an official WG response to your comments. Yet before we go ahead and
>> publish a new Last Call, we'd like to know if you support this new design
>> and if you believe that it does indeed address your comments.
>>
>> The design is summarized in these two emails by Andy Seaborne:
>>
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2012JanMar/0285.html
>> http://lists.w3.org/Archives/Public/public-rdf-dawg/2012JanMar/0286.html
>>
>> I'd very much appreciate it if you can take a look at this and let me know
>> what you think.
>
> Hi Lee,
>
> I have followed the discussion regarding property paths in detail for
> more than one year, including the two links mentioned above. Regarding
> what I think about this last proposal, I think that it is not a good
> design decision. Having some property-path operators count and others
> not is just not natural. From my point of view, it would be really
> difficult to tell users exactly what is going on with the
> semantics. Thus, I am not OK with this new proposal. Personally, I
> still do not understand the need for counting at all. If I can be
> honest with you, I cannot see any really strong use case for making
> counting a default (and, moreover, Marcelo in his previous email
> showed that all the use cases proposed so far can be more naturally
> expressed with an existential semantics plus ordinary SPARQL
> operators). As far as I can see, having a counting semantics for
> property paths was just "an accident" when the group decided to define
> property paths by translating them into SPARQL 1.0 operators. At that
> time the group did not have enough information to make a clear choice.
>
> On the other hand, and contrary to what I think has been said in
> some discussions, there is a lot of implementation experience in
> different contexts on path queries with existential semantics, and
> also a huge amount of research. There is even implementation
> experience in SPARQL. Please see:
>
> Gleen: http://sig.biostr.washington.edu/projects/ontviews/gleen/
> PSPARQL: http://exmo.inrialpes.fr/software/psparql/
> RPL: http://rpl.pms.ifi.lmu.de/
>
> The three of them implement path queries with existential semantics
> (non counting), and they work great!
>
> In contrast, there is no experience in implementing path queries that
> count, and current implementations of the SPARQL 1.1 spec give different
> results for the same queries (see
> http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2012Feb/0006.html).
> This shows that a counting semantics is difficult to understand even
> for experienced developers. Moreover, this topic is still an open
> research question. Please note that the two papers that we have made
> public to the group are going to be presented at two of the most
> important conferences on the Web (WWW 2012) and on databases
> (PODS 2012), and are only the first efforts in trying to understand
> the issue.
>
> On the positive side, and only if the group insists on the need for
> counting for some property path operators, I personally prefer the
> proposal of DISTINCT/ALL over path expressions (which was also on the
> mailing list), but only if DISTINCT is the default. Please note that
> this kind of design is not really different from some SQL operators.
> Just recall "UNION ALL" in SQL. The rationale is that UNION is
> essentially a "set" operator, and this is the natural way to define
> it. Thus if you want to retain duplicates in a SQL UNION query,
> an additional keyword should be provided. My personal view here is
> that for path queries it should be similar: the natural semantics
> (used for years in graph databases, XML and also in the RDF/SPARQL
> context) is an existential semantics (no duplicates), thus if you want
> to retain duplicates (in whatever form the group decides to count
> duplicates) you should provide an additional keyword such as ALL.
>
> Please let me know if it is OK with you if I forward this response
> together with your message to the public-rdf-dawg-comments list (I
> think we can attract more commenters and opinions to this subject if
> we openly discuss it).
>
> Cheers,
> - jorge
>
>
>>
>> thanks,
>> Lee
>
Received on Friday, 6 April 2012 10:36:08 UTC