Re: Comments about the semantics of property paths

Hello Andy,

Thank you very much for your response and for considering my comments,
and sorry for the late reply.

There is a couple of comments that you have not answered.

""
As a separate but very important issue, notice that the XPath language
does not consider duplicate paths when evaluating expressions (XPath
is evaluated in the "there exists" way that I mentioned before). Thus,
counting paths in SPARQL would be somewhat in contradiction with
previously proposed path languages considered by the W3C.
""

I think that if this W3C Recommendation is in discordance with a
previous Recommendation about a similar topic, then DAWG should have
strong reasons for that, and make them clear in the specification. The
specification should also advice the reader about this issue.

Besides that comment, you have said nothing about efficiency of
evaluation. Notice that this not related to a particular way of
implementing the language. It is about the huge efficiency impact that
any implementation will suffer in practice. You have not acknowledge
that in your response. Have you consider this as an issue?

Another comment that is not covered by your response is whether there
exists a use case that demand counting different paths. In your
response, it seems that the reason for counting paths is to make
easier the job of the implementors (by reusing algebra operators).
Opposite to what the group think, I think that not counting paths
gives the implementor more freedom since paths could be implemented in
several different ways, being just one of them by reusing algebra
operators. Can you please clarify whether there are use cases about
this? This would help a lot.

If you respond to the comments above I can consider my comments answered.

I have a couple of additional words. Please do not consider them as a
formal objection to the process, but just as my opinion.

I still strongly disagree with your design decisions about property
paths. In particular, I insist that it is a mistake to define the
semantics in the presence of cycles in a non-standard way and by
forcing a particular algorithm to evaluate them. In your response you
say that there can be corner cases, but it is not only a problem of
corner cases. From my point of view it will become a problem of
adoption of the standard. In this point I think that the group should
not neglect that there is a lot of related (theoretical and practical)
work in this area that have handled cycles in a completely different
way.

To conclude, I do think that the property-paths material in the
current specification is far from being mature. Considering that the
group is in a tight schedule, I think that it would be better to not
include property paths in this round of standardization, than
including them in their current form.

Thank you very much for considering my comments.
- jorge

On Thu, Dec 2, 2010 at 8:20 AM, Andy Seaborne
<andy.seaborne@epimorphics.com> wrote:
> Jorge,
>
> Thank very much for your comments.
>
> The working group considered a number of factors in designing the property
> path features. In addition to the points you raise, the WG also included
> consideration that, while this working group is not adding a path datatype
> (needed to inquire about any path matched later in the query), nor the
> specific case of access to path length, the WG should leave open as many
> possibilities here for future work. Another factor in the design is the
> relationship of some property path expressions to triple pattern forms.
>
> Although not specifying returning the path length of a match, nor specifying
> returning the matched path itself, the WG felt that, on balance, the design
> in the working draft gave maximum scope for any later standardization work.
> The issue of path length particularly was considered as a feature for this
> round of work but, when considered against all the other work items the WG
> has taken on, it didn't make the final list of work items. This lead to the
> conclusion that counting path possibilities, not a "there exists" condition,
> was the better choice for this round of standardization. Adding access the
> the path matched is better served if all paths are considered.
>
> Another consideration was the relationship of property paths and existing
> queries using triple patterns.
>
> { ?x :p{2} ?y }
>
> and
>
> { ?x :p ?Z . ?Z :p ?y }, with ?Z projected away.
>
> The WG decided to make these equivalent, including in terms of numbers of
> solutions. This gives the semantics of many path forms in terms of SPARQL
> graph pattern operators. This was felt to be intuitive and to utilize the
> capabilities of query engines: rather that requiring yet another mechanism,
> the equivalence means that join-technology (for example) can be used to
> solve the pattern.
>
> This then leaves the issue of cycles in the "+" operator. The design is one
> in which the cycles in "+" operator are handled by traversing a directed
> edge (triple in the data) once. This will be explained in the final version
> of the query specification - there is a placeholder for it in the current
> editors working draft. The current working draft has been clarified to use
> "multiset-union" for the union in the ArbitraryLengthPath definition.
>
> This overall design is a tradeoff of implementation, future possibilities,
> and equivalence of patterns on graphs. The WG is aware that there can be
> corner cases can arise where different intuitions are not compatible. On
> balance, the WG feels that the current design is most suitable for this
> round of standardization.
>
> Again, that you for your helpful comments.
>
> We would be grateful if you would acknowledge that your comment has been
> answered by sending a reply to this mailing list.
>
> Andy
> on behalf of the SPARQL Working Group.
>

Received on Wednesday, 15 December 2010 18:35:36 UTC