Re: Comments about property paths from Sebastián Conca on 2011-07-05 (public-rdf-dawg-comments@w3.org from July 2011)

From: Sebastián Conca <sconca87@gmail.com>
Date: Tue, 5 Jul 2011 12:01:03 -0400
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: public-rdf-dawg-comments@w3.org
Message-ID: <CAOMAg+MXF+KEND5JMb6DmOt_1ghmkVii3M4cywSCjegJA7J1ng@mail.gmail.com>
Dear Andy,

I read the discussion about my last email on the
public-rdf-dawg@w3.org mailing list, and I discovered that you said
that my messages did not contain any proposal. I am sorry about this;
I didn't realize that the group was expecting a proposal from me. I
think that the current design of property paths can be improved, and I
was only looking for some motivation or use cases that support the
current semantics of property paths proposed by the group, so I am
sorry for my critical attitude.

If you want to know my opinion, I think that the best way to solve
this problem is by using an "existential" semantics, that is, a query
like ?x p ?y, where p is a property path, should return the pairs
(a,b) of resources such that there exists a path from a to b whose
sequence of labels is in the regular language accepted by p. In the
previous emails, you told me that this is the same as using the
DISTINCT operator, but the difference is in the complexity, because
counting the number of paths between two nodes is very hard. I think
that if counting is an important feature, then the language could
include an operator to explicitly say that one wants to count the
number of paths from a to b that conform to a particular regular
expression (something like ?x COUNT(p) ?y, where p is a property
path). But my guess is that most of the time (if not all the time),
people would only want to use the faster existential semantics,
instead of counting the number of paths (as is done in XPath, XQuery
and many graph query languages).
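
To make the difference concrete, here is a small Python sketch (not part of any specification or implementation). It assumes the three-triple graph G quoted below, treats :p* and :p? as returning each reachable node once, and models "/" as a join that projects the middle node away. The function names (step, star, optional, seq_bag, seq_existential) are hypothetical and chosen only for illustration.

```python
from collections import deque

# Graph G from the thread, as (subject, predicate, object) triples.
G = [(":a", ":p", ":b"), (":b", ":p", ":c"), (":c", ":p", ":a")]

def step(graph, node, pred):
    """Objects reachable from node by a single edge labelled pred."""
    return [o for (s, p, o) in graph if s == node and p == pred]

def star(graph, start, pred):
    """pred*: every node reachable in zero or more steps, each reported once (BFS)."""
    seen, order, queue = {start}, [start], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in step(graph, node, pred):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

def optional(graph, start, pred):
    """pred?: the start node plus its one-step successors, without duplicates."""
    out = [start]
    for nxt in step(graph, start, pred):
        if nxt not in out:
            out.append(nxt)
    return out

def seq_bag(left, right, graph, start, pred):
    """left/right as a join over the middle node, keeping duplicate end nodes."""
    return [end for mid in left(graph, start, pred)
                for end in right(graph, mid, pred)]

def seq_existential(left, right, graph, start, pred):
    """Same join, but each end node is reported once (existential reading)."""
    return list(dict.fromkeys(seq_bag(left, right, graph, start, pred)))

print(seq_bag(optional, star, G, ":a", ":p"))      # models (:p?)/(:p*)
print(seq_bag(star, star, G, ":a", ":p"))          # models (:p*)/(:p*)
print(seq_existential(star, star, G, ":a", ":p"))  # existential reading
```

Under these assumptions the bag-style join yields 6 and 9 answers for the two sequence expressions, while the existential reading yields the same 3 nodes as :p* alone.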

With best regards,

Sebastián Conca

On 21 June 2011 at 11:32, Andy Seaborne
<andy.seaborne@epimorphics.com> wrote:

> > I still don't understand why the
> > semantic of property paths is defined allowing different results for
> > equivalent regular expressions.
>
> Sebastián,
>
> In the previous reply, we pointed out that equivalences you describe are
> not part of the property path specification. The working group has reviewed
> your comments and has decided that the design in the published documents is
> the most appropriate design.
>
> That design of property paths is a balance and needs to integrate in with
> the overall evaluation of a SPARQL query. I would stress that it is a
> balance - there are competing design dimensions that need to be resolved, in
> particular, fitting in with expectations driven by reading a property path
> expression as a short form of paths that can be written out in full without
> property paths. The working group feels that the expansion approach reduces
> implementation burden, exploiting implementation techniques used in SPARQL
> 1.0, and provides convenient shorter forms for BGPs that could be written
> out in full.
>
>
> > But I couldn't find any use case that motivates the semantics of
> > "Property Paths" proposed in the SPARQL 1.1 draft document. In fact,
> > in the second document there are only 3 simple examples that justify
> > neither the need to define the semantics of "Property Paths" as it is
> > done in the SPARQL 1.1 draft document, nor the need to have repeated
> > mappings on the query result. Is there another document where I can
> > find more detailed examples or use cases?
>
> The "SPARQL New Features and Rationale" document is not a complete design
> for property paths; that is not the purpose of the document. The working
> group created the design through discussions within the working group as
> well as experience from existing systems. This is documented in the email
> archive for the working group, including discussions on test cases,
> especially around the operators for * and +.
>
>
> We would be grateful if you would acknowledge that your comment has been
> answered by sending a reply to this mailing list.
>
> Andy, on behalf of the working group.
>
>
> On 06/06/11 22:16, Sebastián Conca wrote:
>
>> Dear Andy,
>>
>> Thank you again for your reply. I still don't understand why the
>> semantic of property paths is defined allowing different results for
>> equivalent regular expressions. I have been trying to understand for
>> which applications this semantic is needed. So, I looked for the
>> SPARQL 1.1 use cases, and I found the following documents:
>>
>> RDF Data Access Use Cases and Requirements:
>> http://www.w3.org/TR/rdf-dawg-uc/
>> SPARQL New Features and Rationale:
>> http://www.w3.org/TR/sparql-features/
>>
>> But I couldn't find any use case that motivates the semantics of
>> "Property Paths" proposed in the SPARQL 1.1 draft document. In fact,
>> in the second document there are only 3 simple examples that justify
>> neither the need to define the semantics of "Property Paths" as it is
>> done in the SPARQL 1.1 draft document, nor the need to have repeated
>> mappings on the query result. Is there another document where I can
>> find more detailed examples or use cases? Thank you very much.
>>
>> With best regards,
>> Sebastián Conca
>>
>>
>>
>> On 20 May 2011 at 08:22, Andy Seaborne
>> <andy.seaborne@epimorphics.com> wrote:
>>
>>    Sebastián,
>>
>>    The design of property paths is a balance and needs to integrate in
>>    with the overall evaluation of a SPARQL query. The equivalences you
>>    describe are not part of the property path specification.
>>
>>    Consider even a simple pattern like:
>>
>>    { ?x :p/:q ?y }
>>
>>    The working group has decided to make that equivalent to
>>
>>      SELECT ?x ?y { ?x :p ?V . ?V :q ?y }
>>
>>    that is, projecting ?V away but not requiring distinct results.
>>
>>    So to take one of your examples:
>>
>>
>>    SELECT * WHERE { :a (:p*)/(:p*) ?x }
>>
>>    SELECT ?x WHERE { :a (:p*) ?V .
>>                      ?V (:p*) ?x }
>>
>>    and the multiple cardinality of the overall query arises because each :p*
>>    stops following cycles but the second one is acting on all possible
>>    starting points.
>>
>>    The operations for path*, path{0} and path+ provide algebra
>>    operations and these can be combined using "/" and other path
>>    operators. There is no overall per-path expression condition on
>>    cardinality, instead the combination of operations leads to the
>>    cardinality.
>>
>>
>>    We would be grateful if you would acknowledge that your comment has
>>    been answered by sending a reply to this mailing list.
>>
>>    Andy, on behalf of the SPARQL-WG
>>
>>    On 07/04/11 19:42, Sebastián Conca wrote:
>>
>>        Dear Andy,
>>
>>        First, I am sorry for the long delay in answering your email.
>>
>>        Thank you very much for your reply. I read the Editor's Draft
>>        document that you mentioned, and I still have some concerns
>>        about the definition of the semantics of property paths.
>>        Although some of the counterintuitive results described in my
>>        previous mail were solved with the new definition of this
>>        semantics, there are still queries containing equivalent
>>        regular expressions that return different answers, as shown
>>        below. This time, the examples were not tested in ARQ, because
>>        the latest version of ARQ works with the semantics described
>>        in the Working Draft document, so I apologize if there are
>>        incorrect results in some of the examples (to the best of my
>>        understanding they are correct).
>>
>>        Consider the graph G:
>>
>>        :a :p :b
>>        :b :p :c
>>        :c :p :a
>>
>>        and the following query Q1:
>>
>>        SELECT * WHERE { :a (:p*) ?x }
>>
>>        The result of applying Q1 over G is:
>>
>>        ?x= :a, :b, :c
>>
>>        Now consider the query Q2:
>>
>>        SELECT * WHERE { :a (:p?)/(:p*) ?x }
>>
>>        The result of applying Q2 over G is:
>>
>>        ?x= :a, :b, :c, :b, :c, :a
>>
>>        Clearly the property paths used in the above queries are
>>        equivalent (as regular expressions). Notice that the operator *
>>        is not nested in the regular expressions in Q1 and Q2.
>>
>>        Now consider the following query Q3, containing a regular
>>        expression that is equivalent to the expressions in the
>>        previous examples:
>>
>>        SELECT * WHERE { :a (:p*)/(:p*) ?x }
>>
>>        Now, the result of Q3 over G is:
>>
>>        ?x = :a, :b, :c, :b, :c, :a, :c, :a, :b.
>>
>>        I don't know how the user could interpret the above results.
>>
>>        I would really appreciate it if you could tell me your opinion
>>        about these examples.
>>
>>
>>        Thank you very much.
>>        With best regards,
>>        Sebastián Conca
>>
>>        On 25 March 2011 at 09:07, Andy Seaborne
>>        <andy.seaborne@epimorphics.com> wrote:
>>
>>            Sebastián,
>>
>>            The details of the property path expressions are something
>>            the WG has been working on recently, and there are changes
>>            in the specification since the last published working draft
>>            in response to comments from the community and from
>>            discussions with the working group. You can see the
>>            changes in the editors' draft at
>>            http://www.w3.org/2009/sparql/docs/query-1.1/rq25.xml .
>>
>>            The path evaluation has changed recently to bring it in line
>>            with the decisions of the SPARQL-WG as the WG works through
>>            the exact specification of property paths.
>>
>>            For both queries:
>>
>>
>>            SELECT * WHERE { :a (:p+) ?x }
>>            SELECT * WHERE { :a :p/(:p*) ?x }
>>
>>            the results would be:
>>
>>            ------
>>            | x  |
>>            ======
>>            | :b |
>>            | :c |
>>            | :a |
>>            ------
>>
>>            The expression (:p+)+ is not equivalent in terms of
>>            cardinality of the elements, although it should return the
>>            same elements. SELECT DISTINCT may be useful.
>>
>>            We would be grateful if you would acknowledge that your
>>            comment has been answered by sending a reply to this
>>            mailing list.
>>
>>            Andy, on behalf of the SPARQL-WG
>>
>>
>>            On 18/03/11 20:04, Sebastián Conca wrote:
>>
>>                Dear All,
>>
>>                I have been trying some examples of SPARQL 1.1 property
>>                paths, and I have gotten some results that seem to be
>>                counterintuitive. For instance, consider a graph G:
>>
>>                :a :p :b
>>                :b :p :c
>>                :c :p :a
>>
>>                and the following query Q1:
>>
>>                SELECT * WHERE { :a (:p+) ?x }
>>
>>                According to the semantics proposed in the Working Draft
>>                document, the result of the query Q1 over G is:
>>
>>                ?x = :b, :c, :a
>>
>>                Now, consider a query Q2:
>>
>>                SELECT * WHERE { :a :p/(:p*) ?x }
>>
>>                According to the semantics proposed in the Working Draft
>>                document, the result of the query Q2 over G is:
>>
>>                ?x = :b, :c, :a, :b
>>
>>                I tested both queries in ARQ, getting the same results
>>                shown above. The paths used in the queries are
>>                equivalent regular expressions (the regular languages
>>                represented by (:p+) and :p/(:p*) are the same), so the
>>                results of these queries over G should be the same. Am
>>                I missing something?
>>
>>                I also executed in ARQ a third query Q3 containing a
>>                regular expression that is equivalent to (:p+) and
>>                :p/(:p*):
>>
>>                SELECT * WHERE { :a (:p+)+ ?x }
>>
>>                But this time I got the result:
>>
>>                ?x = :b, :c, :a, :b, :c, :a, :b, :c, :a, :b, :c, :a, :b,
>>                :c, :a
>>
>>                What should be the interpretation of this result? I
>>                would really appreciate it if you could let me know
>>                whether I am missing something.
>>                Thank you very much.
>>
>>                With best regards,
>>
>>                Sebastián Conca
>>
>>
>>
>>
Received on Tuesday, 5 July 2011 16:01:43 UTC