Re: Comments on SPARQL WD from Seaborne, Andy on 2007-04-07 (public-rdf-dawg-comments@w3.org from April 2007)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Sat, 07 Apr 2007 18:36:13 +0100
To: Olivier.Corby@sophia.inria.fr
Cc: public-rdf-dawg-comments@w3.org
Message-ID: <4617D68D.4010608@hp.com>

Olivier.Corby@sophia.inria.fr wrote:
> Hi,
> 
> Here are some comments on current SPARQL WD.
> 
> Best regards,
> 
> Olivier
> 

Olivier - many thanks for the comprehensive review.

Responses, and the corrections made, are noted inline.

The changes have been made to the editors' working draft at:
http://www.w3.org/2001/sw/DataAccess/rq23/rq25.html v1.80

One comment remains open (11.4.10 RDFterm-equal, about the use of xsd:dateTime
in the example).  I'll defer to Eric on that one.

> 2.3.2 Matching Numeric Types
> 
> 
> SELECT ?v WHERE { ?v ?p 42 }
> v
> <http://example.org/ns#x>
> 
> should be:
> 
> <http://example.org/ns#y>
> 

Corrected

> 
> 
> 2.3.3 Matching Arbitrary Datatypes
> 
> The following query has a solution with variable v bound to :y.
> 
> should be: bound to :z.
> 
> 
> SELECT ?v WHERE { ?v ?p
> "abc"^^<http://example.org/datatype#specialDatatype> }
> v
> <http://example.org/ns#y>
> 
> should be :
> 
> <http://example.org/ns#z>
> 
> 

Corrected (both occurrences)

> 
> 
> 2.5 Building RDF Graphs
> 
> The SELECT query form returns tabular information.
> 
> Could be:
> 
> The SELECT query form returns variable bindings.
> 
> [[Because we may exploit the results through an API and hence there may be
> no tabular form]].

Corrected

> 
> 3.3 Other Term Constraints
> 
> 
> xsd:strings, xsd:booleans and xsd:dateTimes
> 
> [[It look strange to add an 's' to the xsd datatype names]]

Reworded as "SPARQL supports types xsd:string, xsd:boolean and xsd:dateTime"

> 
> 
> 
> 4.1.1 Syntax for IRIs
> 
> behavoir -> behavior
Done

> 
> prefixedname -> prefixed name
> 
Done

> 4.2.3 RDF Collections
> 
> [[Suggestion : the (1 ?x 3 4) notation for rdf list could not generate the
> _:b rdf:rest rdf:nil triple; in such a way we could match lists that may
> have more elements.]]

The syntax of SPARQL triple patterns, and in particular of the syntactic
sugar forms given in section 4.2 (
http://www.w3.org/TR/rdf-sparql-query/#QSynTriples ), is designed to be
consistent with the syntax of Turtle (
http://www.dajobe.org/2004/01/turtle/ ). In Turtle, the (1 2 3) syntax
specifies a closed collection (i.e. one terminated by an rdf:nil triple),
and SPARQL adopts this meaning for this bit of syntax.

You can match part of a collection by writing out its component
triples explicitly:

     _:b0  rdf:first  1 ;
           rdf:rest   _:b1 .
     _:b1  rdf:first  ?x ;
           rdf:rest   _:b2 .
     _:b2  rdf:first  3 ;
           rdf:rest   _:b3 .
     _:b3  rdf:first  4 .
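
To make the difference between the closed and the open form concrete, here
is a small Python sketch. It is only an illustration: the helper name
list_triples and the tuple representation of triples are assumptions made
for this example, not anything defined by SPARQL or Turtle.

```python
def list_triples(items, closed=True):
    """Expand items into (subject, predicate, object) triples.

    closed=True mimics the (1 ?x 3 4) syntax, which ends in rdf:nil.
    closed=False writes out only the list cells, as in the pattern
    above, so the last cell has no rdf:rest and longer lists can match.
    """
    triples = []
    for i, item in enumerate(items):
        cell = f"_:b{i}"
        triples.append((cell, "rdf:first", item))
        last = i == len(items) - 1
        if not last:
            triples.append((cell, "rdf:rest", f"_:b{i + 1}"))
        elif closed:
            triples.append((cell, "rdf:rest", "rdf:nil"))
    return triples


open_pattern = list_triples([1, "?x", 3, 4], closed=False)
closed_pattern = list_triples([1, "?x", 3, 4], closed=True)
```

The two results differ only in the final (_:b3, rdf:rest, rdf:nil) triple,
which is exactly the triple that closes the collection.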

> 
> 
> 5.2.2 Scope of Filters
> 
> A constraint, expressed by the keyword FILTER, is a restriction on
> solutions over the whole group in which the filter appears.
> 
> [[What happens if a variable in the filter is not bound in the current
> group, but is bound in another group  after current group? Does the filter
> fail? ]]

As per the text you've cited and the algebra given in Section 12, filters
are evaluated against the solution mappings of the group in which the
filter appears. If a variable in the filter is unbound in a solution
mapping, then as per Section 11.2 any function or operator (except
bound()) applied to that variable produces a type error. If the error
propagates to the top of the filter expression, the solution is removed
from the multiset of solution mappings.
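
A rough Python sketch of that behaviour follows. It assumes solution
mappings are plain dicts; the names SparqlTypeError, less_than and
apply_filter are hypothetical and exist only for this illustration.

```python
class SparqlTypeError(Exception):
    """Stand-in for a SPARQL type error (hypothetical name)."""


def less_than(solution, var, limit):
    # Any operator other than bound() on an unbound variable is an error.
    if var not in solution:
        raise SparqlTypeError(f"{var} is unbound")
    return solution[var] < limit


def apply_filter(solutions, test):
    kept = []
    for mu in solutions:
        try:
            if test(mu):
                kept.append(mu)
        except SparqlTypeError:
            # The error reached the top of the filter: drop the solution.
            pass
    return kept


solutions = [{"?s": ":a", "?v1": 2}, {"?s": ":b", "?v1": 5}, {"?s": ":c"}]
result = apply_filter(solutions, lambda mu: less_than(mu, "?v1", 3))
```

Here the third solution has no binding for ?v1, so it is removed in the
same way as the second one, whose test is simply false.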

> 
> 
> 7 Matching Alternatives
> 
> The UNION pattern combines graph patterns; each alternative possibility
> can contain more one triple pattern:
> 
> -> more than one triple pattern:

Done

> 
> 9.1 ORDER BY
> 
> Using ORDER BY on a solution sequence for a CONSTRUCT or DESCRIBE query
> has no direct effect because only SELECT returns a sequence of results.
> 
> [[
> This is in contradiction with :
> 
> 10.2.3 Solution Modifiers and CONSTRUCT
> The solution modifiers of a query affect the results of a CONSTRUCT query.
> In this example, the output graph from the CONSTRUCT template is formed
> from just two of the solutions from graph pattern matching. The query
> outputs a graph with the names of the people with the top two sites, rated
> by hits.
> ]]

The next sentence in 9.1 follows on with the (indirect) effect of ORDER BY - 
it has an observable effect only when LIMIT and OFFSET slice the solution 
sequence.
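
As an informal illustration in Python (the data, the ex: names and the
construct helper are all made up for this sketch), ordering alone does not
change the constructed graph, but ordering followed by a slice does:

```python
# Made-up solutions for illustration: one row per person and site.
solutions = [
    {"site": "ex:site1", "name": "Alice", "hits": 2349},
    {"site": "ex:site2", "name": "Bob", "hits": 105},
    {"site": "ex:site3", "name": "Eve", "hits": 181},
]


def construct(solution_seq):
    # A graph is a set of triples, so sequence order is not preserved.
    return {(s["site"], "ex:owner", s["name"]) for s in solution_seq}


ordered = sorted(solutions, key=lambda s: s["hits"], reverse=True)

# No slicing: ORDER BY has no observable effect on the graph.
same_graph = construct(solutions) == construct(ordered)

# LIMIT 2: the order now decides which solutions reach the template.
top_two = construct(ordered[:2])
first_two_unordered = construct(solutions[:2])
```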

"""
Used in combination with LIMIT and OFFSET, ORDER BY can be used to return 
results generated from a different slice of the solution sequence.
"""

> 
> 9.3 DISTINCT
> 
> What is the solution of :
> 
> select distinct ?y where {
> 	?x c:friend ?y
> }
> 
> on this graph :
> 
> ex:Jules c:friend ex:Jimmy
> ex:Jules c:friend ex:James
> ex:Jimmy owl:sameAs ex:James

SPARQL is defined for simple entailment, with no notion of the semantics
of owl:sameAs. The result of this query is:

   ?y
--------
ex:Jimmy
ex:James

The conditions for extending SPARQL to other entailment regimes are given in:
http://www.w3.org/TR/rdf-sparql-query/#sparqlBGPExtend
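
A minimal Python sketch of why both rows survive, assuming the graph is a
list of string triples (a simplification for illustration only): DISTINCT
compares the bound terms themselves, and the owl:sameAs triple is just
another triple in the data.

```python
graph = [
    ("ex:Jules", "c:friend", "ex:Jimmy"),
    ("ex:Jules", "c:friend", "ex:James"),
    ("ex:Jimmy", "owl:sameAs", "ex:James"),
]

# Match ?x c:friend ?y and project ?y.
ys = [o for (s, p, o) in graph if p == "c:friend"]

# DISTINCT removes identical terms only; no owl:sameAs reasoning is applied.
distinct_ys = list(dict.fromkeys(ys))
```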

> 
> 11.3 Operator Mapping
> 
> xs:integer, xs:decimal, xs:float, xs:double
> 
> should be :
> 
> xsd:integer, xsd:decimal, xsd:float, xsd:double

Corrected - and checked there are no other "xs:" anywhere in the document

> 
> operator =
> 
> [[
> I think there should be an = operator for plain literals because
> otherwise, RDFterm-equal applies, in which case :
> 
> 11.4.10 RDFterm-equal
> 
> RDFterm-equal produces a type error if the arguments are both literal but
> are not the same RDF term.
> 
> 
> So, following the current draft :
> 
> 'Jules'@en = 'Jim'@en
> 
> produces a type error because they are not the same RDF term.
> 
> It is the same with != < <= > >=
> ]]

By design, SPARQL does not define semantics for comparing plain literals
with language tags, as you've noted. Note that query writers can compare
literals themselves:

   langMatches(lang(?a), lang(?b)) &&
   langMatches(lang(?b), lang(?a)) &&
   (str(?a) = str(?b))
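
The same test can be sketched in Python. This is an approximation under
stated assumptions: literals are modelled as (text, language tag) pairs,
lang_matches is a simplified version of basic language-range filtering
(RFC 4647), and both function names are hypothetical.

```python
def lang_matches(tag, lang_range):
    # Simplified basic filtering: case-insensitive; the range matches an
    # equal tag, or a tag that continues with "-" after the range.
    tag, lang_range = tag.lower(), lang_range.lower()
    if lang_range == "*":
        return tag != ""
    return tag == lang_range or tag.startswith(lang_range + "-")


def same_tagged_literal(a, b):
    (text_a, lang_a), (text_b, lang_b) = a, b
    # Matching in both directions forces the two tags to be equivalent.
    return (lang_matches(lang_a, lang_b)
            and lang_matches(lang_b, lang_a)
            and text_a == text_b)
```

With this formulation, comparing 'Jules'@en with 'Jim'@en yields false
rather than an error.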

The Working Group has not been motivated in the past to define the
comparison operators on plain literals with language tags.

Section 11.3.1 "Operator Extensibility" licenses SPARQL language extensions 
that add rows to the operator table, such that implementations may support 
comparing plain literals with language tags.

I hope that this addresses this concern; please let us know if not, and
the Working Group will weigh changes to the operator table versus the
schedule risks of publishing a new Last Call draft.

> 
> SPARQL Tests, defined in section 11.4
> 
> [[regex could operate on xsd:string as well as simple literal.]]

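Similarly, an implementation can extend regex to operate on xsd:string
values: applying regex to an xsd:string is currently a type error, so the
same extensibility mechanism allows that case to be given a defined result.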

> 
> 11.4.7 datatype
> 
> if the the parameter -> if the parameter

Done

> 11.4.10 RDFterm-equal
> 
> 
>     * term1 and term2 are equivalent IRIs as defined in 6.4 RDF URI
> References.
>     * term1 and term2 are equivalent literals as defined in 6.5.1 Literal
> Equality.
>     * term1 and term2 are the same blank node as described in 6.6 Blank
> Nodes.
> 
> 
> 
> [[There are no such 6.4, 6.5.1 and 6.6 sections in the current document,
> so the wording looks strange (in particular when you print the document)]]

Agreed - I've added: "of [CONCEPTS]" to stress the remote link:
e.g.

  * term1 and term2 are equivalent IRIs as defined
    in 6.4 RDF URI References of [CONCEPTS].
  * term1 and term2 are equivalent literals as defined
    in 6.5.1 Literal Equality of [CONCEPTS].
  * term1 and term2 are the same blank node as described
    in 6.6 Blank Nodes of [CONCEPTS].

> 
> 
> [[I think that the 2nd example of RDFterm-equal on dateTime is not
> appropriate here because in this case, the = operator on dateTime applies
> and hence not RDFterm-equal()]]
> 

I defer to Eric on this point.

> 
> 12.1.5 Basic Graph Patterns
> 
> 
> A Basic Graph Pattern is a set of Triple Patterns.
> 
> [[SPARQL parsers are required to remove duplicate triples?]]

They are not required to remove duplicates.
1/ {a a} is the same set as {a}, just not canonical.
2/ Solving for a bag would yield the same answers.
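
Point 2 can be checked with a naive matcher. The Python below is only a
sketch: match_bgp is a hypothetical helper, terms are plain strings, and
variables are the strings that start with "?".

```python
def match_bgp(patterns, graph):
    """Return the solutions (dicts) for a list of triple patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for mu in solutions:
            for triple in graph:
                candidate = dict(mu)
                ok = True
                for term, value in zip(pattern, triple):
                    if term.startswith("?"):
                        # Bind the variable, or check an existing binding.
                        if candidate.setdefault(term, value) != value:
                            ok = False
                            break
                    elif term != value:
                        ok = False
                        break
                if ok:
                    next_solutions.append(candidate)
        solutions = next_solutions
    return solutions


graph = [(":a", ":p", ":b"), (":a", ":p", ":c")]
pattern = ("?s", ":p", "?o")
once = match_bgp([pattern], graph)
twice = match_bgp([pattern, pattern], graph)
```

Repeating the pattern adds a constraint that every existing solution
already satisfies in exactly one way, so the solutions and their
multiplicities are unchanged.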

> 
> 12.2.1 Converting Graph Patterns
> 
> 
> 
> If the element consists of multiple GroupGraphPatterns then connected with
> 'UNION' terminals, then replace with a sequence of nested union operators:
> 
> ->
> 
>  GroupGraphPatterns connected with 'UNION' terminals [[remove then]]

Done

> 
> 12.2.2 Examples of Mapped Graph Patterns
> 
> 
> Example: group consisting of a basic graph pattern, a filter and an
> optional graph pattern:
> 
> { ?s :p1 ?v1} FILTER (?v1 < 3 ) OPTIONAL {?s :p3 ?v3} }
> Filter( ?v1 < 3 ,
>   LeftJoin( BGP(?s :p1 ?v1), BGP(?s :p2 ?v2), true) ,
>   )
> 
> 
> should be [[replacing 2 by 3]]:
> 
> 
> { ?s :p1 ?v1} FILTER (?v1 < 3 ) OPTIONAL {?s :p3 ?v3} }
> Filter( ?v1 < 3 ,
>   LeftJoin( BGP(?s :p1 ?v1), BGP(?s :p3 ?v3), true) ,
>   )
> 

Fixed by replacing 3 by 2 in the input.

{ ?s :p1 ?v1 FILTER (?v1 < 3 ) OPTIONAL {?s :p2 ?v2} }

Filter( ?v1 < 3 ,
   LeftJoin( BGP(?s :p1 ?v1), BGP(?s :p2 ?v2), true)
   )

> 
> 
> 
> 12.2.3 Converting Solution Modifiers
> 
> 
> M := OrderBy(M, list of of order comparators)
> 
> ->
> 
> list of  order

Done

> 
> 12.3.2 Treatment of Blank Nodes
> 
> 
> Since SPARQL treats blank node IDs in the answer document as scoped to the
> document ...
> 
> [[There may be no answer document when the result is accessed through an
> API, so this reference to answer document looks strange.]]

We need a name for the whole answer, whether variable bindings or graph.
SPARQL defines the query language and the protocol, and the protocol returns
documents - APIs are not directly considered in the spec.  The API is
accessing the (probably notional) answer document.  APIs are not required to
do anything in particular, but it would seem natural if they behaved the same
whether accessing a local query implementation or deserialized results from a
remote query service.

> 
> 12.4 SPARQL Algebra
> 
> 
> Definition: Filter
> 
> has a effective boolean -> has an effective boolean
> 
> Definition: Diff
> 
> { μ | μ in Ω1 such that for all μ′ in Ω2,
> μ and μ′ are not compatible }
> set-union
> { μ | μ in Ω1 such that for all μ′ in Ω2,
> μ and μ′ are compatible and expr(merge(μ, μ′)) is
> false }
> 
> 
> [[
> I think the second member of union should be :
> 
> { μ | μ in Ω1 such that for all μ′ in Ω2,
> μ and μ′ are compatible => expr(merge(μ, μ′)) is
> false}

You suggest replacing "and" by "=>".

It should be "and" - both parts must be true.  It's not a logical implication
inside the condition.

> 
> in addition it could be said that: expr(..) is false or returns an error

Changed to "expr(...) has an effective boolean value of false", which covers
the error case.

> ]]
> 
> Definition: LeftJoin
> 
> 
>     { merge(μ1, μ2) | μ1 in Ω1 and μ2 in Ω2,
> and μ1 and μ2 are compatible and expr(merge(μ1,
> μ2)) is true }
> set-union
>     { μ1 | μ1 in Ω1 and μ2 in Ω2, and μ1 and
> μ2 are not compatible }
> set-union
>     { μ1 | μ1 in Ω1 and μ2 in Ω2, and μ1 and
> μ2 are compatible and expr(merge(μ1, μ2)) is false }
> 
> 
> 
> [[
> I think member 2 and 3 of union should be :
> 
>     { ... }
> set-union
>     { μ1 | μ1 in Ω1 such that for all μ2 in Ω2,
> μ1 and μ2 are not compatible }
> set-union
>     { μ1 | μ1 in Ω1 such that for all μ2 in Ω2,
> μ1 and μ2 are compatible => expr(merge(μ1, μ2)) is
> false or returns an error}
> 
> ]]
> 

As above.
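
For readers following the definitions, the two notions they all rest on,
compatibility and merge, can be sketched in Python as below. Solution
mappings are assumed to be dicts, the function names are hypothetical, and
only the first member of the LeftJoin union is shown; the remaining members
are deliberately left out of this sketch.

```python
def compatible(mu1, mu2):
    # Mappings are compatible when they agree on every shared variable.
    return all(mu2[v] == val for v, val in mu1.items() if v in mu2)


def merge(mu1, mu2):
    # Union of two compatible mappings.
    return {**mu1, **mu2}


def joined_part(omega1, omega2, expr):
    # First member of the LeftJoin union: merged pairs for which expr holds.
    return [merge(m1, m2) for m1 in omega1 for m2 in omega2
            if compatible(m1, m2) and expr(merge(m1, m2))]


omega1 = [{"?s": ":a", "?v1": 1}, {"?s": ":b", "?v1": 2}]
omega2 = [{"?s": ":a", "?v2": 10}]
joined = joined_part(omega1, omega2, lambda mu: True)
```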


> 12.5 Evaluation Semantics
> 
> 
> 
> Definition: Evaluation of Join(P1, P2, F)
> 
> eval(D(G), Join(P1, P2)) = Join(eval(D(G), P1), eval(D(G), P2))
> eval(D(G), Join(P1, P2), F) = Filter(F, Join( eval(D(G), P1), eval(D(G),
> P2) ) )
> 
> 
> [[
> I think there should be one rule for Join(P1, P2) without F, and one rule
> for Filter(F, P)
> ]]

Corrected as suggested. The three-argument form was left over from an earlier
style in which joins were written like LeftJoin.


> 
> 
> 
> 
> 
> Definition: Evaluation of LeftJoin(P1, P2)
> 
> should be:
> 
> Definition: Evaluation of LeftJoin(P1, P2, F)

Corrected

> 
> Definition: Evaluation of a Graph Patten
> 
> 
> eval(D(G), Graph(IRI,P)) = eval(D(D[i]), P)
> 
> should be [[replacing i by IRI]]:
> 
> eval(D(G), Graph(IRI,P)) = eval(D(D[IRI]), P)
> 

Corrected

> 
> 
> 
> 12.6 Extending SPARQL Basic Graph Matching
> 
> 
> mappijng  ->  mapping

Corrected

> 
> 

	Thank you once again
	Andy

-- 
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Saturday, 7 April 2007 17:36:28 UTC