Re: SPARQL: Editorial comments on Last Call WD [OK?] from Seaborne, Andy on 2005-11-07 (public-rdf-dawg-comments@w3.org from November 2005)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 07 Nov 2005 15:29:48 +0000
To: Ivan Herman <ivan@w3.org>
CC: public-rdf-dawg-comments@w3.org
Message-ID: <436F72EC.4050503@hp.com>
Ivan Herman wrote:
> Dear all,
> 
> 
> all these comments (maybe with the exception of the last one) are small scale
> editorial, a.k.a. hair-splitting :-), and are more for readability.

Ivan,

Thank you for the comments.

> 
> Ivan
> 
> --------------------------------------------------------------------
> 
> IRI misuse issue (Section 2.1)
> 
> I wonder whether the text on IRI misuse (Section 2.1, Query Term Syntax) should
> really be part of a normative text. Some other W3C recommendations separate in
> the text the 'normative' and 'informative' parts; if this was done in SPARQL,
> this would certainly be an 'informative' paragraph... However, SPARQL does not
> do that separation, ie, everything is normative.

The text in this section has been reworked as part of various comments.

The text on multiple IRIs that may have the same appearance (I take it this is
what you mean by "misuse") has been placed in a separate Security
Considerations section.

> 
> --------------------------------------------------------------------
> 
> 
> Wrong link text (Section 2.2)
> 
> Section 2.2 second paragraph after the first definition says: [3987, sec. 3.1].
> I presume '3987' should refer to RFC 3987 or, more appropriately, to reference
> no. 19

Fixed

> 
> --------------------------------------------------------------------
> 
> "Matching dataset DS" (Section 2.4)
> 
> The formal definition in 2.4 says: "S is a pattern solution for GP matching
> dataset DS". There is no formal definition of what "matching" means at that
> point (it is defined in 2.5 later, ie, a 'post-definition'...). I think moving
> the sentence defining matching from 2.5 to here is better and clearer.

"Matching" is used in all the different graph pattern types, so I have
noted that it is covered in the sections that follow this one.  As
solutions are used in matching, there is a circularity that is hard to avoid
while still indicating what solutions are used for.

> 
> --------------------------------------------------------------------
> 
> Editorial issue on Group Graph Pattern (Section 4.1)
> 
> Reading the spec from start to end, and getting to 4.1 gives a strange impression
> on groups. The whole section seems to argue that, in fact, groups are just
> syntactic sugar because they can be flattened, and that is all the section says.
> If my understanding is correct, groups become important when combined with, say,
> Optionals within the groups, but this comes much later in the document. For the
> sake of readability some sort of an explanation would be welcome here.

I have removed the use of "syntactically" which was confusing and moved that
sentence so it is not first.

> 
> --------------------------------------------------------------------
> 
> Unbound variables in a solution
> 
> Section 4.2 says: "Solutions to graph patterns do not necessarily have to have
> every variable bound in every solution that causes a graph pattern to be
> matched. In particular, the OPTIONAL and UNION graph patterns can lead to query
> results where a variable may be bound in some solutions, but not in others."
> 
> With my English I could interpret this sentence as follows: I could have a graph
> pattern match *without* OPTIONAL or UNION where not all variables are bound.
> That is probably not the intention, and would be in contradiction with the
> remark at the very end of 2.6 which says "This is a simple, conjunctive graph
> pattern match, and all the variables used in the query pattern must be bound in
> every solution."
> 
> I think it should be clearly stated somewhere in the document that *unless*
> OPTIONAL or UNION is used, all variables MUST have a binding in a solution.

I agree it's not very clear.  A variable not mentioned in a basic pattern is
left untouched (bound or unbound) by that pattern, and this is significant
because OPTIONAL and UNION combine patterns which are themselves basic patterns.

Changed to:

[[
Solutions to graph patterns do not necessarily have to have every variable
bound in every solution. SPARQL query patterns are built up from basic
patterns which do associate RDF terms with each variable mentioned in the
pattern; OPTIONAL and UNION graph patterns can lead to query results where a
variable may be bound in some solutions, but not in others.
]]
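
As a rough illustration of the revised wording (not spec text, and not how
any particular engine implements it), here is a minimal Python sketch.  The
toy triples and the helper names match_basic, optional and run_example are
all made up for this example.  A basic pattern binds every variable it
mentions; an OPTIONAL part that fails to match leaves the solution as it was.

```python
# Hypothetical toy data: (subject, predicate, object) triples as plain strings.
TRIPLES = {
    ("alice", "name", "Alice"),
    ("alice", "mbox", "mailto:alice@example.org"),
    ("bob", "name", "Bob"),
}


def match_basic(pattern, triples, solution=None):
    """Yield solutions (dicts) extending `solution` so one triple pattern matches.

    Terms starting with "?" are variables; every variable mentioned in the
    pattern ends up bound in each solution that is yielded.
    """
    base = dict(solution or {})
    for triple in triples:
        candidate = dict(base)
        ok = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an existing binding.
                if candidate.setdefault(term, value) != value:
                    ok = False
                    break
            elif term != value:
                ok = False
                break
        if ok:
            yield candidate


def optional(solutions, pattern, triples):
    """Extend each solution with `pattern` when possible, else keep it unchanged."""
    for s in solutions:
        extended = list(match_basic(pattern, triples, s))
        if extended:
            yield from extended
        else:
            yield s  # variables used only in the optional part stay unbound


def run_example():
    names = match_basic(("?x", "name", "?name"), TRIPLES)
    with_mbox = optional(names, ("?x", "mbox", "?mbox"), TRIPLES)
    return sorted(with_mbox, key=lambda s: s["?x"])
```

In this sketch run_example() gives two solutions: both bind ?x and ?name,
but only the one for "alice" binds ?mbox.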

> 
> --------------------------------------------------------------------
> 
> Matching Literals in lexical or value space (Section 3.1)
> 
> Section 3.1 does not say whether matching datatype literals is done in lexical
> or in value space. Ie, if the data is:
> 
> :a :b "10.00000000"^^xsd:double
> 
> and the query is
> 
> WHERE { ?x :b "10.0"^^xsd:double }
> 
> Do I get ":a" as a solution or not?

One of the issues the working group has had to deal with is that both cases of
matching, with and without RDF D-entailment, are reasonable.  There is
no requirement that RDF datatype entailment is supported, nor is it
prohibited, and the spec is intentionally not defining it one way or the
other.

Another example is

"10"^^xsd:integer and "X"^^roman:Numeral.  Same value - matched if the
processor does D-entailment for roman:Numerals.

RDF semantics says:

[[
These rules may not provide a complete set of inference principles for
D-entailment, since there may be valid D-entailments for particular datatypes
which depend on idiosyncratic properties of the particular datatypes ...
]]

> 
> SPARQL may be oblivious to this and it may depend on the RDF data store. If that
> data store does RDFS-D entailment, then the existence of the
> 
> :a :b "10.0"^^xsd:double
> 
> is inferred and the query will return ":a". If not, there is no match. However,
> something should be said in the SPARQL document (note that section 11, Testing
> Values, does not answer this because it refers to FILTERS-s only.)

It does apply in the case of RDF term equality; we have a test case for this:

http://www.w3.org/2001/sw/DataAccess/tests/data/ValueTesting/roman.rq

(ARQ now can run with or without Roman Numeral support in FILTERs - by default
it's turned off :-)

> 
> [The float case is relatively simple, but more complex issues arise, for
> example, with XML Literals]
> 

You're right, it could be clearer and it is worth picking out explicitly.  I've
added at the end of 3.1:

[[
Matching with RDF D-Entailment

RDF defines D-Entailment. When matching RDF literals in graph patterns, the
datatype lexical-to-value mapping may be reflected into the underlying RDF
graph, leading to additional matches where it is known that two literals are
the same value.
]]
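
To make the two behaviours concrete, here is a small Python sketch of the
xsd:double example from the comment.  It is only an illustration under some
assumptions: the function literals_match and the LEXICAL_TO_VALUE table are
invented for this message, Python's float() is used as a rough stand-in for
the xsd:double lexical-to-value mapping (it is not an exact match for the
XML Schema lexical space), and only literals of the same datatype are
compared.

```python
XSD_DOUBLE = "http://www.w3.org/2001/XMLSchema#double"

# Hypothetical table of lexical-to-value mappings for the datatypes this toy
# processor knows about.  float() only approximates xsd:double.
LEXICAL_TO_VALUE = {XSD_DOUBLE: float}


def literals_match(a, b, d_entailment=False):
    """Compare two typed literals given as (lexical form, datatype IRI) pairs.

    Without D-entailment only identical RDF terms match.  With it, two
    literals of the same known datatype also match when their lexical forms
    map to the same value.
    """
    if a == b:
        return True  # same RDF term: matches in either mode
    if not d_entailment:
        return False
    (lex_a, dt_a), (lex_b, dt_b) = a, b
    if dt_a != dt_b or dt_a not in LEXICAL_TO_VALUE:
        return False  # this sketch does not compare across datatypes
    to_value = LEXICAL_TO_VALUE[dt_a]
    try:
        return to_value(lex_a) == to_value(lex_b)
    except ValueError:
        return False  # ill-formed lexical form: treat as no match


data_literal = ("10.00000000", XSD_DOUBLE)   # from the data
query_literal = ("10.0", XSD_DOUBLE)         # from the query pattern
```

With these definitions, literals_match(data_literal, query_literal) is False,
so ":a" is not returned, while passing d_entailment=True makes the two
literals match and ":a" is a solution.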

> 
> --
> 
> Ivan Herman
> W3C Communications Team, Head of Offices
> C/o W3C Benelux Office at CWI, Kruislaan 413
> 1098SJ Amsterdam, The Netherlands
> tel: +31-20-5924163; mobile: +31-641044153;
> URL: http://www.w3.org/People/Ivan/


If this addresses the comments raised, please respond with [CLOSED] in
the subject to allow the issue tracking scripts to close this issue.

 Andy
Received on Monday, 7 November 2005 15:30:33 UTC