Re: Comments on SPARQL draft (pt. 1) from Seaborne, Andy on 2005-03-21 (public-rdf-dawg-comments@w3.org from March 2005)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 21 Mar 2005 14:08:57 +0000
To: Danny Ayers <danny.ayers@gmail.com>
CC: public-rdf-dawg-comments@w3.org
Message-ID: <423ED579.5020003@hp.com>
Danny Ayers wrote:
> Looking through the 2005-03-17 editor's draft, the doc has *growed* so
> I've only done the query stuff here, I'll look at Result Forms etc
> later.
> 
> Overall the doc's really nice, it works both as a spec and a tutorial
> without flab. I haven't been following the list, so apologies where
> things have been explained/worked through in discussions I've missed.
> I've only a couple of coarse grained comments on the language itself,
> so I'll get them out of the way before editorial bits/minor language
> points.
> 
> *** Language features ***
> INSERT
> I'm sure this has been discussed at length, but I can't ignore it -
> where's the INSERT? Ok, I can see there may be the argument that it's
> not really a query, and there would be be implementation issues. But
> aside from the basic utility, there will be bound to be expectations
> of people coming to SPARQL from SQL, and the absence of something so
> fundamental is likely to attract criticism. Surely all that would be
> needed is INSERT {graph pattern} and it's done..? (I'd also expect
> this to be reflected in the Protocol, perhaps with PostGraph/PutGraph
> constructs).

There is obviously a linkage between INSERT and protocol.  While some form of 
INSERT is reasonably clear, DELETE is rather more tricky, given blank nodes and 
complex structures.

http://www.w3.org/2003/12/swa/dawg-charter#update

> 
> OPTIONAL
> The doc made me chuckle - soon after the material on OPTIONAL there's
> an editorial comment that there was an objection to UNION on the
> grounds that it would complicate implementation and discourage
> adoption. If UNION will discourage adoption then OPTIONAL will have
> people emigrating to Mars. I don't know whether it's just my addled
> neurones, or the way it's presented in the doc, or in the language
> design (I suspect the latter), but I found this construct incredibly
> hard to comprehend. In case I still haven't understood it properly, I
> won't go further than to ask - is this functionality particularly
> useful? If so, can't it be done in a simpler fashion?

OPTIONAL arises from the semi-structured nature of RDF and is one of the most 
common requests from users.  Consider writing a query that extracts name and 
nick FOAF information about a person where the mbox is known:

This fails to do the job:

SELECT ?name ?nick
{
   ?x foaf:mbox <mailto:...> ;
      foaf:name ?name ;
      foaf:nick ?nick .
}

because both ?name and ?nick would have to be present.

SELECT ?name ?nick
{
   ?x foaf:mbox <mailto:...> .
   OPTIONAL { ?x foaf:name ?name } .
   OPTIONAL { ?x foaf:nick ?nick } .
}

This query will work, giving one row per ?x found, with name and nick 
information if available.

SQL has outer joins for the same purpose.
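
A minimal sketch of that outer-join reading of OPTIONAL, in Python over a toy 
in-memory triple list. The data, the mailto addresses and the helper names 
(match, optional, query) are made up for illustration; this is not the draft's 
formal definition, only a way to see why one row per ?x comes back.

```python
# Toy data: (subject, predicate, object) tuples. All values are invented.
TRIPLES = [
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:nick", "Al"),
    ("_:b", "foaf:mbox", "mailto:bob@example.org"),
    ("_:b", "foaf:name", "Bob"),  # no foaf:nick for this person
]


def match(pattern, binding):
    """Yield every extension of `binding` under which `pattern` matches a triple."""
    for triple in TRIPLES:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Variable: bind it, or check it agrees with an earlier binding.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new


def optional(pattern, bindings):
    """Left join: extend a row where the pattern matches, otherwise keep it as is."""
    for b in bindings:
        extended = list(match(pattern, b))
        yield from (extended or [b])


def query(mbox):
    """Mirror of the OPTIONAL query: required mbox, optional name and nick."""
    rows = match(("?x", "foaf:mbox", mbox), {})
    rows = optional(("?x", "foaf:name", "?name"), rows)
    rows = optional(("?x", "foaf:nick", "?nick"), rows)
    return [(r.get("?name"), r.get("?nick")) for r in rows]


alice = query("mailto:alice@example.org")
bob = query("mailto:bob@example.org")
```

In this toy model the person without a nick still produces a row, with the 
nick left unbound (None here), which is what the all-required form of the 
query cannot do.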

Using UNION would give 4 rows, not one, when ?name and ?nick were both present. 
That makes it very hard for the application to take the results and process 
them back to what it wants.

> 
> *** Editorial/minor points ***

I've made the relevant corrections - a little discussion inline.

> There seems to be a little inconsistency in the description early on
> when in comes to URIs -
> [[[
> 2.1
> Query Term Syntax
> The terms delimited by "<>" are URI relative references
> ...
> The term "URI" in this document should be read as "absolute URI".]]]

Yes - <foo> is syntax and has to be resolved according to the usual rules in 
RFC 3986.  Queries themselves only have absolute URIs in them.

The phrase is "URI relative reference" as defined in RFC 3986.  I've put it in 
<em> to stress the connection.
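
As a rough illustration of resolving a relative reference such as <foo> 
against a base, here is a sketch using Python's standard urllib.parse.urljoin. 
The base URI is an invented example, and how a query processor actually picks 
its base is not shown here.

```python
from urllib.parse import urljoin

# Hypothetical base URI; in practice this comes from the query's environment.
BASE = "http://example.org/data/base.rdf"

# A relative reference replaces the last path segment of the base.
relative = urljoin(BASE, "foo")

# A fragment-only reference keeps the whole base and adds the fragment.
fragment = urljoin(BASE, "#me")

# An absolute URI is returned unchanged, whatever the base is.
absolute = urljoin(BASE, "http://xmlns.com/foaf/0.1/name")

print(relative)
print(fragment)
print(absolute)
```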

> 
> also in the same section:
> [[[Single quotes ('')  are also allowed. ]]]

as string delimiters - made clearer.

> 
> I'm not certain, but (as a sub-delim) couldn't the single quote appear
> within a URI?
> 
> [[[
> Turtle allows URIs to be abbreviated with prefixes:
> ]]]
> 
> Moving on -
> [[
> 2.2 Graph Patterns
> Definition: RDF Term
> 
> An RDF Term is anything that can occur in the RDF data model.
> ]]
> "anything" seems vague
> 
> [[
> An RDF triple contains three components:
> 
>     * the subject, which is an RDF URI reference or a blank node
> ]] 
> I think this needs integrating better with the general description in
> relation to URIs and the comment re. literal subjects

It's text from RDF Concepts.

> 
> [[
> 2.3 Graph Pattern Matching
> ...
> Definition: Restriction
> ]]
> I found this particular definition confusing, especially as
> "Restriction" doesn't seem to be used in the section that follows.
> 
> [[
> 2.4 Examples of Graph Patterns
> ... tripe patterns ...
> ]]
> juvenile-amusing typo
> 
> [[
> 2.7 Other Syntactic Forms
> SPARQL uses a "Turtle-like" syntax for writing basic graph patterns. 
> ]]
> I think an explanation of how the syntax differs from Turtle is needed
> 
> [[
> Blank Nodes
> ]]
> The text layout with examples mid-sentence is a little confusing.
> 
> [[
> 3.1 Matching RDF Literals
> ...
> Matching Arbitrary Datatypes
> ...note that the query processor does not have to have any
> understanding of the values in the space of the datatype.
> ]]
> Huh?

RDF data can include datatypes for which there is no explicit support in the 
query processor, for example geo data.
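
One way to picture this, as a sketch only: a processor that knows nothing 
about a datatype can still match a literal by comparing its lexical form and 
datatype URI as opaque strings. The datatype URI, the ex: predicates and the 
data below are hypothetical, and the draft's actual matching rules may differ 
in detail.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    lexical: str                    # the characters as written in the data
    datatype: Optional[str] = None  # datatype URI, or None for a plain literal


# Hypothetical geo datatype that this toy processor has no code for.
GEO = "http://example.org/datatype#point"

DATA = [
    ("_:city", "ex:location", Literal("51.45,-2.58", GEO)),
    ("_:city", "ex:label", Literal("Bristol")),
]


def subjects_with(predicate, literal):
    """Match by term equality only; the value space is never interpreted."""
    return [s for s, p, o in DATA if p == predicate and o == literal]


# Same lexical form and datatype: matches without understanding the value.
found = subjects_with("ex:location", Literal("51.45,-2.58", GEO))

# Arguably the same point, but written differently: no match, because the
# processor does not know how to compare values of this datatype.
missed = subjects_with("ex:location", Literal("51.450,-2.580", GEO))
```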

> 
> [[
> 3.2 Constraining Values
> ...
> Note that a constraint can be considered to be a triple with a special
> predicate.
> ]]
> If it's worth noting, it's worth explaining.
> 
> [[
> 4 Combining Patterns
> ...
> A Basic Graph Patterns is, 
> ]]
> typo
> 
> [[
> 5.5
> 5.5 Nested Optional Blocks
> ...
> Query:
> 
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
> PREFIX vcard: <http://www.w3.org/2001/vcard-rdf/3.0#>
> SELECT ?foafName ?gname ?fname
> ]]
> Shouldn't the SELECT clause include ?mbox ?
> 
> (maybe not - I've not grokked OPTIONAL)
> 
> [[
> 6.1 Joining Patterns with UNION
> ]]
> The layout of the examples in this section,with multiple statements on
> one line is harder to read.
> 
> [[
> 7 RDF Dataset
> ]]
> This section would be really confusing to anyone that hadn't
> previously encountered named graphs.
> 
> [[
> ...
> A query processor is not required to support named graphs.
> ...
> 8 Querying the Dataset
> ]]
> It's not clear what the expectations are - maybe a separate section is
> needed to be explicit about what a query processor SHOULD support.
> 
> [[
> 8.1 Accessing Graph Labels
> ...
> It is not necessary to use the GRAPH clause to create the data graph
> for a collection of graphs. The query environment may provide the RDF
> dataset to be queried.
> ]]
> How?

Out of date text.  Removed.

> 
> [[
> 8.3 Restricting via Query Pattern
> ]]
> 3 typos in the first 2 paragraphs, 1 in the last - someone was in a hurry ;-)

"Editors' draft" as in live draft.  As in checking in stuff at the end of a day 
so the co-editor can edit.

> 
> Language point: In the same section, I found the syntax
> [[ GRAPH data:bobFoaf {... ]]
> confusing, it looks like that's a property. I don't know, maybe it
> would be better to insist that graph names be written in full..?
> 
> [[
> 8.4 GRAPH and a background graph
> ...has found read in...
> ...a aggregator...
> ]]
> typos, and the title doesn't seem right somehow
> 
> [[
> 8.5 Definition for GRAPH
> Definition: DatatSet Graph Pattern
> ...
> ]]
> could this be clearer?
> 
> [[
> 9 Query Execution and Ordering
> ...
> ]]
> As noted, this section does need filling out. But -
> 
> Language point: is there/should there be any way of explicitly
> overriding default execution order?

The default order is supposed to define the right answers.  How a processor 
achieves that is up to it - maybe it will choose to issue errors on queries that 
have a different order - maybe it will execute the query in a way that gets the 
right answer.  It could issue a warning and execute in query order.

SQL queries can be order dependent (where there are outer joins).  The right 
answer is defined to be the result of executing in the order the query is 
written.

Optimizing compilers do much the same.
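
To make the order dependence concrete, here is a small Python sketch of a toy 
left join in which two optional patterns share a variable. The data and helper 
names are invented, and this models only the naive "execute in query order" 
reading, not the draft's formal semantics.

```python
# Toy data: one person with both a name and a nick. All values are invented.
TRIPLES = [
    ("_:a", "foaf:mbox", "mailto:alice@example.org"),
    ("_:a", "foaf:name", "Alice"),
    ("_:a", "foaf:nick", "Al"),
]


def match(pattern, binding):
    """Yield every extension of `binding` under which `pattern` matches a triple."""
    for triple in TRIPLES:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new


def optional(pattern, bindings):
    """Left join: extend a row where the pattern matches, otherwise keep it as is."""
    for b in bindings:
        extended = list(match(pattern, b))
        yield from (extended or [b])


def run(optional_patterns):
    """Apply the optional patterns strictly in the order given."""
    rows = match(("?x", "foaf:mbox", "mailto:alice@example.org"), {})
    for pattern in optional_patterns:
        rows = optional(pattern, rows)
    return [r.get("?label") for r in rows]


# Both optional parts try to bind the same variable, ?label.
NAME = ("?x", "foaf:name", "?label")
NICK = ("?x", "foaf:nick", "?label")

name_first = run([NAME, NICK])  # ?label is taken by the name first
nick_first = run([NICK, NAME])  # ?label is taken by the nick first
```

Whichever optional part runs first claims ?label, so the two orders give 
different answers on the same data. A processor is free to reorder only if it 
still produces the answer that query order would give.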

	Andy
Received on Monday, 21 March 2005 14:21:39 UTC