federated query review, reprised

I've revisited my review of the federated query document based on 
Carlos's changes documented on 
http://www.w3.org/2009/sparql/wiki/To_Last_Call/Federated_Query_Review. 
This also discharges ACTION-433, as I took a look at the protocol 
service / endpoint terminology.

== Substantive Comments ==

These comments must be resolved before the document is ready for Last 
Call. Editorial comments follow; they can be addressed either before or 
after Last Call, though many of them would improve the readability of 
the document.

2.3 -- we can't link to a definition on a wiki. If we need that 
definition, it needs to be incorporated into this document. It also 
should be incorporated into the semantics section, rather than in the 
context of an example.

3.1

* The syntax transformation needs to handle "SILENT"

Definition: Evaluation of a Service Pattern

* Invocation(IRI, S, P, B) needs to be extended to take a fifth 
parameter: a boolean indicating whether the invocation is silent.

* Why are only variables in the intersection of vars and bound selected? 
This means that I can't do this:

PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?x {
   SERVICE <iri> { ?x a foaf:Person }
}

...which seems strange to me. Is this intentional?

* The second call to "Invocation" needs to include silentOp

* " return unbounded variables." needs to be more formal. The definition 
should say that it returns a solution sequence containing a single 
solution with no bindings.
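
To illustrate why the last point matters, here is a rough sketch. The 
endpoint IRI is made up, and I'm assuming it is unreachable when the 
query runs:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person ?name
WHERE {
  # Evaluated against the local dataset.
  ?person a foaf:Person .
  # Hypothetical endpoint, assumed to fail; SILENT suppresses the error.
  SERVICE SILENT <http://unreachable.example/sparql> {
    ?person foaf:name ?name
  }
}
```

If the failed SERVICE SILENT evaluates to a single solution with no 
bindings, the join keeps every local ?person solution and leaves ?name 
unbound. If it evaluated to an empty solution sequence instead, the join 
would be empty and the query would return no rows at all.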

4 SPARQL 1.1 Federated Query Grammar -- this document only defines 
SERVICE, not BINDINGS. This section should be changed to reflect that.

5 Conformance

I'm still not sure that this conformance section makes sense here. I 
need to think more about what would make sense. Perhaps Eric or Axel has 
a suggestion here.

Remove section 7 completely.

== Editorial Comments ==

1. Introduction

Suggest rewording as follows:

"""
The SERVICE extension allows one to direct a portion of a query to a 
particular SPARQL query service, similar a GRAPH graph pattern, which 
"directs" queries to particular named graphs in the (local) dataset . 
This specification defines the syntax and semantics of this extension.
"""
=>
"""
This specification defines the syntax and semantics of the SERVICE 
extension to the SPARQL 1.1 Query Language. This extension allows a 
query author to direct a portion of a query to a particular SPARQL 
endpoint to be executed against local graphs. Results are returned to 
the federated query processor and are joined with results from the rest 
of the query.
"""

(and add a link from "SPARQL endpoint" to 
http://www.w3.org/2009/sparql/docs/protocol-1.1/Overview2.xml#terminology)


1.1.2 Result Descriptions

Suggest adding this and removing the similar sentence from 1.1.3:

"""
This example illustrates a solution sequence for a query that projects 
three variables, ?x, ?y, and ?z. The solution sequence has a single 
solution mapping in which ?x is bound to the plain literal "Alice", ?y 
is bound to the IRI http://example/a, and ?z is not bound.
"""

1.1.3 Terminology

Remove " (corresponds to the Concepts and Abstract Syntax term "RDF URI 
reference")" after "Solution Mapping".

Remove "In this result set..."


2 SPARQL 1.1 Basic Federation Extension

Suggest renaming section to "SPARQL 1.1 Federated Query Extension"

Suggest rewording the first paragraph based on the new introduction 
text, as follows:

"""
Queries over distributed SPARQL endpoints often involves querying one 
source and using the acquired information to constrain queries of the 
next source. This section illiustrates how this can be achieved using 
SPAQL1.1's SERVICE Graph patterns by examples.
"""

=>

"""
The SERVICE keyword instructs a federated query processor to invoke a 
portion of a SPARQL query against a remote SPARQL protocol service. This 
section presents examples of how to use the SERVICE keyword. The 
following sections define the syntax and semantics of this extension.
"""

2.1 Simple query to a remote SPARQL endpoint

Suggest rewording:

"""
This example shows how to query a remote SPARQL endpoint and join the 
returned data with the data at local RDF data store. Imagine we want to 
know who do I know. Data about people is in 
<http://people.example/sparql> endpoint:
"""
=>
"""
This example shows how to query a remote SPARQL endpoint and join the 
returned data with the data from the local RDF data store. Consider a 
query to find the names of the people I know. Data about the names of 
various people is available at the <http://people.example/sparql> SPARQL 
endpoint:
"""

2.1 and following -- the examples are much improved. There is still some 
inconsistency between the data in the remote endpoints and the URI of 
the endpoints. 2.1 uses "people.example" and 2.2 uses 
"people.example.org", for instance. It would be good to make these all 
match up to help the reader follow the examples without being confused.


2.2

I think this example needs a couple of changes:

* The URIs of the services in the description do not match the URIs in 
the query

* The query nests one SERVICE inside another. This requires the first 
service to be a federated query processor. This should be mentioned in 
the example, or the example changed so that the OPTIONAL is outside of 
the SERVICE clause (in which case the foaf:interest triples would need 
to be local).
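
For the second option, the restructured query might look roughly like 
the following. This is only a sketch: I'm assuming the general shape of 
the current example, and <http://people2.example.org/sparql> is a 
stand-in name for the second endpoint.

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person ?interest ?known
WHERE {
  SERVICE <http://people.example.org/sparql> {
    ?person foaf:name ?name .
  }
  # The OPTIONAL now sits outside the first SERVICE, so the
  # foaf:interest pattern is matched against local data.
  OPTIONAL {
    ?person foaf:interest ?interest .
    # Hypothetical second endpoint, no longer nested in another SERVICE.
    SERVICE <http://people2.example.org/sparql> {
      ?person foaf:knows ?known .
    }
  }
}
```

Written this way, only the local federated query processor has to 
understand SERVICE; neither remote endpoint is asked to federate.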


2.3 Variable Services

"in the default graph" => "in the local default graph"

"""
When having variables for specifying the address of a SPARQL endpoint in 
a SERVICE operation this variable must be bounded.
"""
=>
"""
A variable used with the SERVICE keyword must be bound.
"""

(See comment above.)
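
As a sketch of the intended usage (the ex: vocabulary and the shape of 
the local data are invented for illustration), the variable gets its 
bindings from a pattern matched locally before it is used with SERVICE:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX ex:   <http://example.org/ns#>

SELECT ?service ?person
WHERE {
  # Local data (hypothetical) lists the endpoints to query.
  ?project ex:endpoint ?service .
  # ?service is bound by the pattern above.
  SERVICE ?service {
    ?person a foaf:Person
  }
}
```

Exactly when a variable counts as "bound" here is what the definition 
mentioned in the substantive comment on 2.3 needs to pin down.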

2.4

"""
the query will stop returning the corresponding SPARQL error.
"""
=>
"""
the query will stop and return the error.
"""

2.4 - Can we add &nbsp; or something in the table cell in the results to 
make it render as a full-height empty cell to show that there is one row 
in which ?name is not bound?


2.5 As per 
http://lists.w3.org/Archives/Public/public-rdf-dawg/2011AprJun/0119.html, I'd 
remove this example completely and substitute the text in that email 
message. While SERVICE and BINDINGS _can_ be used in the same query, 
that is not the "natural" intent of things -- instead, the natural 
intent is that a federated query implementation would _generate_ queries 
with BINDINGS to help constrain the results from the remote endpoint. An 
example showing this would be helpful, but is probably too much before 
Last Call, so I'd remove this example entirely.
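
For the record, the kind of generated query I have in mind would look 
something like the sketch below. The IRIs are invented, and I'm assuming 
the trailing BINDINGS clause syntax from the current SPARQL 1.1 Query 
draft. A federated query processor that has already found two local 
bindings for ?person could send this to the remote endpoint instead of 
the bare pattern:

```sparql
PREFIX foaf: <http://xmlns.com/foaf/0.1/>

SELECT ?person ?name
WHERE {
  ?person foaf:name ?name
}
# Values the federated query processor already found locally
# (hypothetical IRIs).
BINDINGS ?person {
  (<http://example.org/people/alice>)
  (<http://example.org/people/bob>)
}
```

The remote endpoint then only needs to return names for those two 
people rather than for everyone it knows about.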


thanks,
Lee

Received on Thursday, 28 April 2011 03:10:06 UTC