protocol 1.1 review (to 2.3) from Andy Seaborne on 2011-08-05 (public-rdf-dawg@w3.org from July to September 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Fri, 05 Aug 2011 08:55:01 +0100
To: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4E3BA1D5.4020402@epimorphics.com>
This review only covers up to the end of section 2.3.

Overall comments:

1/ Structure
I suggest having a formal definition section and an examples section.
Promote 2.4 (examples) to be new-3.

This leaves 2 as the formal definition section, rather than 90% of the 
document.

2/ The use of a dataset description for update seems strange.
I've split discussion into a separate message.

3/ HTTP requests controlled by content type MUST set the content type; 
not SHOULD.  This occurs in all POST places.

== Introduction
Need to mention the different SPARQL 1.1 Graph Store HTTP Protocol.

Add para:
"""
A separate document describes the SPARQL 1.1 Graph Store HTTP Protocol 
[links] for accessing and managing a collection of graphs in the REST 
architectural style.
"""

== section 1.1:
Delete prefixes table.

== section 1.2:
"service" is the abstraction; "endpoint" is the concrete place to talk 
to a service.  A service may have several endpoints.

Not sure we need to define "SPARQL Protocol service" but if we do:

"""
SPARQL Service
A service that offers this protocol.
"""

"protocol" does not add anything (are there protocol-less services?)

The only real use I found was in 2.1.6 etc to talk about error codes. 
The plain word "service" or "SPARQL service" worked for me there.

"""
SPARQL endpoint
The URI at which a SPARQL service listens for requests
from SPARQL Protocol clients.
"""

"The" -> "A"

== Section 2:
General:
There is use of "MUST" and "encode" that are talking about HTTP.
This doc is not defining that requirement, it comes from HTTP.
Many people will be using HTTP through a library that may well handle 
many of the issues.

Suggest only using MUST for things SPARQL defines.


"defined ... conformant"

If conformant refers to this document, it seems superfluous as "define" 
does not define non-conformant.
If conformant refers to HTTP, I'm not sure what the point being made is.

s/conformant//

== 2.1

Add CSV-TSV to list of result types.
"SPARQL 1.1 Query Results JSON Format"
"SPARQL 1.1 Query Results CSV and TSV Formats"

== 2.1.1

"client MUST URL encode"
-->
Not MUST as it's an HTTP requirement.
"The HTTP operation will need to be encoded according to the rules of 
RFC 2616."
The examples will show this.

"Query string parameters MUST be"
A library may well be sorting this out.
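
To illustrate the point that a library usually handles the encoding, here 
is a minimal sketch using Python's standard library. The endpoint URL, 
the query text and the helper name build_get_url are made up for 
illustration; only the parameter names "query" and "default-graph-uri" 
come from the protocol document.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical endpoint and query, for illustration only.
ENDPOINT = "http://example.org/sparql"
QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 10"


def build_get_url(endpoint, query, default_graphs=()):
    """Return a GET URL carrying the query in the 'query' parameter.

    urlencode() does the percent-encoding, so the caller never
    encodes anything by hand.
    """
    params = [("query", query)]
    params += [("default-graph-uri", g) for g in default_graphs]
    return endpoint + "?" + urlencode(params)


url = build_get_url(ENDPOINT, QUERY)
print(url)
```

The client code never mentions encoding rules, which is why a MUST on the 
client in this document adds little.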

== 2.1.2

There are two sub-cases: HTML form encoding and POST of a query string. 
  I think you can mix as well, although it's rare.

The section is actually about HTML form encoding, but says "URL-encoding".

No need for a mini-tutorial on how to build an HTML form - reference 
HTML4 (http://www.w3.org/TR/html4/interact/forms.html) for the normative 
text, show in examples.
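
The kind of example meant here could be as small as the following sketch 
of a form-encoded POST, built but not sent. The endpoint URL and the 
helper name build_form_post are assumptions for illustration.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request


def build_form_post(endpoint, query):
    """Build (without sending) a POST that carries the query as an HTML form field."""
    # Form encoding yields plain ASCII, so the body can be encoded as such.
    body = urlencode({"query": query}).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(endpoint, data=body, headers=headers, method="POST")


# Hypothetical endpoint, for illustration only.
form_req = build_form_post("http://example.org/sparql", "ASK { ?s ?p ?o }")
print(form_req.data)
```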

== 2.1.3

Client MUST include content-type "application/sparql-query" (not SHOULD)

Ought to note that the default (and only) valid charset encoding is UTF-8.
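
A sketch of what that looks like from the client side, again with a 
made-up endpoint and helper name (build_direct_post): the query goes in 
the body unencoded apart from UTF-8, and the content type is always set.

```python
from urllib.request import Request


def build_direct_post(endpoint, query):
    """Build (without sending) a POST whose body is the query string itself."""
    body = query.encode("utf-8")  # UTF-8 is the only valid charset here
    headers = {"Content-Type": "application/sparql-query"}
    return Request(endpoint, data=body, headers=headers, method="POST")


# Hypothetical endpoint and query, for illustration only.
direct_query = 'SELECT ?s WHERE { ?s ?p "café" }'
direct_req = build_direct_post("http://example.org/sparql", direct_query)
print(direct_req.get_header("Content-type"))
```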

== 2.1.4

s/theFROM/the FROM/

Ideally, the fact that the dataset can be determined by endpoint would be 
first.  It's the common case.

Might be good to stress that a processor is not required to support 
dataset descriptions:

"""A service MAY reject a query, using response code 400, if the service 
only accepts queries without a dataset description in the protocol or 
query itself
"""

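A rough sketch of the service-side check this implies. The function name 
and its arguments are hypothetical; in particular it assumes the query 
has already been parsed and its FROM / FROM NAMED clauses are available 
as a list.

```python
PROTOCOL_DATASET_PARAMS = ("default-graph-uri", "named-graph-uri")


def status_for_query(protocol_params, query_dataset_clauses, accepts_dataset_description):
    """Return the HTTP status a service might choose for a query request.

    protocol_params: dict of protocol parameters received with the request.
    query_dataset_clauses: FROM / FROM NAMED clauses found by the parser
        (assumed to be supplied by the caller).
    accepts_dataset_description: False for a service that only queries
        its own implicit dataset.
    """
    in_protocol = any(name in protocol_params for name in PROTOCOL_DATASET_PARAMS)
    in_query = bool(query_dataset_clauses)
    if (in_protocol or in_query) and not accepts_dataset_description:
        return 400  # client error: dataset description not supported here
    return 200


print(status_for_query({"query": "..."}, ["FROM <http://example.org/g>"], False))
```
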
== 2.1.5, 2.1.6

Some general statement to the effect of "do the right thing by HTTP"

e.g. somewhere

"""
The SPARQL 1.1 Protocol is built on top of HTTP.  All HTTP requirements 
for requests and responses MUST be followed.
"""

Add CSV TSV

Add Turtle (link to submission :-() and N-Triples.

== 2.1.7

"should be" unemphasised.

This was tricky last time.  403 is a reasonable response as well.

Maybe

"The HTTP response code for an unsuccessful query operation should be:"
==>
"The HTTP response codes applicable to an unsuccessful query operation 
include:"

400 is also the right code for e.g. a query with FROM when the processor 
only accepts queries against the implicit dataset.  It's a client error. 
  It's not just bad syntax.

== 2.2 Update

"""
The update operation is used to send a SPARQL update
request to a service and receive the results of the request.
"""
There are no results. Delete from "and".

Why do we have default-graph-uri and named-graph-uri?
What do they do?

** ==> USING
This is new.

== 2.2.1
See 2.1.2
Content header MUST be ....

== 2.2.2
MUST set the content type.

"as query string parameters"
This is update, not query.


== 2.2.3
Update isn't like query because an update is multiple operations and 
also updates are about state change.

See overall comment.

== 2.2.4, 2.2.5
See 2.1.6, 2.1.7

Stronger text about the response to a successful update operation being 
(by spec) empty.  There is no response to an update operation defined in 
the update language.  I'm worried that leaving it implementation defined 
might lead to expectations of query responses from an update.

This is a spec - just don't say anything about implementation defined 
features.

== 2.3

This section does not seem to come to a conclusion.  Could it have a 
list of where the base URI comes from in SPARQL terms?

So we have

1/ current BASE in query or update (5.1.1, 5.1.2 depending on how you 
look at it)
2/ Service endpoint (5.1.3)

---------

I think 5.1.3 does apply - it's the request URI.  "retrieve" is used 
because the RFC text is slanted towards GET and also POST into 
containers.  Then 5.1.4 never applies - it covers non-protocol cases.

(The request URI is often unhelpful because it includes the query string 
but that is what the std says unfortunately.)
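
The two-step order above can be sketched as follows. This is only an 
illustration under my reading of RFC 3986 section 5.1; the function names 
are made up, and urljoin is used as a stand-in for proper IRI resolution.

```python
from urllib.parse import urljoin


def effective_base(request_uri, base_in_request=None):
    """Pick the base URI: BASE from the query or update if present, else the request URI."""
    if base_in_request is not None:
        # A relative BASE is itself resolved against the request URI.
        return urljoin(request_uri, base_in_request)
    return request_uri


def resolve(relative_iri, request_uri, base_in_request=None):
    """Resolve a relative IRI appearing in a query or update."""
    return urljoin(effective_base(request_uri, base_in_request), relative_iri)


# Hypothetical request URI, for illustration only.
REQUEST_URI = "http://example.org/sparql?query=SELECT"
print(resolve("data/a", REQUEST_URI))
print(resolve("a", REQUEST_URI, "http://example.com/base/"))
```

In the first call the query string on the request URI contributes nothing 
to the resolved IRI, which is the unhelpful part.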


 Andy
Received on Friday, 5 August 2011 07:55:39 UTC