Re: protocol 1.1 review (to 2.3) from Andy Seaborne on 2011-08-06 (public-rdf-dawg@w3.org from July to September 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Sat, 06 Aug 2011 23:23:12 +0100
To: Lee Feigenbaum <lee@thefigtrees.net>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4E3DBED0.4050707@epimorphics.com>
On 06/08/11 21:48, Lee Feigenbaum wrote:
> Thanks very very much, Andy. (The changes mentioned below should be
> reflected in CVS.)


>> == Introduction
>> Need to mention the different SPARQL 1.1 Graph Store HTTP Protocol.
>>
>> Add para:
>> """
>> A separate document describes the SPARQL 1.1 HTTP Graph Store Protocol.
>> [links] for accessing and managing a collection of graphs in the REST
>> architectural style.
>> """
>
> I've taken this for now but might delegate to the overview document
> eventually.

Minor: I still think it needs a sentence so that, when read independently,
it's clear this is not the graph protocol.


>> == Section 2:
>> General:
>> There is use of "MUST" and "encode" that are talking about HTTP.
>> This doc is not defining that requirement, it comes from HTTP.
>>
>> Many people will be using HTTP through a library that may well handle
>> many of the issues.
>>
>> Suggest only using MUST for things SPARQL defines.
>
> It's not clear to me what to reference to include these bits
> normatively. The must requirements also tend to go a bit beyond what
> HTTP might mandate, such as specifying the names of the parameters.

>> == 2.1
>>
>> Add CSV-TSV to list of result types.
>> "SPARQL 1.1 Query Results JSON Format"
>> "SPARQL 1.1 Query Results CSV and TSV Formats"
>
> I've done this, but not sure if it matters whether or not CSV/TSV ends
> up as rec track.

Good news: the CSV and TSV formats do not define new media types else 
there would be problems.  As such, I'd say the note is merely 
documenting some behavior and not defining.  (wriggle)

>> == 2.1.1
>>
>> "client MUST URL encode"
>> -->
>> Not MUST as it's an HTTP requirement.
>
> The MUST refers to the full rest of the sentence, not just this phrase.

Sure - but it could be read as SPARQL requiring an encoding - then HTTP is
going to encode it.  i.e. it's a SPARQL-MUST.

I think MUST etc should (:-) only be used when it's this spec defining 
the requirement.  The requirement here is "use HTTP correctly".

"and MUST include" makes it clear as to MUST scope.

Additional: 2.1.2
"When using this method"
==>
"When using this POST method"
2.1.3 ditto.

Is "this" the HTTP method or the subject of the section (SPARQL "method",
i.e. form of POST)?

>> "The HTTP operation will need to be encoded according to the rules of
>> RFC 2616."
>
> I'm still not sure if this is right. Maybe I just am doing bad searches,
> but I can't find talk in http://tools.ietf.org/html/rfc2616 of how to
> build a request URI query string from a set of key/value pairs.
>
>> The examples will show this.
>>
>> "Query string parameters MUST be"
>> A library may well be sorting this out.
>
> That's ok, but irrelevant to our spec, right?

As above - risk of meaning that SPARQL-encodes ... then HTTP is going to 
encode again.
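
To illustrate the double-encoding risk, here is a minimal Python sketch.
The endpoint URL and the query text are made up for the example, and only
the standard library is used. It builds a GET URL once from the raw query
and once from a query that was already percent-encoded by hand, then
decodes each the way a server would.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Hypothetical endpoint and query, used only for illustration.
ENDPOINT = "http://example.org/sparql"
QUERY = "SELECT * WHERE { ?s ?p ?o }"

# Intended reading: hand the raw query text to the HTTP library,
# which percent-encodes it exactly once.
once = ENDPOINT + "?" + urlencode({"query": QUERY})

# Misreading: "SPARQL-encode" by hand, then let the library encode again.
twice = ENDPOINT + "?" + urlencode({"query": quote(QUERY)})


def received_query(url):
    """Decode the 'query' parameter once, as a receiving server would."""
    return parse_qs(urlsplit(url).query)["query"][0]
```

With the first URL the server recovers the original query text; with the
second it recovers a string still full of %XX escapes, which is not a
SPARQL query.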

>> == 2.1.2
>>
>> There are two sub-cases: HTML form encoding and POST of a query string.
>> I think you can mix as well, although it's rare.
>>
>> The section is actually about HTML form encoding, but says
>> "URL-encoding".
>
> I don't understand what you mean here. Forms use URL encoding.

Text is:

"clients must URL encode all parameters and include them"

Read "and" as "do one thing, do next thing", i.e. in the sense of "and
then include ...", and it's double encoding.
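
A rough Python sketch of the two POST sub-cases, assuming the form case
uses application/x-www-form-urlencoded and the direct case uses the
application/sparql-query media type. The endpoint and the function names
are hypothetical. The requests are only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://example.org/sparql"  # hypothetical endpoint
QUERY = "ASK { ?s ?p ?o }"


def post_form(endpoint, query):
    """Sub-case 1: HTML form encoding; the query is one form parameter."""
    # The parameter is encoded exactly once, when the form body is built.
    body = urlencode({"query": query}).encode("ascii")
    return Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def post_direct(endpoint, query):
    """Sub-case 2: the request body is the query text itself, unencoded."""
    return Request(
        endpoint,
        data=query.encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/sparql-query"},
    )
```

In the form case the encoding is part of constructing the body, not a
separate step before "including" the parameters; in the direct case there
is no URL encoding at all.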

>> == 2.1.4
...
>> Ideally, the fact that the dataset can be determined by endpoint would be
>> first. It's the common case.
>
> I'm not sure which is the common case, but I don't really care about the
> order so I changed it around for you. :)

I wasn't clear - the common case is service provided dataset (maybe not 
in Anzo's case but overall more common as I've seen services).

I suggest moving the last para to the first.

"""
A SPARQL query is executed against an RDF Dataset. If an RDF dataset is 
not specified in either the protocol request or the SPARQL query string,
then implementations may execute the query against an 
implementation-defined default RDF dataset. (@@ref to SD?)

The RDF dataset for a query may be specified either via the
default-graph-uri and named-graph-uri parameters in the SPARQL Protocol 
or in the SPARQL query string using the FROM and FROM NAMED keywords.

If different RDF datasets are specified in both the protocol request and 
the SPARQL query string, then the SPARQL service must execute the query 
using the RDF dataset given in the protocol request. Note that a service 
may reject a query with response code 400 if the service does not allow 
protocol clients to specify the RDF dataset.
"""

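The precedence in the suggested text can be sketched in Python as follows.
This is only an illustration of the rules as worded above; the Dataset
class, the BadRequest exception, the function name and the
allow_protocol_dataset flag are all hypothetical and not taken from any
implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical holder for a dataset description."""
    default_graphs: tuple = ()
    named_graphs: tuple = ()


class BadRequest(Exception):
    """Hypothetical error that a service would map to HTTP 400."""
    status = 400


def choose_dataset(protocol: Optional[Dataset],
                   in_query: Optional[Dataset],
                   service_default: Dataset,
                   allow_protocol_dataset: bool = True) -> Dataset:
    """Pick the RDF dataset a query is executed against."""
    if protocol is not None:
        if not allow_protocol_dataset:
            # The service may reject protocol-level dataset descriptions.
            raise BadRequest("protocol dataset not accepted by this service")
        # default-graph-uri / named-graph-uri override FROM / FROM NAMED.
        return protocol
    if in_query is not None:
        # Only the query string (FROM / FROM NAMED) describes a dataset.
        return in_query
    # Nothing specified: implementation-defined default dataset.
    return service_default
```

The common case, where neither the request nor the query says anything, is
the last branch.
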

>> === 2.1.7

>> 400 is also the right code for e.g. a query with FROM when the processor
>> only accepts queries against the implicit dataset. It's a client error.
>> It's not just bad syntax.
>
> Yeah? Isn't 400 for "malformed" requests?

4xx is client error.

Supplying a dataset description to a service endpoint that does not support
a dataset description is a client error (this is SPARQL 1.0
protocol-ness) not a server error.

400 is more than just parse error. "malformed request" is the best we can do
for request mistakes by the client.

>> == 2.2.2
...
>> "as query string parameters"
>> This is update, not query.
>>
>
> Yes, but the part of the URI after the ? is still the "query string".
> It's confusing, but I'm pretty sure it's proper terminology? I've
> changed it to
>
> "as URI query parameters"

or "HTTP query string parameters".

Looking back, I see same in 2.1.2 and 2.1.3.

>> == 2.2.3
>> Update isn't like query because an update is multiple operations and
>> also updates are about state change.
>>
>> See overall comment.
>
> Separate email for this.
>
>> == 2.2.4, 2.2.5
>> See 2.1.6, 2.1.7
>>
>> Stronger text about the response to a successful update operation being
>> (by spec) empty. There is no response to an update operation defined in
>> the update language. I'm worried that leaving it implementation defined
>> might lead to expectations of query responses from an update.
>>
>> This is a spec - just don't say anything about implementation defined
>> features.
>
> The text that's there now is specifically in response to comments &
> discussion on the mailing list. As normative spec text, this text
> doesn't change anything, but it does set some expectations in a way that
> I think matches the group's intention, at least from the last discussion
> spawned from the -comments mailing list.

Wasn't that mainly about errors?  I was focusing on successful updates.

> If the group prefers that the spec says that response bodies SHOULD be
> empty, we can do that. I don't think it should say that response bodies
> MUST be empty, as that goes against what some implementations do and
> what multiple community comments have requested.

Fine - certainly can't have "MUST be empty".
I just want to slant it towards "no body"; a successful update request 
does not need a non-empty body.

"""
The response body of a successful update request is not defined in this 
specification.  An implementation may include content deemed useful, 
either to end users or to the invoking client application.
"""

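From the client side, that wording suggests something like the following
Python sketch: judge success by the status code alone and treat any body
as free text. The names here are hypothetical and the 2xx check is an
assumption about how a client would define "successful".

```python
from typing import NamedTuple


class UpdateOutcome(NamedTuple):
    ok: bool
    note: str  # free text for logs or end users; never parsed


def interpret_update_response(status: int, body: bytes = b"") -> UpdateOutcome:
    """Decide success from the status code; keep the body only as a note."""
    ok = 200 <= status < 300
    return UpdateOutcome(ok, body.decode("utf-8", errors="replace"))
```

An empty body and a body with implementation-specific content are handled
the same way, so the client never comes to expect query-like results from
an update.
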
 Andy
Received on Saturday, 6 August 2011 22:23:45 UTC