Re: protocol 1.1 review (to 2.3) from Lee Feigenbaum on 2011-08-06 (public-rdf-dawg@w3.org from July to September 2011)

From: Lee Feigenbaum <lee@thefigtrees.net>
Date: Sat, 06 Aug 2011 16:48:34 -0400
To: Andy Seaborne <andy.seaborne@epimorphics.com>
CC: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <4E3DA8A2.5010404@thefigtrees.net>
Thanks very very much, Andy. (The changes mentioned below should be 
reflected in CVS.)

On 8/5/2011 3:55 AM, Andy Seaborne wrote:
> This review only covers up to the end of section 2.3.
>
> Overall comments:
>
> 1/ Structure
> I suggest having a formal definition section and an examples section.
> Promote 2.4 (examples) to be new-3.
>
> This leaves 2 as the formal definition section, rather than 90% of the
> document.

I like this & will do it.

> 2/ The use of a dataset description for update seems strange.
> I've split discussion into a separate message.

I'll reply to that separately, thanks.

> 3/ HTTP requests controlled by content type MUST set the content type;
> not SHOULD. This occurs in all POST places.

Agreed & have changed.

> == Introduction
> Need to mention the different SPARQL 1.1 Graph Store HTTP Protocol.
>
> Add para:
> """
> A separate document describes the SPARQL 1.1 HTTP Graph Store Protocol.
> [links] for accessing and managing a collection of graphs in the REST
> architectural style.
> """

I've taken this for now but might delegate to the overview document 
eventually.

> == section 1.1:
> Delete prefixes table.

Done.

> == section 1.2:
> "service" is the abstraction; "endpoint" is the concrete place to talk
> to a service. A service may have several endpoints.

I agree with this.

> Not sure we need to define "SPARQL Protocol service" but if we do:
>
> """
> SPARQL Service
> A service that offers this protocol.
> """

I think the current definition is a bit wonky because it equates a 
service with an HTTP server, but I think the definition needs to remain. 
It's used throughout the documents (in addition to the part about error 
codes that you reference, it's used to talk about datasets, base IRIs, 
and in the examples). It's not always strictly called "SPARQL protocol 
service", but the term service is used to mean "the thing that executes 
the query that this protocol lets you communicate with."

Maybe your definition is good enough... need to think about it a bit more.

> "protocol" does not add anything (are there protocol-less services?)
>
> The only real use I found was in 2.1.6 etc to talk about error codes.
> The plain word "service" or "SPARQL service" worked for me there.
>
> """
> SPARQL endpoint
> The URI at which a SPARQL service listens for requests
> from SPARQL Protocol clients.
> """
>
> "The" -> "A"
>
> == Section 2:
> General:
> There is use of "MUST" and "encode" that are talking about HTTP.
> This doc is not defining that requirement, it comes from HTTP.
>
> Many people will be using HTTP through a library that may well handle
> many of the issues.
>
> Suggest only using MUST for things SPARQL defines.

It's not clear to me what to reference to include these bits 
normatively. The MUST requirements also tend to go a bit beyond what 
HTTP might mandate, such as specifying the names of the parameters.


>
> "defined ... conformant"
>
> If conformant refers to this document, it seems superfluous as "define"
> does not define non-conformant.
> If conformant refers to HTTP, I'm not sure what the point being made is.
>
> s/conformant//

Agreed & done.

> == 2.1
>
> Add CSV-TSV to list of result types.
> "SPARQL 1.1 Query Results JSON Format"
> "SPARQL 1.1 Query Results CSV and TSV Formats"

I've done this, but not sure if it matters whether or not CSV/TSV ends 
up as rec track.

> == 2.1.1
>
> "client MUST URL encode"
> -->
> Not MUST as it's an HTTP requirement.

The MUST refers to the full rest of the sentence, not just this phrase.

> "The HTTP operation will need to be encoded according to the rules of
> RFC 2616."

I'm still not sure if this is right. Maybe I just am doing bad searches, 
but I can't find talk in http://tools.ietf.org/html/rfc2616 of how to 
build a request URI query string from a set of key/value pairs.

> The examples will show this.
>
> "Query string parameters MUST be"
> A library may well be sorting this out.

That's ok, but irrelevant to our spec, right?
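
To make the 2.1.1 discussion concrete, here is a minimal Python sketch of building a query-via-GET request URI from key/value pairs. The endpoint and graph IRIs are hypothetical, and the parameter names (`query`, `default-graph-uri`, `named-graph-uri`) are assumed to be the ones the protocol document specifies. It also illustrates the point about libraries: the percent-encoding is done by `urlencode`, not by hand.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_get_uri(endpoint, query, default_graphs=(), named_graphs=()):
    """Return a request URI with the query and dataset as URL-encoded parameters."""
    # A list of pairs (rather than a dict) lets a parameter name repeat.
    params = [("query", query)]
    params += [("default-graph-uri", g) for g in default_graphs]
    params += [("named-graph-uri", g) for g in named_graphs]
    return endpoint + "?" + urlencode(params)

# Hypothetical endpoint and graph IRI, for illustration only.
uri = build_get_uri(
    "http://example.org/sparql",
    "SELECT * WHERE { ?s ?p ?o }",
    default_graphs=["http://example.org/g1"],
)

# Decoding the query string recovers the original query text.
decoded = parse_qs(urlsplit(uri).query)
```
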

> == 2.1.2
>
> There are two sub-cases: HTML form encoding and POST of a query string.
> I think you can mix as well, although it's rare.
>
> The section is actually about HTML form encoding, but says "URL-encoding".

I don't understand what you mean here. Forms use URL encoding.

> No need for a mini-tutorial on how to build an HTML form - reference
> HTML4 (http://www.w3.org/TR/html4/interact/forms.html) for the normative
> text, show in examples.

Again, I'm not sure this works. 
http://www.w3.org/TR/html4/interact/forms.html#h-17.13.4.1 appears to be 
the closest we get, but it talks about controls and documents, which do 
not apply in the case of the SPARQL 1.1 Protocol.

My feeling is that this bit is short and simple enough to be better off 
as self-contained, especially in the absence of anything clearly better.
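
As a rough sketch of what 2.1.2 describes, the POST body uses the same URL encoding as a GET query string; only the placement and the content type differ. The endpoint below is hypothetical, the `query` parameter name is assumed from the protocol document, and the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

FORM_TYPE = "application/x-www-form-urlencoded"

def build_form_post(endpoint, query):
    """Build (but do not send) a POST carrying the query as a URL-encoded form body."""
    # urlencode output is pure ASCII, so encoding to bytes cannot fail.
    body = urlencode({"query": query}).encode("ascii")
    return Request(endpoint, data=body,
                   headers={"Content-Type": FORM_TYPE}, method="POST")

# Hypothetical endpoint, for illustration only.
request = build_form_post("http://example.org/sparql", "ASK { ?s ?p ?o }")
```
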

> == 2.1.3
>
> Client MUST include content-type "application/sparql-query" (not SHOULD)

Got this everywhere.

> Ought to note that the default (and only) valid charset encoding is UTF-8.

OK. This is a TODO for me. @@
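
For comparison, a minimal sketch of the 2.1.3 case: the query is sent unencoded as the request body, with the content type set explicitly and the text encoded as UTF-8. The endpoint is hypothetical and the request is only constructed, not sent.

```python
from urllib.request import Request

def build_direct_post(endpoint, query):
    """Build (but do not send) a POST whose body is the raw query text in UTF-8."""
    return Request(endpoint, data=query.encode("utf-8"),
                   headers={"Content-Type": "application/sparql-query"},
                   method="POST")

# A non-ASCII literal shows why the charset matters.
request = build_direct_post(
    "http://example.org/sparql",
    'SELECT ?s WHERE { ?s ?p "caf\u00e9" }',
)
```
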

> == 2.1.4
>
> s/theFROM/the FROM/

Got it.

> Ideally, the fact that the dataset can be determined by endpoint would be
> first. It's the common case.

I'm not sure which is the common case, but I don't really care about the 
order so I changed it around for you. :)

> Might be good to stress that a processor is not required to support
> dataset descriptions:
>
> """A service MAY reject a query, using response code 400, if the service
> only accepts queries without a dataset description in the protocol or
> query itself
> """

Got it.
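
One way a service might apply the suggested MAY text, sketched under assumptions: the function name and the `params` mapping are hypothetical, and only a dataset description given through protocol parameters is checked here. Detecting FROM or FROM NAMED inside the query text would need a SPARQL parser and is left out.

```python
DATASET_PARAMS = ("default-graph-uri", "named-graph-uri")

def status_for_dataset_request(params, accepts_dataset_description):
    """Return 400 if the request carries a dataset description the service rejects.

    `params` maps parameter names to lists of values, as a form parser would
    produce. Returns None when this check has no objection.
    """
    has_description = any(params.get(name) for name in DATASET_PARAMS)
    if has_description and not accepts_dataset_description:
        return 400
    return None

rejected = status_for_dataset_request(
    {"query": ["ASK {}"], "default-graph-uri": ["http://example.org/g1"]},
    accepts_dataset_description=False,
)
allowed = status_for_dataset_request(
    {"query": ["ASK {}"]}, accepts_dataset_description=False)
```
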

> == 2.1.5, 2.1.6
>
> Some general statement to the effect of "do the right thing by HTTP"
>
> e.g. somewhere
>
> """
> The SPARQL 1.1 Protocol is built on top of HTTP. All HTTP requirements
> for requests and responses MUST be followed.
> """

I like this and have included it at the start of section 2.

> Add CSV TSV
>
> Add Turtle (link to submission :-() and N-triples.

ok, TODO.

> === 2.1.7
>
> "should be" unemphasised.

Got it.

> This was tricky last time. 403 is a reasonable response as well.
>
> Maybe
>
> "The HTTP response code for an unsuccessful query operation should be:"
> ==>
> "The HTTP response codes applicable to an unsuccessful query operation
> include:"

Got it.

> 400 is also the right code for e.g. a query with FROM when the processor
> only accepts queries against the implicit dataset. It's a client error.
> It's not just bad syntax.

Yeah? Isn't 400 for "malformed" requests?

> == 2.2 Update
>
> """
> The update operation is used to send a SPARQL update
> request to a service and receive the results of the request.
> """
> There are no results. Delete from "and".

Well, the results are success or failure, but your point is well taken 
and the text removed.

> Why do we have default-graph-uri and named-graph-uri?
> What do they do?

This is addressed in your other message, and I'll reply there.

> ** ==> USING
> This is new.

Eh?

> == 2.2.1
> See 2.1.2
> Content header MUST be ....
>
> == 2.2.2
> MUST set the content type.

Right, got that everywhere.

> "as query string parameters"
> This is update, not query.
>

Yes, but the part of the URI after the ? is still the "query string". 
It's confusing, but I'm pretty sure it's proper terminology? I've 
changed it to

"as URI query parameters"
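
To illustrate the terminology, a sketch of an update sent directly by POST: the update text is the body, while the graph parameters travel in the query component of the request URI. The endpoint is hypothetical, `application/sparql-update` is assumed as the media type, and `default-graph-uri` is simply the parameter name discussed in this thread for the current draft.

```python
from urllib.parse import urlencode
from urllib.request import Request

def build_update_post(endpoint, update, default_graphs=()):
    """Build (but do not send) an update POST with graph IRIs as URI query parameters."""
    params = [("default-graph-uri", g) for g in default_graphs]
    # The part after "?" is the URI's query component, even for an update.
    uri = endpoint + ("?" + urlencode(params) if params else "")
    return Request(uri, data=update.encode("utf-8"),
                   headers={"Content-Type": "application/sparql-update"},
                   method="POST")

# Hypothetical endpoint and graph IRI, for illustration only.
request = build_update_post(
    "http://example.org/update",
    "CLEAR DEFAULT",
    default_graphs=["http://example.org/g1"],
)
```
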

> == 2.2.3
> Update isn't like query because an update is multiple operations and
> also updates are about state change.
>
> See overall comment.

Separate email for this.

> == 2.2.4, 2.2.5
> See 2.1.6, 2.1.7
>
> Stronger text about the response to a successful update operation being
> (by spec) empty. There is no response to an update operation defined in
> the update language. I'm worried that leaving it implementation defined
> might lead to expectations of query responses from an update.
>
> This is a spec - just don't say anything about implementation defined
> features.

The text that's there now is specifically in response to comments & 
discussion on the mailing list. As normative spec text, this text 
doesn't change anything, but it does set some expectations in a way that 
I think matches the group's intention, at least from the last discussion 
spawned from the -comments mailing list.

If the group prefers that the spec says that response bodies SHOULD be 
empty, we can do that. I don't think it should say that response bodies 
MUST be empty, as that goes against what some implementations do and 
what multiple community comments have requested.

> == 2.3
>
> This section does not seem to come to a conclusion. Could it have a list
> of where the base URI comes from in SPARQL terms?
>
> So we have
>
> 1/ current BASE in query or update (5.1.1, 5.1.2 depending on how you
> look at it)
> 2/ Service endpoint (5.1.3)
>
> ---------
>
> I think 5.1.3 does apply - it's the request URI. "retrieve" is used
> because the RFC text is slanted towards GET and also POST into
> containers. Then 5.1.4 never applies - it covers non-protocol cases.
>
> (The request URI is often unhelpful because it includes the query string
> but that is what the std says unfortunately.)

2.3 as it stands now is unchanged from SPARQL 1.0 protocol. I don't 
really mind considering updating it to be clearer, but I'd like to 
finish the rest of the work before returning to this.
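
For reference, the two-step precedence Andy lists could be sketched as below. This is an illustration rather than proposed spec text: the function name is hypothetical, and the BASE value is assumed to be an absolute IRI already extracted from the query or update.

```python
from urllib.parse import urljoin

def resolve_iri(relative_iri, request_uri, base_in_query=None):
    """Resolve a relative IRI using BASE if given, otherwise the request URI."""
    # 1/ BASE declared in the query or update takes precedence.
    # 2/ Otherwise fall back to the request URI (the service endpoint).
    base = base_in_query if base_in_query is not None else request_uri
    return urljoin(base, relative_iri)

# The request URI includes the query string, which resolution discards.
from_endpoint = resolve_iri("data/g1", "http://example.org/sparql?query=x")
from_base = resolve_iri("data/g1", "http://example.org/sparql?query=x",
                        base_in_query="http://example.com/base/")
```
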

Thanks again, Andy. Please feel free to snip out any points of agreement 
in any responses.

Lee

>
> Andy
>
>
Received on Saturday, 6 August 2011 20:49:26 UTC