Re: Review of "SPARQL 1.1 RDF Dataset HTTP Protocol"

(I mistakenly sent this to Andy alone)
Hey Andy, thanks for the thorough review.  See my response below.

On Fri, Feb 4, 2011 at 4:05 AM, Andy Seaborne
<andy.seaborne@epimorphics.com> wrote:
>  1/ fixing the <h1>

I have fixed this.

>  2/ removing the red box (all issues must be addressed for LC)
>
> Other than that it is publishable as-is for LC - everything is covered so
> that it meets the LC criteria of "no outstanding issues".
>
> Ideally, at least the major points should be addressed. Of those, the most
> useful is a suggestion to add some examples of the protocol in action into
> section 3 (Protocol model), which also helps address the KK point about
> needs of implementers.
>
> Also, it would be nice to have an example of the protocol in each operation
> description, before the equivalent SPARQL Update.
>

> .. snip ..
> == General Points:
> [**] service or a graph store
> In
> http://example/x?graph=...
> is <http://example/x> a service or a graph store?

It is a service.  I recently added text to the second to last
paragraph in 4.2 to clarify this:

"http://www.example.com/rdf-graphs/services identifies the HTTP
listener that implements this protocol."

> In the final section, it is very definitely a service.  But (e.g.) section 3
> does not mention service and says "route native HTTP operations to a Graph
> Store" which, in the absensce of anything else, implies to me it's naming
> the graph store.

I have changed that sentence to: "The HTTP operations defined here use
URIs to route native HTTP operations to an implementation of this
protocol which responds by making appropriate modifications to the
underlying Graph Store".

Note, the text says *the* underlying Graph Store and not *an*
underlying Graph Store.  There is an unstated assumption that a
service manages a particular Graph Store otherwise you have the
problem of determining which Graph Store is affected by the HTTP
operations.  Is this consistent with how SPARQL 1.1 Update Services
are defined?  This is directly related to Leigh's questions in his
most recent comment:

[[[
..  However I'm not sure there's necessarily a 1:1 mapping between a
GraphStore and SPARQL endpoint [...] The service description document
indicates that a SPARQL endpoint may expose several datasets, so how
does a client identify the correct Service or Dataset in the
description.
]]] - http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2011Jan/0028.html

> [**]
> Is a Compliance section required by W3C process?  especially as we will have
> tests.

Perhaps Ivan and/or Sandro can answer this question but I recall that
this question came up in the GRDDL WG and in the end the
recommendation did not have a Compliance section despite having test
cases.

> == Title
>
> [***] The <h1> is wrong (the <title>is right).

I have changed this.

> == Section: Abstract.
>
> [*] Don't define the doc in terms of other documents.  Simply state what it
> is.
>
> Drop the first 2 sentences.
>
> "The protocol described here is meant to provide a minimal set of uniform,
> ..."
> =>
> "This document describes the use of HTTP verbs applied to managing a
> collection of graphs in the REST architectural style."

I changed that sentence to "This document describes the use of HTTP
operations for the purpose of managing a collection of graphs in the
REST architectural style."

> == Section 1
>
> [] When seeing a list like this I expect, somewhere, to see either text
> later tying back to it (e.g. conclusion) or expansion of the points to say
> how the doc meets the claims.

I have expanded the paragraph that follows with a couple of sentences
describing how the constraints are met.

> [] "It emphasises the distinction between" => "It emphasises the following
> distinct concepts"

This has been changed.

> [*] "*MUST* ....and the words appear as emphasized text, "
>
> The document does not use them in this way. They are not in bold later, and
> bold is reserved for HTTP Verbs.
> SmallCaps might be useful to emphasize.

I switched to italics rather than bold.

> [*] It's "SPARQL 1.1 Query", not "SPARQL 1.1/Query" and similarly for other
> docs.
> This occurs in other places as well (e.g. sec 8)

This has been changed.

> == Section 2:
> [*] "Resolvable URI": potentially has a representation.
> Might only be able to PUT/POST to it only so no representation.

This has been changed to "A URI whose resource potentially has one or
more representations available ..."

> [] Serialize (verb)
> Does not work for me.
>
> A document is a serialization of a graph (noun)
>
> An alternative use of the verb Serialize is the action of doing it, and
> that's done by the application code/CPU.
>
> "A document serializes graph." does not seem right to me if read as the
> document is carrying out the action.  In fact, the document avoids such a
> phrase and says:
>
> "document ... which serializes ... graph."
> "document ... that serializes ... graph."
> which isn't a simple subject-verb-object sentence.
> and "graph serialized by document" occurs as well.

Ok.  I added this term in response to Ian's comment about being
unfamiliar with using serialization and representation
interchangeably.  So, I was trying to spell out the use of the word
serialization in relation to a graph and an RDF document.  In my
opinion, there is enough precedent in using this term in the sense
that I meant both from the RDF/XML specification ("An RDF Document is
a serialization of an RDF Graph into a concrete syntax.") and the
original SPARQL specification ("the resource is represented by a [..]
a document that serializes a graph").  However, there seems to be a
significant amount of confusion regarding the idea that a graph URI
identifies a resource that is represented by a document that
serializes the corresponding graph (despite the fact that this idea is
part of the SPARQL 1.0 specification).

So, I will remove this term definition, however, I wouldn't be
surprised if this continues to remain a point of confusion.

> [**] Graph Store
> It's not necessarily a single service.
> e.g. SPARQL Update and SPARQL HTTP Dataset Management services on one graph
> store.

I have changed this to "A mutable repository of RDF graphs managed by
one or more services"

> [**] RDF Graph Content: "named graph" -> what about default graphs?

Default graphs don't have an associated graph URI, so there is no
means to identify their graph content.  I.e. in

GET http://example.com/service?graph=http://example.com/graphs/1

the request URI identifies the graph content of a named graph, however
if you extend the definition to include default graphs, then

GET http://example.com/service?default

The request URI identifies the graph content associated with a graph
that does not have a name, which is a bit odd.  However, I've changed
that definition to: "An information resource identified by the graph
IRI of a named graph and managed by a server that implements this
protocol or identified by an indirect operation on the default graph."

> [] "Semantic" - does not add anything - omit.

I have removed this definition.

> == Section 3 Protocol Model
>
> [**] Some text and some examples here of the protocol in action would
> address the KK comment about speaking to implementers

But there are examples of the protocol in action in later sections.
Are there specific kinds of examples you had in mind that are
perhaps not covered by the later examples?  This section just briefly
introduces the protocol model and the subsequent sections go into more
detail and provide examples.

> [**] "compliant"
> There is no compliance section in the document.

See my question above about whether compliance sections are required
for all W3C REC track documents.

> Section 4:
> [] "(intuitively)"
> Remove - does not add anything, but breaks the flow.

This has been removed.

> [*] "In this case " => "In this case, " (comma)
> to match style elsewhere in doc.

This has been changed.

> [*] "An HTTP request can route"
>
> Isn't it the server that does the routing of a request?
> At least "can be routed"

This has been changed to "In this way, the server would route
operations towards a named graph in an Graph Store via its Graph IRI."

> [*] "attempt to emphasise" => "The diagram below shows this ..." or even
> straight to
> "The diagram illustrates ..."

This has been changed

> [*] "most basic" - it's basic or it isn't.

This has been changed

> == Section 4.2
> [] First sentence is 4.2 lines long in my print out.  Reword.

I broke out the points in the sentence into a proper list.

> [**] "HTTP listener"
> Not sure about this. Is it an official term?  I could not find it with
> Google.

Changed to "HTTP service"

> [*]
> "(and default)" - remove () - "named or default graph"

This has been changes.

> [*] "is a a representation" => is a representation" (doubled "a")

This has been changes.

> == Section 5:
>
> [] "Where an equiv SPARQL Update op is given"
> It always is, except for HEAD.

That lead in sentence has been changed to "In places where an
equivalent SPARQL Update operation is given.."

> [*] Suggestion: Add example the HTTP operation before the SPARQL Update
> equivalent.  Speaks to KK comment about implementers.

In every case where there is a SPARQL Update equivalent, I have added
an HTTP example before hand.

> [*]
> "For requests that use HTTP verbs not listed here"
> Just don't say anything and leave to HTTP.
> Maybe some webdav verbs are meaningful. An extended implementation should
> still be compliant.
> And it conflicts in spirit of "additional details of appropriate behavior
> beyond those specified here"

That sentence has been removed.

> [] Make "Status Codes" discussion a section on it's own - it's worth a
> section header

Done.

> == Section 5.1 PUT
> [*] "be be" => "be"

Fixed.

> [**] "Whether the DROP is necessary"
> if empty graphs are recorded => content exists (the empty RDF graph).
> Suggest "DROP is needed to remove any previous RDF content"
> [**] Maybe a general discussion of empty graph in Sec 5 start.  Section on
> status codes, section on empty graphs.

I've changed that sentence to : "DROP is needed to remove any previous
RDF graph content. Developers should refer to [SPARQL-UPDATE] for the
specifics of how to handle empty graphs"

> == 5.5 POST
> [**] "incorporate" reads to me as an open process "RDF merge" is the
> specific operation.

This has been updated.

> [*] 201 Created.
> This is a different style from other status code which is (plain)"NNN
> (Common Text)" (I prefer the style here of <code>200 OK</code> FWIW)
> And later in sec 7 it has "401 status code (Unauthorized)

I have changed all the status codes to use the same <code> style

> [***] Red Box
> Address, remove or anything to get rid of redness by LC.
> Else it's not a LC.
> [*] "alternative behavior" of POST
> To me it's not "alternative", (what's it an alternative to?)
> It's a specific behaviour for something that REST leaves open to definition.

I've removed this box and incorporated the text.

> [] Text after red box.
> Make first part specific to POST

Fixed.

> [] "that serializes" see above.

Fixed.

> == Sec 5.4 HEAD
>
> [**] HEAD is metadata about the response and representation, not the RDF
> content.
>
> Terminology has "RDF graph content", and that's what the IRI names, not the
> representation.

RFC 2616 defines the HEAD method as "identical to GET except that the
server MUST NOT return a message-body in the response."  So, the
request does identify the RDF graph content.  I replaced the first
sentence with the one above:

"When used in this protocol, the HTTP HEAD method is identical to GET
except that the server MUST NOT return a message-body in the response.
It is meant to be used for testing [...]"

> [] "As described" - what? where?
> ? here or HTTP RFC2616?

This was removed.

> == Sec 5.5 PATCH
>
> [*] "doesn't modify a single graph"
> has two meanings
> 1 - request mentions only one graph or 2 - makes changes to only one graph,
> but attempts to change many with no visible effect.

I've changed that to ".. a PATCH request that makes changes to more
than one graph or the graph it modifies is not the one indicated"

> What about modifies no graph?
> e.g.
> DROP SILENT <doesnotexists1> ;
> does not modify(=change) a single (i.e. any) graph.

This should not be a valid (but redundant) PATCH operation

> Ditto INSERT DATA of
> already present triples.

So (with the changed text), this should not be a problem as long as
the INSERT DATA only modifies a single graph and it is the
same graph identified in the request URI.

> == Section 6:
> [*] I'm not sure what this section is adding. It seems to say "do the right
> HTTP thing" which would be better places at the front (sec 3 maybe) as a
> principle.  Then the section can be dropped.

I moved the following sentence to the end of section 5: "For example,
conditional requests that make use of headers such as
   If-Modified-Since that are intended to reduce unnecessary network
usage should be handled appropriately by implementations of this
protocol per [RFC2616]." So, now 5 Graph Management Operations serves
as an overall summary of implementation behavior and Section 6 has
been dropped.

> == Section 7:
> [*] First sentence:
> This should be about security issues, not that it should use security
> features. The security section is a requirement of REC docs, I think, and
> has some proforma requirements.

This section is meant to do both: discuss security issues ( DOS &
Injection attacks, etc. ) and specify that implementations should make
use of HTTP security features.  So far, there are no security issues I
know of that are specific to this protocol rather than to any HTTP
application in general.

> == Section 8:
> Interesting (to me) but unnecessary in the spec and a hostage to fortune.
> The doc works without it.  If its necessary, then it should be at the start
> anyway.
> [] OPTIONS only mentioned here

I'd prefer it comes after discussion of the relevant behavior of the
HTTP verbs in order to give proper context.  To this end, I have moved
the first part having to do with GET and http-range-14 to an
informative subsection at the end of 5.5 HTTP GET.  I moved the second
part related to the retrieval of service description documents to a
section 5.8 (since it involves another HTTP verb).

> == Section 9;
> [*]
>  [SPARQL] is used in sec 4.1 but not mentioned as a reference.

I've added the reference

> [*] Surely [SPARQL-UPDATE] is normative because it provide the meaning of
> PUT etc. specifically on a graph store.

This has been moved.

> [] Can we have links from the [RFC3986] etc to the reference entry please?

All such references  have been linked to the references section.

-- Chime

Received on Sunday, 6 February 2011 04:58:27 UTC