Re: Review of Pierre-Antoine's review of SPARQL 1.1 Graph Store HTTP Protocol from Antoine Zimmermann on 2011-08-30 (public-rdf-wg@w3.org from August 2011)

From: Antoine Zimmermann <antoine.zimmermann@gmail.com>
Date: Tue, 30 Aug 2011 19:31:52 +0200
To: antoine.zimmermann@insa-lyon.fr
CC: public-rdf-wg <public-rdf-wg@w3.org>
Message-ID: <4E5D1E88.9000209@insa-lyon.fr>
I forgot to mention that this completes ACTION-68: Review Pierre-A's
comments on SPARQL graph store update protocol that was looooong overdue.



Le 30/08/2011 18:15, Antoine Zimmermann a écrit :
> Overall, I agree with the comments. However, the review is written like
> a personal review, not as a group review.
> Besides, it would be good to put this review on the wiki. It would make
> minor corrections (typos) easier and would help make it a "community
> review". I provide detailed comments below.
>
>
>  > * General remark
>  >
>  > Several problems I had in reading this document come from the
>  > notion of identification, and the facts that:
>  >
>  > + graph IRIs in a graph store actually do not identify a graph
>  > (immutable abstraction), but rather a "slot" (the term comes from
>  > the SPARQL update document) or an "RDF graph content" in the
>  > terminology introduced by that document.
>  >
>  > + several notions of identifications are considered in different
>  > parts of the document without making the distinction explicit,
>  > while those notions do not always match:
>  >
>  > 1. identification in the graph store
>  > 2. identification in the HTTP protocol
>  > 3. identification/naming in the RDF semantics
>  >
>  > This document only refers to 3. once, and in my opinion should not
>  > (see below my remarks on 4.1).
>  >
>  > It refers implicitly to 1. and 2. in many places; sometimes the
>  > distinction between the two is not relevant, sometimes the context
>  > is enough to decide. Sometimes, however, the distinction should me
>  > made explicit, as suggested in my remarks below.
>
> should *be* made
>
>  >
>  >
>  > * 2. Terminology
>  > ** Resource: "a network-accessible data object"
>  >
>  > I know this definition comes from [RFC2616], but RDF uses
>  > "Resource" in a much broader sense...
>  >
>  > ** RDF document: "a serialization of an RDF Graph"
>  >
>  > The term "document" usually refers to a mutable, "living" thing,
>  > so a better term should probably be used here. Richard Cyganiak
>  > will make a proposal on behalf of the RDF-WG.
>
> We can directly put the suggested term.
>
>  >
>  > ** Graph Store: "managed by one or more services [SPARQL-UPDATE]"
>  >
>  > the definition seems to contradict the one in [SPARQL-UPDATE]
>  > ("one or more" vs. "a single") but still references it. Strange...
>  >
>  > ** RDF graph content
>  >
>  > the definition is strange: "an information identified by the URI
>  > of named graph"; then it should be named graph ?!...
>
> A named graph is a pair <name,graph>, while the information "identified"
> by the IRI is a single thing.
>
> Moreover, "an information identified" should be replaced by "an
> information resource identified".
>
>  >
>  > besides, what does "identified by an indirect *operation*" mean ?
>  >
>  > a better idea seems to define RDF graph contents as the
>  > "components" of a graph store (or "slots", as the SPARQL UPDATE
>  > document calls them?)
>  >
>  > * 4.1. Direct Graph Identification
>  >
>  > This section is a bit disturbing...
>  >
>  > In the first paragraph, "resource" and "graph" seem to be used as
>  > synonyms without it being explicit.
>  >
>  > In the third pargtaph, "... the most common usage of a Resource-URI
>  > is to identify a resource". Is there *any* other usage??
>
> third *paragraph*
>
>  >
>  > Then we read in the next paragraph that "we are not directly
>  > identifying an RDF graph". Why then is the section entitled "Direct
>  > Graph Identification"? The problem here is again about what Graph
>  > IRIs really identify.
>  >
>  > Then we read: "Intuitively, the set of interpetations that satisfy
>  > [RDF-MT] the RDF graph that the RDF document is a serialization of
>  > can be thought of as this RDF graph content." It is really not
>  > "intuitive" for me that a set of interpretations can be thought of
>  > as an RDF graph content (an information resource). This sentence
>  > tends to muddy the waters about the definition of RDF graph content.
>
> the set of *interpretations*
>
>  >
>  > I would rather say that the RDF graph is the current *state* of the
>  > RDF graph content (see g-box and g-snap in
>  > http://www.w3.org/2011/rdf-wg/wiki/Graph_Terminology).
>
> I'm not sure if I agree but this just shows that the notion of RDF graph
> content is not clear enough.
>
>  >
>  > This remark also applies to Figure 1.
>  >
>  > * 5.2 HTTP PUT
>  >
>  > + "[the URI] identifies the RDF payload". The RDF payload is an
>  > entity, not a a resource; it is *not* identified by a URI.
>
> "not a a resource"
>
> I do not understand what you mean. "entity", "resource", what are these
> things? Do you mean RDF Resource? "Resource" as in REST? Everything can
> be identified by a URI and everything is an RDF Resource.
>
>  >
>  > + In section 5.2. HTTP PUT, it is not clear whether the service is
>  > allowed to alter the RDF payload before storing it, which is
>  > common practice in the REST world.
>  >
>  > * 5.3 HTTP DELETE "overriden"
>  >
>  > The term "overridden" is used twice but never defined. It is not
>  > clear from the context what it means exactly.
>  >
>  > * 5.5.1 Ambiguity Regarding the Range of HTTP GET
>  >
>  > I find this section a little confusing; if I get it right, I would
>  > suggest to keep the first half of the first paragraph
>  > ("Historically"..."the response code returned.") followed something
>  > like:
>
> followed *by* something
>
>  >
>  > This protocol suggests that graph IRIs that are under the control
>  > of the service owner return a status code 200 OK, with an RDF
>  > payload serializing the RDF graph content identified by that graph
>  > IRI in the graph store. This amounts to aligning the
>  > identification relation (between the graph IRI and the graph
>  > content resource) in the graph store to the identification
>  > relation in the HTTP protocol, and this is consistent with the
>  > recommendations in [WEBARCH] as RDF graph contents are indeed
>  > information resources.
>  >
>  > This protocols also propose, in section 4.2, a way to build a
>
> This protocol also proposes
>
>  > dereferenceable URI from any graph IRI in the graph store, even
>  > those not under the control of the service owner or those that can
>  > not be made dereferenceable. Those new dereferenceable URIs can
>  > therefore be considered to identify, in the HTTP protocol, the
>  > corresponding RDF graph content in the graph store. In that case,
>  > the identification relation in the graph store is different from
>  > the identification relation in the HTTP protocol.
>  >
>  >
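To make the indirect case concrete, here is a minimal sketch of how a dereferenceable URI can be built from an arbitrary graph IRI. It assumes the `graph` query parameter convention of section 4.2 of the protocol draft; the service URL and the function names are hypothetical and only serve the illustration.

```python
from urllib.parse import parse_qs, quote, urlsplit


def indirect_graph_url(service_url: str, graph_iri: str) -> str:
    """Embed a graph IRI in the query string of the graph store service URL.

    The graph IRI is percent-encoded in full so that it survives as a
    single query parameter value.
    """
    return f"{service_url}?graph={quote(graph_iri, safe='')}"


def graph_iri_from_url(url: str) -> str:
    """Recover the graph IRI (graph store identification) from the
    dereferenceable URL (HTTP identification)."""
    return parse_qs(urlsplit(url).query)["graph"][0]


# Hypothetical service and graph IRI, not under the same authority.
url = indirect_graph_url(
    "http://example.com/rdf-graph-store",
    "http://www.example.org/other/graph",
)
print(url)
```

The two identifiers stay distinct: the URL that HTTP dereferences is the service URL with a query string, while the graph store still knows the content under the embedded graph IRI.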
>  > * Typos and other minor comments
>  > + Last paragraph of section 2 has two main verbs in a single
>  > sentence.
>
> and the sentence says "MUST interpet" (sic)
>
>  > + Parenthesis in the 1st paragraph of section 4.1 is grammatically
>  > inconsistent with the text ("by")
>  > + Figures can not be understood when printed in b&w, which is not
>  > good
>  > + Paragraph just before 5.5.1 is missing a word between "a" and
>  > "SHOULD"
>  > + Secrion 5.5.1 : recieve -> receive
>
> *Section* 5.5.1
>
>  > + Section 5.6 seems to be redundant with the HTTP RFC. If so, it
>  > should clearly refer to the RFC and be marked as informative, as
>  > it does not define anything new.
>  > + Section 5.7: a graph content can not be used as an RDF payload, as
>  > it is not an RDF *document*.
>  > + Section 5.7: a SPARQL UPDATE query can not be used as an RDF
>  > payload, as it is not RDF.
>
>
> Best,
Received on Tuesday, 30 August 2011 17:33:22 UTC