Re: "RDF Knowledge" (Uniform HTTP Protocol for Managing RDF Graphs) from Chimezie Ogbuji on 2011-02-07 (public-rdf-dawg-comments@w3.org from February 2011)

From: Chimezie Ogbuji <chimezie@gmail.com>
Date: Mon, 7 Feb 2011 13:07:01 -0500
To: Gregg Reynolds <dev@mobileink.com>
Cc: public-rdf-dawg-comments@w3.org
Message-ID: <AANLkTik=Lz-kQSG9P4-qbADG+z3eP-Qpr7yZcg0u0jDT@mail.gmail.com>
Hey Gregg.  See my (personal) response below.  This thread is beyond
the point where I can determine the specifics of issues that are in
scope for this document and the WG and that can be constructively
addressed, so I imagine there will be some WG discussion and a
subsequent (more formal) response later.

On Mon, Feb 7, 2011 at 9:35 AM, Gregg Reynolds <dev@mobileink.com> wrote:
>> have raised (presumably with the intent that they be addressed by
>> modification to the text)? Are you
>> saying that the text is not salvageable given its original aim, which
>> is to define how the HTTP protocol can be used natively (i.e., in a
>> manner consistent with the constraints of
>> REST) to modify the graphs in a graph store?

> If it only restates what is already defined elsewhere (e.g. RFC2616,
> RFC3986, WEBARCH, etc),
> then it seems likely that it will only end up adding
> to the confusion.  I've read it pretty carefully and I believe that is the
> case.

None of these define how to use HTTP to manage an RDF dataset.
Consider the Atom Pub analogy.  None of these specifications say
anything about how to manage Atom feeds, collections, etc., so an Atom
Pub protocol was needed that extended them in order to specify how
HTTP can be used to manage and access Atom content in a standard
manner.

>> There is no existing specification of how HTTP methods can be used to
>> manage RDF graphs in a manner that takes immediate advantage of the
>> semantics of the underlying
>> protocol.

> Sorry, I don't understand this.  What underlying protocol?

HTTP

>  What semantics?

The meaning of the components that comprise HTTP: identifiers,
resources, messages, etc.

>   I can't tell what you're talking about.  If you mean that HTTP is not
> sufficient to support a RESTful architecture for RDF resource services, then
> I respectfully but very strongly disagree.

I'm not sure how you concluded that from what I was saying, but let me
help clarify: that is *not* what I'm saying.  Again, I'm saying that
there is very little precedent that a developer who wants to expose
an RDF dataset over HTTP (for Create Read Update Delete operations)
can follow to do so.  That is what standards are for: to standardize
where there is a need and little consensus.

> But the more fundamental problem is that this kind of language - "manage rdf
> graphs", "managing a graph store" (section 3) - is incompatible with REST.
>  REST is about resources, not stores, not rdf, not graphs.

REST is about web resources, their representations, and provides a
conceptual architecture of these components and how the HTTP protocol
'manages' them.  I'm using the word manage in the normal English
sense, FYI.

This protocol discusses how the resources that RDF documents are
serializations of can be managed over HTTP (which is why there is a
distinction in the terminology between the graphs and the resources -
per your point later below).  I don't see where there is any
incompatibility.

> The document in
> various places makes unmerited assumptions about resources; for example,
> section 4: "In this way, an HTTP request can route operations towards a
> named graph in an [sic] Graph Store via its Graph IRI."

This is a typographical error that was identified in a recent review
and fixed to say: "The HTTP operations defined here use URIs to route
native HTTP operations to an implementation of this protocol which
responds by making appropriate modifications to the underlying Graph
Store"
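
As a rough illustration of that revised sentence, here is a minimal
Python sketch of an implementation that dispatches purely on the HTTP
method and the graph URI, and responds by modifying an in-memory store.
The GraphStore class, its handle method, the triples-as-tuples
representation, and the particular status codes are all assumptions made
for this sketch, not text from the draft; PUT is read as "replace" and
POST as "merge".

```python
from typing import Dict, FrozenSet, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class GraphStore:
    """Hypothetical in-memory store: graph IRI -> set of triples."""

    def __init__(self) -> None:
        self.graphs: Dict[str, Set[Triple]] = {}

    def handle(
        self,
        method: str,
        graph_iri: str,
        payload: Optional[Set[Triple]] = None,
    ) -> Tuple[int, FrozenSet[Triple]]:
        """Dispatch on the HTTP method alone; return (status, body)."""
        payload = payload or set()
        exists = graph_iri in self.graphs
        if method == "GET":
            if not exists:
                return 404, frozenset()
            return 200, frozenset(self.graphs[graph_iri])
        if method == "PUT":
            # Replace the graph wholesale (assumed reading of PUT).
            self.graphs[graph_iri] = set(payload)
            return (204 if exists else 201), frozenset()
        if method == "POST":
            # Merge the payload into the graph (assumed reading of POST).
            self.graphs.setdefault(graph_iri, set()).update(payload)
            return (204 if exists else 201), frozenset()
        if method == "DELETE":
            if not exists:
                return 404, frozenset()
            del self.graphs[graph_iri]
            return 204, frozenset()
        return 405, frozenset()
```

The point of the sketch is only that the action taken is determined by
the method and the URI, not by anything inside the message body.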

> In my view this
> language is deeply unRESTful.  Requests do not route; IRIs denote resources,
> not Graph Stores; there is no such thing as a "Graph IRI"; the kind of
> server process standing behind an IRI is beyond the scope of the RESTful
> interface.  The quoted sentence seems to be saying nothing more than that
> IRIs denote resources, and it is up to the server to decide what exactly
> that means.

You are reading too much into a typo.  Hopefully, the modification
makes that clear.

> Look at it from another perspective:  why do we not standardize a RESTful
> protocol for managing JPEGs?  Or Word docs?  Or any other kind of data?

See my Atom Pub analogy above.  By that line of reasoning, the Atom
Publishing Protocol (or any protocol that is conceived as an
HTTP-bound interface for managing a certain kind of content) is
unnecessary.

> A protocol for "managing graphs" is no different in principle than one for
> "managing JPEG images".  We don't standardize the latter; why do the former?

Because the standards underlying the semantic web primarily support
read, but not write, interfaces, there is a need for interfaces for
updating RDF content (hence the SPARQL Update language team
submission), and there is no consensus or standardization on how this
is done in general.  In particular, there is no precedent on how this
is done in the context of the architectural style of hypermedia
applications (REST).  JPEG images are mostly involved in read
operations; they are simple (standalone) digital content that is not
part of a larger data structure (such as Atom entries and Atom
collections on the one hand and RDF graphs and RDF datasets on the
other).  So, that analogy doesn't make sense to me.

>  Or: it makes no difference if there is a massive distributed database farm
> or a simple filesystem behind an IRI, and it doesn't matter what format is
> used to store the data behind the IRI; from the client perspective, it's
> just a resource, for which the server dispenses a token
> (serialization/representation).  At the very least, none of these standards
> docs should make reference to "Graph store", data store, data base, etc.
>  They should stick religiously to "resource" or "graph resource".

I don't follow the need to religiously (as you say) avoid terms
outside of the architecture of the web.  The reference to a graph
store is necessary because RDF graphs (which comprise the content that
such an interface is meant to manage) are part of an RDF dataset, but
RDF datasets are not mutable.  There was a previous thread in this
list on this topic (I don't have the link offhand) that discussed
this.

> Speaking only for myself, obviously, I've gone over the draft pretty
> carefully and I see nothing there that is not a restatement of existing
> standards.

Okay.  I've asked you to elaborate on this and (speaking just for
myself) I have yet to hear a response that is more specific than
this general assessment, and thus constructive as input into a
conversation to consider the issues and make modifications (if the WG
thinks such modifications are necessary).

> The one possible exception is the definition of a query syntax -
> ?graph= and ?default - but I'm not even convinced these need to be
> standardized.  Providers are free to define whatever query syntax they
> please for their services; that just means different IRIs.  You say ?graph=,
> I say ?g= - no big deal.  No different than controlling the path components
> of your IRIs.  Standardizing a query syntax is no different than
> standardizing a path for SPARQL endpoints; I don't think anybody would
> advocate declaring that  SPARQL endpoints must be at /sparql.   Plus we
> don't standardize a query syntax for SQL queries or requests for JPEGS, etc.
>  What makes RDF special?
>

I've responded to this point in a separate comment of yours.
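
For readers without the draft at hand, the query syntax under
discussion (?graph= and ?default) can be sketched as follows.  This is
only an illustration: the graph_request_url helper and the service URL
in the usage note are hypothetical names, and percent-encoding the
whole graph URI is an assumption about how it would be embedded.

```python
from typing import Optional
from urllib.parse import quote


def graph_request_url(service: str, graph_iri: Optional[str] = None) -> str:
    """Return service?default, or service?graph=<percent-encoded IRI>."""
    if graph_iri is None:
        # No graph IRI given: address the default graph.
        return f"{service}?default"
    # safe="" so that ':' and '/' inside the IRI are encoded as well.
    return f"{service}?graph={quote(graph_iri, safe='')}"
```

Calling graph_request_url("http://example.com/rdf-graph-store",
"http://copia.ogbuji.net/foaf/me.rdf") yields a request URI that names
the graph indirectly; omitting the second argument addresses the
default graph.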

>> The existing protocols (in a manner similar to SOAP
>> interfaces) use HTTP POST to dispatch operations where the actions
>> taken are defined by the content of the message rather than the
>> semantics of the protocol (which specifies how resources are
>> manipulated via the various methods: DELETE, GET, POST, etc.).
>
> Again, I don't know what you're referring to.  What "existing protocols"?

The SPARQL Protocol for RDF (the only currently recommended protocol
for access to RDF datasets)
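
The contrast between the two styles can be sketched in Python without
sending anything over the network.  The endpoint and graph URLs are
hypothetical.  The "update" form parameter and the DROP GRAPH statement
are borrowed from the SPARQL Update work as an assumption about what a
message-defined request might look like; they are not claimed to be
part of the SPARQL Protocol recommendation named above.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical URLs, used only for illustration; nothing is sent.
ENDPOINT = "http://example.com/sparql"
GRAPH = "http://example.com/rdf-graphs/employees"

# Message-defined style: always POST, the action lives in the body.
message_style = Request(
    ENDPOINT,
    data=urlencode({"update": f"DROP GRAPH <{GRAPH}>"}).encode("ascii"),
    headers={"Content-Type": "application/x-www-form-urlencoded"},
    method="POST",
)

# Method-defined style: the HTTP method itself names the action,
# and the request URI names the thing acted upon.
method_style = Request(GRAPH, method="DELETE")

for req in (message_style, method_style):
    print(req.get_method(), req.full_url, req.data)
```

In the first request an intermediary sees only an opaque POST; in the
second, the method and URI alone say what is being asked.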

>  Can you provide a specific example illustrating failure of HTTP (including
> standardized extensions) to support a RESTful API for RDF resources?  I
> can't think of one.

I think this question follows from your reading too much into the typo
above.  There is nothing in *this* protocol that is saying HTTP cannot
support a RESTful API for RDF resources.  In fact, it is saying the
opposite and attempting to standardize such an interface.

>> This protocol is meant to address this by defining a protocol that uses the
>> constraints of REST to define how RDF graphs can be manipulated
>> directly and natively in HTTP.
>
> I strongly oppose this.  HTTP is already defined; so is REST.  Anybody who
> wants to implement a REST architecture over HTTP to serve up RDF resources
> just has to do a little research to figure out how.

Standards are created to bring order to the various ways that
something (for which there is substantial need) can be done.  It is
precisely because some research would need to be done, and because
(in my opinion) even after doing that research multiple different
HTTP interfaces would result, that you would want to standardize
here.

> Furthermore, whether
> somebody wants to use a RESTful architecture or RPC or any other design to
> implement RDF services is none of W3C's business.
> Architectural patterns
> are not an appropriate area for standardization in my opinion.  Would
> anybody suggest publishing an MVC "standard"?

This is not advocating either, so I don't understand your point.

> ..snip..
>> The term 'RDF Graph content' (although it doesn't use the word
>> 'knowledge' which many found not helpful) does distinguish between the
>> syntax (or structure) of an RDF graph and its meaning (or content).
>
> Huh?  Not to my eye it doesn't.  First off, this sentence suggests an
> equivalence between syntax and structure.

Actually, it is trying to distinguish between

RDF document -> 'syntax'
RDF graph ->  'abstract syntax'
RDF graph content -> thing denoted by graph URI

> Syntax may have structure, and
> graphs may have structure, but these are clearly different things.
>  Sentences and other expressions have syntax; graphs do not.  If this isn't
> clear, consider sets.  An expression like {1, 2, 3} has syntax; the set it
> denotes has structure, but not syntax.

I'm sorry, but I don't follow your analogy, your use of the terms
'syntax' and 'structure', or your reasoning behind how you have come
to assume the protocol is suggesting an equivalence between them.

> Second, to be honest I don't know
> what "Graph content" is supposed to mean; it looks redundant to me.

See the list above.

> A graph is a graph; it has no "content".

Let me try yet again with another analogy.  Consider my FOAF graph.
There is the abstract syntax that we draw with the nodes and arcs.
Then there is the syntax of an RDF document which serializes that
graph topology.  Then there is the concept of my social network that
the graph 'denotes'.  The latter is the 'content'.

> Similarly, a set is a set; it has no
> "content".

foaf:Person a owl:Class

foaf:Person is (loosely) a set of things (the set of people) and the
set (as used in that vocabulary) denotes the concept of Homo sapiens.
This is basic RDF / OWL MT and underlies how RDF is meant to be a
machine understandable knowledge representation.  It seems we have
some disconnect between us on a very rudimentary level.

> This is not the same as saying it has no members.  The point is
> that you cannot postulate a third entity "content" that is distinct from the
> set and its members.  Well, you can, but it would come as a surprise to
> mathematics.

Not really.  In mathematical logic there are constants that denote
'things', sets of constants that denote categories of 'things'.  RDF
is similar, except the constants are the URIs and the things denoted
are the 'resources'.   The term 'RDF graph content' is using the same
rudimentary, mathematical logic mechanism but at the level of a graph
(and note that it is *not* doing it in any formal way but just as a
way to reconcile the protocol with how REST conceives of resources,
representations, identifiers, etc.)

> Maybe you can clarify what is intended by distinguishing between "graph" and
> "graph content".

I just did above and the protocol you are commenting on includes text
that does the same:

GET /rdf-graphs/employees HTTP/1.1
Host: example.com
Accept: application/rdf+xml

[[[
[...] we are not directly identifying an RDF graph [with
http://example.com/rdf-graphs/employees] but rather the RDF graph
content that is represented by an RDF document, which serializes that
graph. Intuitively, the set of interpretations that satisfy [RDF-MT]
the RDF graph serialized by the RDF document can be thought of as this
RDF graph content.
]]]

Consider two levels, first at the level of statements within an RDF graph

constant | what it denotes (or identifies)
---------+------------------------------------
URI      | resource
BNode    | resource
Literal  | typed, lexical string, number, etc.

At the level of a graph

constant  | what it denotes (or identifies)
----------+--------------------------------
graph URI | 'RDF graph content'

If the URL http://copia.ogbuji.net/foaf/me.rdf is the URL of a graph
describing my social network then that URL does not denote the RDF
graph but the concept of my social network.

>> This follows from the model theory of RDF, which provides a way for
>> RDF graphs to be interpreted and there is an understood (as with other
>
> Not quite.  It provides a way for expressions in an RDF language to be
> interpreted.  RDF graphs are the things that are denoted.

I don't understand what you are trying to say here.

> I think you may have misread my example.  More explicitly, if I declare "let
> x be the integer 3" (or "let x = 3"), I mean let the symbol x denote the
> same value as the symbol 3, namely the third integer.  Then it is clearly
> absurd to draw a distinction between "the third integer" and "what the
> symbol 'x' identifies".  I brought this up in my original post because
> language similar to this came up in the archives somewhere as "difference
> between the graph and what the IRI identifies".  If we're talking
> denotational semantics then it must be that an IRI that identifies a graph,
> identifies a graph.  Your language suggests that you want to make a
> distinction between graph and graph content.  Is that correct?

Yes, (and again) this distinction is not introduced by the draft you
are commenting on but by SPARQL 1.0.  It is only attempting to clarify
that extant distinction and reconcile it with the REST style.

> If so, as
> argued above, I don't think such a distinction is valid.

Then your issue is with SPARQL 1.0.  Thankfully, you clarified this:

> ..snip..
> I do take issue with that part of the SPARQL 1.0 spec, and anything else
> that uses this kind of language.  Details below.

> Indeed, in my reading this passage is incoherent, or at least irredeemably
> vague.  Also wrong.

Ok.

> An IRI identifies a resource; if that resource happens
> to be a graph, then it identifies the graph.

This seems like circular logic to me.

>   "Graph" here means
> mathematical object; it most definitely does not mean graph expression or
> syntax or representation or serialization of a resource.

I can't make sense of why you think this distinction between what the
graph URI identifies and the graph it is paired with in the RDF
dataset is (as you put it) 'vague' and 'incoherent'.  If you think
this (fundamental) understanding of the relationship between a graph,
its URI, and what the URI identifies is problematic then I suggest you
separate this substantive complaint / issue out from your comment on
this particular draft (which is not the source of the distinction).

> ..snip..
> I think one of the reasons texts about RDF tend to be hard for newcomers (I
> make that assumption; it was certainly my experience) is precisely that they
> don't make the meaning and implications of open world semantics clear and
> explicit.  So in my view not only is it in scope, it is in a way central to
> the whole endeavor.

I've read the expounded section that follows and still cannot follow
your arguments about open world semantics, what they have to do with
graph identifiers, or why you think this is relevant for the document
you are commenting on.


> .. snip ..
> Withdraw it on grounds that it just restates existing standards and thus
> amounts to more of a Guide than a standard;
> If it isn't withdrawn, tighten up the language to clearly and consistently
> distinguish between references to syntax and semantics, and eliminate
> language suggesting a third component to denotational semantics (e.g.
> eliminate the "graph" v. "graph content" distinction).

I'll leave this to the (formal) response.

-- Chime
Received on Monday, 7 February 2011 18:07:55 UTC