- From: Gregg Reynolds <dev@mobileink.com>
- Date: Mon, 7 Feb 2011 08:35:27 -0600
- To: Chimezie Ogbuji <chimezie@gmail.com>
- Cc: public-rdf-dawg-comments@w3.org
- Message-ID: <AANLkTimnQKsJO2vhDLic+edMSmughwjJdS6efC7Se+GZ@mail.gmail.com>
On Wed, Feb 2, 2011 at 10:12 AM, Chimezie Ogbuji <chimezie@gmail.com> wrote:

> Hello, Greg. Thanks for your comments, see the response(s) below (in
> context). Note that I no longer have access to the email account
> where I originally received the comment and so I wasn't able to
> compose a reply with your original email quoted but will try to do so
> by hand where necessary.

Hi Chimezie,

Thanks for your response; comments below. For reference my original post is
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2011Jan/0001.html

...

> On Wed, Jan 5, 2011, Gregg Reynolds wrote:
> > In summary: - In my view this document is unnecessary and possibly harmful.
>
> Can you elaborate how this document is harmful beyond the issues you

My text read "possibly harmful".

> have raised (presumably with the intent that they be addressed by
> modification to the text)? Are you
> saying that the text is not salvageable given its original aim, which
> is to define how the HTTP protocol can be used natively (i.e., in a
> manner consistent with the constraints of
> REST) to modify the graphs in a graph store?

If it only restates what is already defined elsewhere (e.g. RFC2616, RFC3986, WEBARCH, etc), then it seems likely that it will only end up adding to the confusion. I've read it pretty carefully and I believe that is the case.

> > I don't see what this document adds; or rather, I don't see what problem
> > it addresses that is not already addressed elsewhere.
>
> There is no existing specification of how HTTP methods can be used to
> manage RDF graphs in a manner that takes immediate advantage of the
> semantics of the underlying
> protocol.

Sorry, I don't understand this. What underlying protocol? What semantics? I can't tell what you're talking about. If you mean that HTTP is not sufficient to support a RESTful architecture for RDF resource services, then I respectfully but very strongly disagree.
But the more fundamental problem is that this kind of language - "manage rdf graphs", "managing a graph store" (section 3) - is incompatible with REST. REST is about resources, not stores, not rdf, not graphs. The document in various places makes unmerited assumptions about resources; for example, section 4: "In this way, an HTTP request can route operations towards a named graph in an [sic] Graph Store via its Graph IRI." In my view this language is deeply unRESTful. Requests do not route; IRIs denote resources, not Graph Stores; there is no such thing as a "Graph IRI"; the kind of server process standing behind an IRI is beyond the scope of the RESTful interface. The quoted sentence seems to be saying nothing more than that IRIs denote resources, and it is up to the server to decide what exactly that means. But standard language based on the RFCs is perfectly adequate for this.

Look at it from another perspective: why do we not standardize a RESTful protocol for managing JPEGs? Or Word docs? Or any other kind of data? A protocol for "managing graphs" is no different in principle than one for "managing JPEG images". We don't standardize the latter; why do the former?

Or: it makes no difference if there is a massive distributed database farm or a simple filesystem behind an IRI, and it doesn't matter what format is used to store the data behind the IRI; from the client perspective, it's just a resource, for which the server dispenses a token (serialization/representation).

At the very least, none of these standards docs should make reference to "Graph store", data store, data base, etc. They should stick religiously to "resource" or "graph resource".

Speaking only for myself, obviously, I've gone over the draft pretty carefully and I see nothing there that is not a restatement of existing standards. The one possible exception is the definition of a query syntax - ?graph= and ?default - but I'm not even convinced these need to be standardized.
Providers are free to define whatever query syntax they please for their services; that just means different IRIs. You say ?graph=, I say ?g= - no big deal. No different than controlling the path components of your IRIs. Standardizing a query syntax is no different than standardizing a path for SPARQL endpoints; I don't think anybody would advocate declaring that SPARQL endpoints must be at /sparql. Plus we don't standardize a query syntax for SQL queries or requests for JPEGs, etc. What makes RDF special?

> The existing protocols (in a manner similar to SOAP
> interfaces) use HTTP POST to dispatch operations where the actions
> taken are defined by the content of the message rather than the
> semantics of the protocol (which specifies how resources are
> manipulated via the various methods: DELETE, GET, POST, etc.).

Again, I don't know what you're referring to. What "existing protocols"? Can you provide a specific example illustrating failure of HTTP (including standardized extensions) to support a RESTful API for RDF resources? I can't think of one.

> This
> protocol is meant to address this by defining a protocol that uses the
> constraints of REST to define how RDF graphs can be manipulated
> directly and natively in HTTP.

I strongly oppose this. HTTP is already defined; so is REST. Anybody who wants to implement a REST architecture over HTTP to serve up RDF resources just has to do a little research to figure out how.

Furthermore, whether somebody wants to use a RESTful architecture or RPC or any other design to implement RDF services is none of W3C's business. Architectural patterns are not an appropriate area for standardization in my opinion. Would anybody suggest publishing an MVC "standard"? Overstandardization is not good.

> > "RDF knowledge". Please don't do this.
>
> After some discussion this term will not be used.

A grateful world breathes a sigh of relief. Thank you.

...
> > In my view the source of the problem is probably the notion that RDF
> > somehow represents or encodes or [pick your verb]s "knowledge", which in
> > turn is probably traceable to the notion that IRIs "identify resources".
> > Both ideas are fundamentally wrong in my view; RDF is about graphs, not
> > knowledge,
>
> The term 'RDF Graph content' (although it doesn't use the word
> 'knowledge' which many found not helpful) does distinguish between the
> syntax (or structure) of an RDF graph and its meaning (or content).

Huh? Not to my eye it doesn't.

First off, this sentence suggests an equivalence between syntax and structure. Syntax may have structure, and graphs may have structure, but these are clearly different things. Sentences and other expressions have syntax; graphs do not. If this isn't clear, consider sets. An expression like {1, 2, 3} has syntax; the set it denotes has structure, but not syntax.

Second, to be honest I don't know what "Graph content" is supposed to mean; it looks redundant to me. A graph is a graph; it has no "content". Similarly, a set is a set; it has no "content". This is not the same as saying it has no members. The point is that you cannot postulate a third entity "content" that is distinct from the set and its members. Well, you can, but it would come as a surprise to mathematics.

Maybe you can clarify what is intended by distinguishing between "graph" and "graph content". To me it looks just like "3" and "3 content" or "integer" and "integer content". In other words, adding "content" obfuscates the issue. I've done quite a bit of research into various ways of looking at semantics and I can't recall ever seeing anything corresponding to this usage of the term "content".

Just to be clear, I do not take "graph" to refer to syntax.

> This follows from the model theory of RDF, which provides a way for
> RDF graphs to be interpreted and there is an understood (as with other

Not quite.
It provides a way for expressions in an RDF language to be interpreted. RDF graphs are the things that are denoted. Pretty major difference.

> such formal languages such as in OWL for instance) distinction between
> the statements or sentences of a language and the meaning that they
> convey. Whether or not this distinction is problematic for RDF as a
> whole is not in scope for what this protocol intends to do and is
> probably better directed at the new RDF working group, perhaps.

One problem is that the language of the document is imprecise and therefore confusing, as the above sentences illustrate.

...

> > Now suppose I declare "let x be the integer 3". Will anyone think it
> > important to draw a distinction between "the integer and what x identifies"?
>
> This distinction is in fact very important to the use of model theory
> to specify (in a mathematical way) how the meaning of a formal
> language can be determined in order to facilitate machine
> understandability. For example, the RDF model theory does indeed
> distinguish between a URI reference and what it 'denotes':

I think you may have misread my example. More explicitly, if I declare "let x be the integer 3" (or "let x = 3"), I mean let the symbol x denote the same value as the symbol 3, namely the third integer. Then it is clearly absurd to draw a distinction between "the third integer" and "what the symbol 'x' identifies". I brought this up in my original post because language similar to this came up in the archives somewhere as "difference between the graph and what the IRI identifies". If we're talking denotational semantics then it must be that an IRI that identifies a graph, identifies a graph.

Your language suggests that you want to make a distinction between graph and graph content. Is that correct? If so, as argued above, I don't think such a distinction is valid. Denotational semantics is a binary affair: we have notation (syntax) and denotation (semantics).
Introducing a difference between graph (which is the denotatum of a graph expression) and graph content (which is ?) introduces a third element that has no basis in accepted theory as far as I know.

> The semantics treats all RDF names as expressions which denote. The
> things denoted are called 'resources',
>
> It is not clear from my reading your comment if your issue is with the
> RDF model theory or if it is with some liberties that have been taken
> in the document you are commenting on. Can you clarify?

My beef with the MT stuff is not with the theory or the MT doc (although I think it has major problems), but with the obscurity it introduces into the exposition of RDF. Model theory is arcane; 99% of the people who might be interested in RDF will have no idea what it's all about. I would much prefer a set of SPARQL standards docs that make no mention at all of MT. It just isn't necessary.

> > Compare: "in using x to refer to the integer 3 in this way, I am not
> > directly identifying the integer but rather the mathematical knowledge it
> > represents."
> > You'd be laughed out of town.
>
> The distinction you are quoting before your statement above (between
> the what the IRI of a graph in a dataset identifies and the graph
> itself) is part of the SPARQL 1.0 specification (see the end of 8.2.2
> Specifying Named Graphs). Do you take issue with that part of the
> SPARQL 1.0 specification or with something unique to the specification
> you are commenting on?

I do take issue with that part of the SPARQL 1.0 spec, and anything else that uses this kind of language. Details below.

> > An IRI that is used to name an RDF graph refers to that graph,
>
> This is not the case. Please refer to the section of the SPARQL 1.0
> referred to above, which states:
>
> The FROM NAMED syntax suggests that the IRI identifies the
> corresponding graph, but the relationship between an IRI
> and a graph in an RDF dataset is indirect.
> The IRI identifies a
> resource, and the resource is represented by a graph
> (or, more precisely: by a document that serializes a graph). For
> further details see [WEBARCH].

Indeed, in my reading this passage is incoherent, or at least irredeemably vague. Also wrong. An IRI identifies a resource; if that resource happens to be a graph, then it identifies the graph. "Graph" here means mathematical object; it most definitely does not mean graph expression or syntax or representation or serialization of a resource.

Frankly I'm at a loss to explain where language like this comes from. Is "graph in an RDF dataset" supposed to be special in some way? I can't see how. The basic idea behind denotational semantics is pretty simple, understandable by anybody capable of distinguishing between signifier and signified. If you have a collection of graphs and you name them, then the names, well they name the graphs. Why obfuscate such a simple concept?

> > To be honest I think the problem goes back to the use of model theory to
> > provide semantics.
>
> This suggests that your issues have more to do with the (semantic)
> foundations of RDF rather than this particular specification. Is this
> the case and if not can you elaborate on the distinction?

Not so much the foundations as the language - I mean the language (or metalanguage) of the standards texts. It needs to be tightened up considerably, and language from MT does not help.

> > I hasten to add that your text does touch on the critical issue, albeit
> > only in passing. That is the issue of open-world semantics.
>
> This issue is beyond the scope of the protocol which only attempts to
> define how HTTP can be used to manipulate RDF graphs. Unfortunately, I
> had some difficulty following your description of an 'open graph' or
> how it is relevant for the intention of this specification.
I think one of the reasons texts about RDF tend to be hard for newcomers (I make that assumption; it was certainly my experience) is precisely that they don't make the meaning and implications of open world semantics clear and explicit. So in my view not only is it in scope, it is in a way central to the whole endeavor.

A simple set-theoretic example will illustrate the point. Consider a statement like "S = {1, 2, 3}". Then the symbol 'S' denotes a set of three elements; the symbol '{1, 2, 3}' denotes the same set. Under closed semantics, we know that much about our symbols and their meanings. But we also have negative knowledge; we know, for example, that 4 is not an element of (the set denoted by) S. We know this because we get to use the Law of the Excluded Middle: 4 either is or is not a member; since we do not have positive knowledge that it is a member (no assertion), we get to infer the negative, that it is not a member.

However, under open-world semantics, we are not licensed to make such inferences. If asked, "Is 4 a member of the set S?", the best we can say is "I don't know". We don't get to infer the negative (not a member) from the absence of the positive (assertion of membership), or vice-versa.

Consequently, the term "denotation" is in a sense inappropriate for open-world semantics, since it must either have different meanings in open and closed semantics, or it must be unable to fully account for meaning in open-world semantics. If the symbol G denotes a graph, then what exactly does it denote? Well, we obviously have the asserted and inferred graphs, but we also have a third graph. I don't know what to call it, but it includes the asserted and inferred graphs, plus other triples that may have been asserted or inferred elsewhere as elements of the graph.

Note that the distinction between asserted and inferred is not the same as the distinction between extension and intension.
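To make the set-membership example above concrete, here is a minimal Python sketch. The names Answer, closed_world_member and open_world_member are illustrative assumptions only; nothing like them appears in the RDF or SPARQL documents, and explicit negative assertions are deliberately left out of the model.

```python
from enum import Enum


class Answer(Enum):
    """Three possible replies to the question 'is x a member of S?'."""
    YES = "yes"
    NO = "no"
    UNKNOWN = "I don't know"


def closed_world_member(x, asserted):
    """Closed world: no assertion of membership licenses the negative."""
    return Answer.YES if x in asserted else Answer.NO


def open_world_member(x, asserted):
    """Open world: no assertion of membership licenses no conclusion.

    This sketch has no way to assert non-membership, so the open-world
    reply is never Answer.NO.
    """
    return Answer.YES if x in asserted else Answer.UNKNOWN


# The asserted members from the statement "S = {1, 2, 3}".
S = frozenset({1, 2, 3})

if __name__ == "__main__":
    for candidate in (3, 4):
        print(candidate,
              closed_world_member(candidate, S).value,
              open_world_member(candidate, S).value)
```

Both functions agree on everything that has been asserted; they differ only on what may be concluded from silence, which is the whole of the difference described above.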
It might be tempting to think of an inferred graph as the intensional sense of the asserted graph; but unlike intensional senses, inferred graphs are constructable. They can be computed from the asserted graph plus a set of rules. That is not the case with intensions, which are not computable. (I can't say nobody has ever come up with a notion of intension that involves computability, but I've never seen such a beast.)

Needless to say this complicates the picture of RDF semantics, and it's pretty hard to come up with clear language explaining this stuff. But I think any standard that addresses the meanings of RDF (graph) expressions should address it explicitly. Unfortunately it's quite a thorny problem; in fact I think it might be the same problem as providing a functional semantics for IO. Think of log files or database tables, whose semantics must be open, since they vary over time. With RDF by contrast, the problem is not variance over time but incomplete knowledge - this is the one place where the concept of "knowledge" (but not "RDF knowledge") is appropriate.

To summarize regarding the Uniform HTTP Protocol for Managing RDF Graphs, my suggestions are:

- Withdraw it on grounds that it just restates existing standards and thus amounts to more of a Guide than a standard;
- If it isn't withdrawn, tighten up the language to clearly and consistently distinguish between references to syntax and semantics, and eliminate language suggesting a third component to denotational semantics (e.g. eliminate the "graph" v. "graph content" distinction).

Sincerely,

Gregg
Received on Monday, 7 February 2011 14:36:00 UTC