- From: Kendall Clark <kendall@monkeyfist.com>
- Date: Thu, 9 Dec 2004 10:56:45 -0500
- To: "Seaborne, Andy" <andy.seaborne@hp.com>
- Cc: kendall@monkeyfist.com, public-rdf-dawg@w3.org
On Thu, Dec 09, 2004 at 02:56:32PM +0000, Seaborne, Andy wrote:

> Kendall,
>
> Good to see a new draft. Seems to me to be going in the right direction
> and could be published as a first WD as-is.

Thanks, Andy. I think I'll be ready to pub it as soon as I've thought
carefully about this pile of comments you've sent -- and made text
changes where appropriate.

> == Language specification
>
> It would be good to be able to transport other query languages, existing
> and to come. The abstract syntax for RDFGraphQuery does not contain a
> specifier for the query language; guess/parsing may be insufficient (some
> RDQL is legal SPARQL but the effect of SELECT is different).

I have language in the doc about this, but ran out of steam (or just
forgot) to tweak the abstract stuff and to show an example. I intend to
show iTQL and Versa examples before publishing again. And, as you point
out, the abstract protocol stuff needs to be tweaked to contain this bit.

> I'd like to see a parameter in the abstract protocol with "lang=" in the
> HTTP binding.

Would you be as happy with "query-lang" so as not to be ambiguous with
lang=en/us? My preferred thing is to add a Sparql-QL-Type header (since we
want to identify query languages with URIs, putting them into a GET
parameter just means that the URIs are (1) that much harder to read; and
(2) that much longer, risking the tipping point of "too long for GET")...

  Sparql-QL-Type: http://www.w3.org/Submission/RDQL

is preferable, IMO, to

  GET /foo?query=...&query-lang=http%3A//www.w3.org/Submission/RDQL

(Generally, I don't see why people prefer GET params over headers, since
every HTTP tool I know of or use lets me set arbitrary headers easily
enough...)

Okay, another issue: should there be a default in the spec? That is, if
there's no query-lang bit (however it's serialized), should everyone
assume that the query is SPARQL?
I would prefer to set some kind of sensible default for several things,
including this, but I'd like to know what others think before writing
that language.

> It becomes the only globally defined parameter and would free up all
> other parameter names to be specific to the query language, not
> predefined by this doc.

Hmm...I'll chew on this a bit.

> (I think there are slight differences in "graph=" between the SPARQL
> query and the getGraph query.)

I suspect there are, but I'm curious which ones you see?

> then the 3rd party form (ask a service for a named graph) of getGraph is:
>
>   GET /qps?lang=getGraph&graph=...

Hmm, I still don't understand calling getGraph a *type* of query
language... Why not just specify in SPARQL that "SELECT *" means
"retrieve the graph"? I'd still want a protocol operation for "retrieve
graph", since that works orthogonally to any query language *type*.

> while the 1st party version is still regular GET:
>
>   GET /3.rdf
>
> [*] except it should be a URI, not a short name. But it won't fit them.

We agree that the value of this query language type thing is a URI, yes?
I'd be willing to do the work of contacting people responsible for
various languages to see if they'd give us URIs to use to identify their
QLs for use with our protocol. If possible, I think as many of these as
possible should be enumerated in the specification. Makes client-writing
*way* easier and should help bootstrap interop. (I think 4 to 8 of them
cover 90% of the usage...)

> This shows most readily in responses but the general point is that the
> abstraction can't cover all the details of a concrete binding.

Yep -- the response stuff, as you point out, is kinda muddled in the
present draft. It's muddled because I was torn between doing two things:

  1. overloading existing HTTP response codes for use in our slightly
     different domain

  2. requiring responses have RDF graphs in their body (which is HTTP
     legal and even recommended, iirc), and letting those graphs carry
     the specialization information

I prefer (2), but the problem with it is the WG doesn't seem especially
interested in doing any vocabulary/schema work -- and there are some
tricky bits. I'm very willing to work on such a solution, if anyone else
is interested, though I'm not gonna hold my breath. :>

> I suggest just covering the SPARQL errors, showing how they map to HTTP
> response code and leave open that other HTTP response codes will occur
> and the same ones may occur for other reasons.

Yes, a good deal of work remains to be done in the response codes. I'm
gonna punt on all of that till after the next publication.

> For example, in HTTP, errors can be because of HTTP issues or because of
> SPARQL errors.

Is there any reason not to put SPARQL error information into RDF graphs
contained in the HTTP response bodies? I mean, any information other than
"we don't have time"?

> As it is correct to return 404 when the service just isn't there, what
> happens in the other cases?

An RDF graph in the body specializes the general response code. Or
something else...?

> Minor notes on response codes:
>
> + Need to include 414 (Request-URI Too Long)!

Hmm, how did I miss that one? Dumb. Will definitely add it since that's
the signal to the client to use the alternate method for conveying the
query. Good catch, Andy.

> + Not sure 202 (Accepted) makes sense as query is request-response. It
>   certainly doesn't seem more important than some of the ones not
>   mentioned.

A sign of my hubris. An early draft suggested an asynchronous response to
complex, long-running queries. But I chickened out, since that's outside
our brief. So, 202 is a leftover and should be dropped.

> This can be achieved efficiently in HTTP 1.1 by simply sending one
> request after another.
> The TCP connection is almost always open so that the overhead is just
> header parsing.

I thought long and hard about how to do "sessions" -- convey in one HTTP
transaction multiple queries where variables may or may not be shared
across them... I think Algae does this, but I couldn't think of a clean
way to do it since you only get one response code in HTTP, and that makes
the response code issues you raised above even *more* complex.

One thing to do is apply the HTTP response type to the req-resp
transaction, and define SPARQL faults and error representations and make
them the representation of a faulty query request.

> Given it is possible in HTTP 1.1, I don't see the need to add another
> layer that can also do multiple queries per request. I would be
> convinced by a use case as to what capability is enabled.

Even with an ideal use case, it's *hard*, so I'm willing to drop it.

> == HTTP issues
>
> Still need POST form for large queries. Just using query-uri= does not
> work when firewalls are involved.

As I mentioned earlier, I have notes for this and will get it into the
doc ASAP.

> == Misc
>
> What is the MIME type for N3? I found a quick survey in an IRC log which
> had more application/n3 than text/n3 but significant amounts of both. I
> found text/rdf+n3 from W3C yesterday.

I guessed! The N3 folks should sort this out, IMO. I try to avoid MIME
fights. MIME is horribly broken, IMO (witness the compound document
fiasco), and should be replaced by RDF or something useful.

> == HTTP Examples
>
> What happens when there is no Accept: header? I prefer this to mean:
>
>   application/xml;application/rdf+xml,q=0.9
>
> so a SELECT returns XML by default.

Agreed. But the interaction between SPARQL query types and con-neg should
be expressed directly in a table or something, as well as in examples.
Examples are too often misinterpreted. I think we had an email exchange
where this all got spelled out, so I'll find and use that for a first
draft.
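
To make the con-neg default above concrete, here is a minimal Python sketch of how a server might pick a response media type from the query form and the Accept header. Everything in it is an assumption for illustration, not text from the draft: the names `DEFAULT_ACCEPT`, `PRODUCIBLE`, `parse_accept` and `choose_media_type` are hypothetical, the default is one reading of Andy's suggestion written in regular Accept syntax, and the table of which query form yields which media type is a guess at what such a table could contain.

```python
# Hypothetical sketch: choose a response media type from the query form and
# the Accept header. All names and table contents are assumptions.

# Assumed default when the client sends no Accept header:
# XML first, RDF/XML as a lower-preference fallback.
DEFAULT_ACCEPT = "application/xml, application/rdf+xml;q=0.9"

# Assumed table: media types each query form can produce, best first.
PRODUCIBLE = {
    "SELECT": ["application/xml"],
    "CONSTRUCT": ["application/rdf+xml"],
    "DESCRIBE": ["application/rdf+xml"],
}


def parse_accept(header):
    """Return (media_type, q) pairs sorted by descending q.

    Simplified on purpose: only the q parameter is read, and only exact
    media types or "*/*" are understood by the caller.
    """
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((parts[0], q))
    # sorted() is stable, so equal q-values keep the client's order.
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_media_type(query_form, accept=None):
    """Return the media type to send, or None if nothing is acceptable."""
    producible = PRODUCIBLE[query_form.upper()]
    for media_type, q in parse_accept(accept or DEFAULT_ACCEPT):
        if q <= 0:
            continue
        if media_type == "*/*":
            return producible[0]
        if media_type in producible:
            return media_type
    return None  # in HTTP terms this would be a 406 (Not Acceptable)
```

With these assumptions, a SELECT sent without an Accept header resolves to `application/xml`, and a CONSTRUCT falls through to `application/rdf+xml`. A table in the spec would play the role of `PRODUCIBLE` here.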
> Interactions: Do SPARQL-Distinct, SPARQL-Limit have the same meaning as
> in query language?

Yes.

> What about interactions with HTTP mechanisms. I suggest leaving these
> out and avoiding interaction with concrete protocol mechanisms.

HTTP has headers called "Sparql-Distinct" and "Sparql-Limit"? What
interaction with HTTP could there be otherwise?

> SPARQL queries::
>
> ex 1.2 query:
> What if SPARQL-Distinct, SPARQL-Limit don't apply. Is it an error? I
> suggest ignoring them.

Don't apply because not supported by the server in question? Or for some
other reason?

> Resources reference things not in the file - intended?
>
> ex 1.3 query:
> What is the semantics of one query, 3 graphs?

I'm not sure because I've lost track of how or whether the query language
is doing queries against the merge of n graphs or how or whether we're
allowed to convey one query to be executed against n graphs distinctly. I
intended 1.3 to be an example of the latter.

> I'd guess it's three separate answers which suggests requests (and 3
> response codes) on a single connection. The second can be sent
> immediately, not waiting for the first.

I thought one multipart/mime response, with each part containing the
query results.

> ex 1.4 query:
> Same comment about using HTTP one request-one response mode.

I'm not sure I'm ready to quit on this yet. Why not put the response
faults into the mime parts for each query? That way in HTTP the response
code applies only to the request-response cycle, and the faults, errors
or successes of multiple *queries* are represented in the mime parts?

The use case I'm thinking of is my cell phone as a SemWeb client. It
wants to query the network for things, and it wants to do that as
efficiently as possible. If no one else cares about this, we can drop it.

> Can we have multiple queries against multiple graphs? N*M queries or one
> query per graph.

I couldn't decide how or what to say about that.
But that being tricky doesn't seem a reason to disallow the other forms
per se.

> GetGraph::
>
> Is it the presence of a "query=" parameter that distinguishes getGraph
> from a SPARQL query? A lang= would make this explicit and would.

That's one way to distinguish them concretely in HTTP. My problem is that
conceptually "retrieve a graph" isn't a query language type. At least,
that doesn't make any sense to me. It makes sense to say that "retrieve a
graph or graph(s)" is a protocol operation.

> ex 2.3 multipart/related?
>
> RFC 2387 says:
>   The Multipart/Related media type is intended for compound objects
>   consisting of several inter-related body parts.
>
> I don't see them as inter-related except that they are in the data for
> the same response.

A typo. I had a hellish 3 hrs trying to get a Python library to generate
multipart MIME bodies and just punted in the end.

> I intend to have more time.

Thanks for the implementation report, Andy. Very useful. I'll try to get
out a new draft, responding to many of the things in this message, by
late Friday my time.

Thanks, again.

Kendall Clark
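
As an illustration of the idea raised in this message (one multipart response, with each part carrying either the results or the fault for one query), here is a minimal sketch using Python's standard `email.mime` classes. It is not taken from the draft. The function name `build_multipart_response`, the `X-Query-Status` part header, the `query-N` Content-ID scheme, and the choice of `text/xml` and `text/plain` for the parts are all hypothetical. It uses `multipart/mixed` because, as noted above, the parts are not inter-related in the RFC 2387 sense.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_multipart_response(results):
    """Build one multipart body from per-query outcomes.

    results: list of (ok, body) pairs, one per query, in request order.
    ok is True for a successful query (body is the result document) and
    False for a fault (body is a human-readable fault description).
    """
    # multipart/mixed: independent parts that merely share one response.
    container = MIMEMultipart("mixed")
    for index, (ok, body) in enumerate(results, start=1):
        # Assumed media types: text/xml for results, text/plain for faults.
        part = MIMEText(body, "xml" if ok else "plain", "utf-8")
        # Hypothetical headers tying each part back to its query.
        part["Content-ID"] = "<query-%d>" % index
        part["X-Query-Status"] = "ok" if ok else "fault"
        container.attach(part)
    return container
```

Calling `build_multipart_response([...]).as_string()` yields the serialized body. Under this scheme the single HTTP status code describes only the request-response cycle, and a client reads each part's status header to learn how the individual queries fared.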
Received on Thursday, 9 December 2004 15:58:10 UTC