- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Thu, 09 Dec 2004 14:56:32 +0000
- To: kendall@monkeyfist.com
- CC: public-rdf-dawg@w3.org
Kendall,

Good to see a new draft. Seems to me to be going in the right direction and could be published as a first WD as-is.

	Andy

== Language specification

It would be good to be able to transport other query languages, existing and to come. The abstract syntax for RDFGraphQuery does not contain a specifier for the query language; guessing by parsing may be insufficient (some RDQL is legal SPARQL but the effect of SELECT is different).

I'd like to see a parameter in the abstract protocol, with "lang=" in the HTTP binding. It becomes the only globally defined parameter and would free up all other parameter names to be specific to the query language, not predefined by this doc. (I think there are slight differences in "graph=" between the SPARQL query and the getGraph query.)

With a lang= parameter, the requests would look like [*] this:

  GET /qps?lang=sparql&graph=...&query=...

then the 3rd party form (ask a service for a named graph) of getGraph is:

  GET /qps?lang=getGraph&graph=...

while the 1st party version is still a regular GET:

  GET /3.rdf

[*] except it should be a URI, not a short name. But it won't fit then.

== Layering

The abstract protocol tries to be prescriptive (it reads that way) but this isn't a good approach when details of the concrete protocol show themselves. This shows most readily in responses, but the general point is that the abstraction can't cover all the details of a concrete binding.

I suggest just covering the SPARQL errors, showing how they map to HTTP response codes, and leaving open that other HTTP response codes will occur and that the same ones may occur for other reasons. For example, in HTTP, errors can be because of HTTP issues or because of SPARQL errors.

The general HTTP response codes should be non-normative: others can occur, like 502 (Bad Gateway) (and other 5xx codes) and 302 (moved temporarily - old style). These are all lower level issues and the application will have to deal with them as it sees fit.
(http://sparql.org/query.html generates 502 if you are too quick after a service restart - there is an Apache reverse proxy in front of the query server.)

The meaning and reuse of response codes can also be tricky. Example: 404 - What's not found? The model? One of the named graphs? The service? The HTTP spec says "The server has not found anything matching the Request-URI." and conventionally, if the service is there but a parameter is wrong (like graph=), there would not be, from HTTP's point of view, a 404 error.

[Just found the great text "This status code is commonly used when the server does not wish to reveal exactly why the request has been refused, or when no other response is applicable."]

As it is correct to return 404 when the service just isn't there, what happens in the other cases? I think there is a confusion of levels going on between the abstract service model and the deployment environment. The abstraction cannot define all situations that arise that are particular to a given deployment environment.

Minor notes on response codes:

+ Need to include 414 (Request-URI Too Long)!
+ Not sure 202 (Accepted) makes sense as query is request-response. It certainly doesn't seem more important than some of the ones not mentioned.

== Multiple Operations per Request

This can be achieved efficiently in HTTP 1.1 by simply sending one request after another. The TCP connection is almost always kept open, so the overhead is just header parsing.

The advantage of one query per request is that response codes are clearly associated with the actions. One query - one response. Otherwise, some queries work and others don't, so each needs a response somehow.

Given it is possible in HTTP 1.1, I don't see the need to add another layer that can also do multiple queries per request. I would be convinced by a use case showing what capability is enabled.

== HTTP issues

Still need a POST form for large queries. Just using query-uri= does not work when firewalls are involved.
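
To make the lang= and POST points concrete, here is a minimal Python sketch of how a client might build one request. It is only an illustration under assumptions: the function name build_request, the 2000-character cut-off and the form-encoded POST body are all made up for the example and are not taken from the draft (Joseki, for instance, POSTs an RDF graph instead). The short names for lang= follow the examples above, although a URI would be the better value.

```python
from urllib.parse import urlencode

# Assumption: an illustrative cut-off only; real limits vary by server and proxy.
MAX_URI_LENGTH = 2000


def build_request(endpoint, lang, params, max_uri_length=MAX_URI_LENGTH):
    """Return (method, url, body) for a single query request (hypothetical helper).

    lang is the only globally defined parameter; everything in params
    (for example query= or graph=) is specific to the chosen language.
    """
    encoded = urlencode([("lang", lang), *params.items()])
    url = endpoint + "?" + encoded
    if len(url) <= max_uri_length:
        # Small enough: an ordinary GET, one query and one response.
        return ("GET", url, None)
    # Too long for a Request-URI (the 414 case): send the same parameters
    # in the body of a POST instead. The form encoding is an assumption.
    return ("POST", endpoint, encoded.encode("ascii"))


if __name__ == "__main__":
    # A SPARQL query and the 3rd party getGraph form, as in the examples above.
    print(build_request("http://example.org/qps", "sparql",
                        {"query": "SELECT ?x WHERE { ?x ?p ?o }"}))
    print(build_request("http://example.org/qps", "getGraph",
                        {"graph": "http://example.org/3.rdf"}))
```

Because each call yields exactly one request, the response code that comes back can only refer to that one query, which is the property argued for under "Multiple Operations per Request".
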
== Misc

What is the MIME type for N3? I found a quick survey in an IRC log which had more application/n3 than text/n3 but significant amounts of both. I found text/rdf+n3 from W3C yesterday.

== HTTP Examples

What happens when there is no Accept: header? I prefer this to mean:

  application/xml, application/rdf+xml;q=0.9

so a SELECT returns XML by default.

Interactions: Do SPARQL-Distinct and SPARQL-Limit have the same meaning as in the query language? What about interactions with HTTP mechanisms? I suggest leaving these out and avoiding interaction with concrete protocol mechanisms.

There are going to be interactions between graph= and FROM/GRAPH/SOURCE.

SPARQL queries::

ex 1.2 query: What if SPARQL-Distinct, SPARQL-Limit don't apply? Is it an error? I suggest ignoring them. Resources reference things not in the file - intended?

ex 1.3 query: What are the semantics of one query, 3 graphs? I'd guess it's three separate answers, which suggests 3 requests (and 3 response codes) on a single connection. The second can be sent immediately, not waiting for the first.

ex 1.4 query: Same comment about using HTTP one request-one response mode. Can we have multiple queries against multiple graphs? N*M queries or one query per graph.

GetGraph::

Is it the presence of a "query=" parameter that distinguishes getGraph from a SPARQL query? A lang= would make this explicit.

ex 2.3 multipart/related? RFC 2387 says: "The Multipart/Related media type is intended for compound objects consisting of several inter-related body parts." I don't see them as inter-related except that they are in the data for the same response.

== Implementation Experience

http://www.sparql.org/query.html is an HTML form front end to a service at http://www.sparql.org/books. It is built on ARQ and Joseki3, using Jetty as a servlet container.

Joseki3 is a bit rough in places because I expect to need to make changes as the protocol emerges.
This mainly affects the configuration file that controls how the service runs, and that is very user-visible.

Joseki uses an RDF config file (N3 usually); it supports multiple query languages and each can have its own parameters. Queries come over GET or POST (in an RDF graph with a large literal for the query string). The supported parameters are lang= and query=.

Queries are against a single fixed graph; graph= (single or multiple) and query-uri= are not handled. I'm not keen on loading arbitrary web resources into a general service processor, so I see that as an optional feature and would make it configurable (default off).

All four result forms are handled, including XML and RDF/XML results for SELECT. Content negotiation is done (one hack - it snoops to see if a browser is asking by trying to see if "text" is requested - if so, you get text/plain and N3 so it displays without kicking off a helper app). The MIME type used for N3 is application/n3.

I intend to make it follow the SPARQL specification exactly and have an "exact" mode. I will also make it possible for the deployer to restrict features.

I intend to make it more service-like when the relationship of protocol parameters and query language features is clearer. The biggest outstanding issue is FROM/GRAPH and "graph=" (because I don't see the multiple requests as being the best way to do it).

I intend to do a SOAP interface. The main issue I can see is keeping the abstraction of the query engine yet handling RDF/XML & XML results cleanly.

I intend to have more time.

Kendall Clark wrote:
> Folks,
>
> Please find
>
> DRAFT: $Id: protocol-wd.html,v 1.8 2004/12/06 19:22:10 k Exp $
>
> at
>
> http://monkeyfist.com/kendall/sparql-protocol-simplex/
>
> Notable changes include more excision of unnecessary parts and some
> major reorganizations of the remaining bits. The result is much
> shorter and simpler. I also renamed HTTP query parameters: "q" ->
> "query"; "g" -> "graph"; "q-uri" -> "query-uri".
>
> There's still plenty of substantive work to be done in sorting out
> details, but I'd like to find out whether this is likely to become a WG
> working draft before doing much more of that.
>
> Kendall Clark
>
Received on Thursday, 9 December 2004 14:57:13 UTC