- From: Jeen Broekstra <jeen@aduna.biz>
- Date: Wed, 18 Jan 2006 17:01:47 +0100
- To: andy.seaborne@hp.com
- CC: Kendall Clark <kendall@monkeyfist.com>, dawg mailing list <public-rdf-dawg@w3.org>
Seaborne, Andy wrote:
> Kendall Clark wrote:
>> Folks,
>>
>> Mark Baker suggests [1] that we should add a SHOULD requirement that
>> queryHttpPost binding should be used "where the cost of processing
>> the query may be prohibitive". I don't really agree with this, since
>> there's no way to know statically which are the expensive and which are
>> the cheap queries. Even very sophisticated query analysis can't tell
>> you which RDF datasets are expensive to assemble.
>
> Very true. It's not just the query that determines whether it will be
> expensive - it's the dataset as well (and the server load).
>
> [I confess I don't see why POST is better than GET for expensive
> operations except that timeouts are not at the mercy of caches as well.]

If I understand correctly, the main argument is that it guards against potentially expensive "flippant" requests (e.g. by automated clients such as robots and spiders) because such agents only use GET requests (and since such agents are typically not aware of what to put in a POST request). So you guard your service against being bombarded with expensive requests by such spiders and robots by 'hiding' these requests and only exposing them through POST.

I don't consider that a very compelling reason: it sounds more like a hack than a solution that an official protocol should recommend. Furthermore it is doubtful IMHO that any (automated) client that does not know SPARQL will generate very expensive requests, and the ones that do know SPARQL will not be blocked because they will also know how to do POST requests.

>> And, further, I don't know of any way to programmatically redirect
>> expensive GETs to POSTs (you can send a Location: header to the POST
>> endpoint, if it's different from the GET endpoint, but I don't think
>> that *really* suffices; alternately, we could define a WSDL fault,
>> UsePost, but that seems an awful kludge), and I don't really see the
>> *point* of doing so either, since if the query is too expensive, it's
>> too expensive, whether it comes in via GET or POST.
>>
>> Mark retorts [2] that the "safety" of GET includes expensive
>> operations, citing some message from Roy Fielding, but I think the
>> message undercuts Mark's use of it, since it's very clearly about
>> implementations of services, not about the semantics of their
>> interfaces.
>>
>> Pat +1'd the proposal, but that was before further discussion, so I'm
>> not certain where he would be now. I'm opposed to the inclusion that
>> Baker suggests, for the reasons I've stated, but I will leave it to
>> the WG to decide.
>
> SHOULD language (meaning "carefully weigh the situation before choosing
> a different course") is acceptable if that reflects good web practice;
> not having the text on the grounds that you believe that there isn't
> anything sufficiently SPARQL related is also acceptable.

FWIW I don't consider using POST as some sort of 'back door' for potentially vulnerable-to-spiders-and-DOS-attacks resources good Web practice. (I actually think that using POST for queries _at all_ is not very elegant, but acceptable only because there is no other way to send long query strings to the server).

So I'd be in favor of not putting this in the spec.

By the way: if an implementation would still choose this approach (that is, blocking 'expensive' requests on GET but allowing them on POST), there is nothing in the current spec that really prohibits that, is there? We say 'should' not 'must'.

Jeen
Received on Wednesday, 18 January 2006 16:09:17 UTC