WSDL and POSTing SPARQL Update requests directly

The first topic we discussed on the protocol teleconference was the 
content of an HTTP request for the SPARQL Update Protocol. The consensus 
was as listed at 
http://www.w3.org/2009/sparql/wiki/Protocol#What_do_you_send_for_a_SPARQL_update_request.3F 
:

"""
PROPOSED: SPARQL Update requests can be POSTed as either 
application/x-www-form-urlencoded or application/sparql-update, pending 
discovery of how to do the latter in WSDL 2
"""

The first half is the analog of SPARQL query. In that case, you send an 
HTTP request like

POST /foo/bar/sparql HTTP/1.1
...
Content-type: application/x-www-form-urlencoded

default-graph-uri=...&named-graph-uri=...&named-graph-uri=...&update=<URL-encoded update request string>
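
As a rough illustration of the form-encoded variant, here is a minimal 
Python sketch that builds the request body. The function name, the 
example graph URIs, and the update string are all made up for 
illustration; only the parameter names (default-graph-uri, 
named-graph-uri, update) come from the proposal above.

```python
from urllib.parse import urlencode, parse_qs

def build_form_update_body(update, default_graphs=(), named_graphs=()):
    """Build an application/x-www-form-urlencoded body (hypothetical helper)."""
    # Repeated parameters are allowed, so use a list of pairs, not a dict.
    params = [("default-graph-uri", g) for g in default_graphs]
    params += [("named-graph-uri", g) for g in named_graphs]
    params.append(("update", update))
    return urlencode(params)

# Example values are assumptions, not taken from any spec.
update = "INSERT DATA { <http://example.org/s> <http://example.org/p> 'o' }"
body = build_form_update_body(
    update,
    default_graphs=["http://example.org/g1"],
    named_graphs=["http://example.org/g2", "http://example.org/g3"],
)
print(body)
```

The resulting string would be sent as the body of the POST with 
Content-type: application/x-www-form-urlencoded.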


There was a significant feeling that we'd like to also encourage direct 
POSTing of SPARQL Update strings:

POST /foo/bar/sparql HTTP/1.1
...
Content-type: application/sparql-update

<unencoded content of SPARQL Update string goes here>


We also observed that this 2nd form doesn't give any way to specify the 
equivalent of the default-graph-uri and named-graph-uri parameters, so 
it was suggested that perhaps those could be encoded in the URI query 
string:

POST /foo/bar/sparql?default-graph-uri=...&named-graph-uri=... HTTP/1.1
...
Content-type: application/sparql-update

<unencoded content of SPARQL Update string goes here>
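
To make the direct-POST variant concrete, here is a small Python sketch 
that constructs (but does not send) such a request, with the graph 
parameters in the URI query string and the raw update string as the 
body. The helper name, the endpoint URL, and the example values are 
assumptions for illustration; this reflects the suggestion above, not a 
settled design.

```python
from urllib.parse import urlencode, urlsplit, parse_qs
from urllib.request import Request

def build_direct_update_request(endpoint, update,
                                default_graphs=(), named_graphs=()):
    """Build a direct application/sparql-update POST (hypothetical helper)."""
    params = [("default-graph-uri", g) for g in default_graphs]
    params += [("named-graph-uri", g) for g in named_graphs]
    # Graph parameters travel in the query string, not in the body.
    url = endpoint + ("?" + urlencode(params) if params else "")
    return Request(
        url,
        data=update.encode("utf-8"),  # update string sent unencoded
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )

# Endpoint and graph URIs are made-up example values.
update = "DELETE WHERE { ?s ?p ?o }"
req = build_direct_update_request(
    "http://example.org/foo/bar/sparql",
    update,
    default_graphs=["http://example.org/g1"],
    named_graphs=["http://example.org/g2"],
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would actually issue the request; 
the sketch stops short of that so it can be run without a server.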


The SPARQL Protocol is currently normatively defined using WSDL 2.0. As 
no one in the group is a WSDL expert, I took ACTION-341 
(http://www.w3.org/2009/sparql/track/actions/341) to investigate whether 
we could specify the above behavior(s) in WSDL.

From what I've read, I do _not_ believe that this is a setup that can 
be specified using the HTTP bindings for WSDL 2.0.

The obvious idea was to specify 
whttp:inputSerialization="application/sparql-update" in the WSDL, where 
currently we would do something like 
whttp:inputSerialization="application/x-www-form-urlencoded" .

From 
http://www.w3.org/TR/wsdl20-adjuncts/#_http_binding_default_rule_psf, I 
learn that we first need to consider the message content model that is 
used to define the (abstract) interface message. From 
http://www.w3.org/TR/2007/REC-wsdl20-20070626/#InterfaceMessageReference_element_attribute 
I learn that the message content model is specified by the value of the 
"element" attribute on the <input> element of the interface operation.

From the SPARQL 1.0 Protocol WSDL - 
http://www.w3.org/2007/SPARQL/protocol-query.wsdl - I see that our WSDL 
uses element="st:query-request", which means that it's using the 
#element message content model.

Going back to 
http://www.w3.org/TR/wsdl20-adjuncts/#_http_binding_default_rule_psf, 
I'm pointed to the rules in 6.4.3.1 (directly below), which in turn say:

"""
     * If the serialization format is 
"application/x-www-form-urlencoded", then the serialization of the 
instance data is defined by section 6.8.2 Serialization as 
application/x-www-form-urlencoded .
     * If the serialization format is "multipart/form-data", then the 
serialization of the instance data is defined by section 6.8.4 
Serialization as multipart/form-data .
     * If the serialization format is "application/xml", then the 
serialization of the instance data is defined by section 6.8.3 
Serialization as application/xml .
     * Otherwise, then the serialization of the instance data is defined 
by section 6.8.3 Serialization as application/xml with the following 
additional rule: the value of the HTTP Content-Type entity-header field 
is the value of the serialization format and its associated media type 
parameters, if any.
"""

There's no leeway here for serialization of the element content in 
any way other than XML, multipart form data, or a query string.

OK, so the next question is whether we could change the WSDL to use a 
message content model other than #element. Looking back up at 6.4.3, 
#any has the same rules as #element, and #none requires empty content. 
So what about #other?

"""
If the value is "#other", then the serialization format and its 
associated media type parameters, if any, specifies the value of the 
HTTP Content-Type entity-header field as defined in section 14.17 of 
[IETF RFC 2616]. The serialization of the payload is undefined.
"""
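
My reading of the rules quoted so far can be summarized in a short 
Python sketch. This is only a paraphrase of the passages above as a 
decision function, not anything defined by the WSDL specs; the function 
name and the wording of the returned strings are my own.

```python
FORM = "application/x-www-form-urlencoded"
MULTIPART = "multipart/form-data"
XML = "application/xml"

def payload_serialization(content_model, serialization_format):
    """Paraphrase of the quoted rules (hypothetical helper)."""
    if content_model == "#none":
        return "empty payload"
    if content_model == "#other":
        # Only the Content-Type header is pinned down.
        return "undefined; Content-Type: " + serialization_format
    if content_model in ("#element", "#any"):
        if serialization_format == FORM:
            return "6.8.2 Serialization as " + FORM
        if serialization_format == MULTIPART:
            return "6.8.4 Serialization as " + MULTIPART
        if serialization_format == XML:
            return "6.8.3 Serialization as " + XML
        # The "Otherwise" rule: still XML, just labeled differently.
        return ("6.8.3 Serialization as " + XML
                + "; Content-Type: " + serialization_format)
    raise ValueError("unknown message content model: " + content_model)

for model in ("#element", "#other"):
    print(model, "->", payload_serialization(model, "application/sparql-update"))
```

With #element, application/sparql-update falls into the "Otherwise" 
branch and so is still serialized as XML; with #other the payload is 
simply left undefined.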

So for #other, all that the WSDL does is tell you what the Content-type 
of the request should be; all other details are left unspecified. We 
could legally use that with 
whttp:inputSerialization="application/sparql-update", but it would 
specify practically nothing. If we were to go that route, then 
combined with the fact that we've already decided to remove the explicit 
SOAP section from the document, I think we'd be better off losing the 
WSDL definition entirely.

Given this, the 2nd half of the question becomes less critical, but it 
had an easier answer. From 
http://www.w3.org/TR/wsdl20-adjuncts/#_http_serialization, I see that 
there's (in general) no problem encoding some of the parameters of an 
input message in the request URI and others in the message body.

But anyway, I see two main options given this, neither of which is 
totally satisfying:

Option 1:

Rewrite the SPARQL protocol to be defined normatively without using WSDL 
2.0.
   Pros: We can specify the exact behavior that we want. The protocols 
are simple enough that this is not overly complex.
   Cons: This is a large editing job, and we already have scheduling and 
resource issues with getting protocol to Last Call. Also, I'm sure there 
are many specification details of "the right way to use HTTP" that are 
taken care of for us automatically by leaning on WSDL that we might have to 
be explicit about in this case.

Option 2:

Only support the application/x-www-form-urlencoded scenario in the 
SPARQL 1.1 Protocol. SPARQL Update request strings can be used directly 
as per the informative text in the Uniform HTTP Protocol - 
http://www.w3.org/2009/sparql/docs/http-rdf-update/#http-patch . 
However, this use is limited to modifying a single graph that is 
identified by the request.
   Pros: Easiest path forward.
   Cons: Does not give a well-defined, interoperable way to directly 
POST SPARQL Update request strings.

Thoughts?

Lee

Received on Friday, 24 December 2010 19:22:32 UTC