Re: x-www-form-urlencoded unsuited for basichttp [was IRP for uSCXML]

On Jun 24, 2014, at 23:24, Jim Barnett <jim.barnett@genesys.com> wrote:

> The current spec says that <content> should be passed as the "body of the message".  As I recall, we thought that whatever came after the application/x-www-form-urlencoded parameters counted as the body of the message (and had no special encoding).  Is this totally wrong?  If there's some way to salvage the existing wording in the spec (but to change the tests), I'd like to do it, because if we change the spec we'll have to issue another last call (or at least pull the Basic HTTP Event I/O Processor out into a separate document).
> 

I am by no means an expert in the various iterations of HTTP, but the “form” in x-www-form-urlencoded seems like a pretty good indication that the content ought to consist of key/value pairs (KVPs), i.e. form fields and their values. I did not find any specification other than the one in [1]. There is no content “after the application/x-www-form-urlencoded parameters”, and as such no “body of the message”; the encoded parameters *are* the content.

This is a good match for <param> and send.namelist, but unsuited for <content> as JSON, XML or plain text, where the usual content-types of “text/TYPE” seem more suited.
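
To make the KVP point concrete, here is a minimal sketch in Python using only the standard library's urllib.parse. The field names are made up for illustration and nothing in it is prescribed by the SCXML spec; it only shows that the encoded pairs are the whole body, and what a form decoder does with text that has no key.

```python
from urllib.parse import urlencode, parse_qs

# Fields as they would come from <param> / namelist (names are illustrative).
fields = {"param1": "this is some content", "test": "bar baz"}

# Encoding yields the complete request body: there is nothing "after" it.
body = urlencode(fields)
print(body)

# Decoding gives back key/value pairs only.
decoded = parse_qs(body)
print(decoded)

# Bare <content> has no key. A form decoder either drops it (the default)
# or treats the whole text as a key with an empty value.
dropped = parse_qs("this+is+some+content")
as_key = parse_qs("this+is+some+content", keep_blank_values=True)
print(dropped, as_key)
```

Either way the receiver cannot recover literal content as event.data from such a body without some extra convention.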

A pragmatic solution with the current wording of the spec might be to:
- indeed introduce <param> and send.namelist as HTTP headers
- just url-encode content with a key of “content” if there is any
- have _scxmleventname as a HTTP header if there is an event name

Though, in order to pass anything more elaborate as a param or send.namelist, we’d ultimately need to urlencode their values as HTTP headers as well. Furthermore, most implementations limit the length of HTTP headers (8k for Apache, 16k for IIS[2]).
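
As a rough sketch of what the three bullet points above would mean in practice (Python, standard library only): build_request and header_bytes are hypothetical helpers, not part of any SCXML implementation, and the 8k figure is simply the Apache number cited in [2], which is configurable on real servers.

```python
from urllib.parse import quote, urlencode

APACHE_HEADER_LIMIT = 8 * 1024  # figure cited in [2]; an assumption, not a constant


def build_request(event=None, params=None, content=None):
    """Return (headers, body) following the three bullet points above."""
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    if event is not None:
        headers["_scxmleventname"] = event
    for name, value in (params or {}).items():
        # Header values cannot carry arbitrary text, so percent-encode them.
        headers[name] = quote(str(value), safe="")
    # <content> becomes a single KVP with the key "content".
    body = urlencode({"content": content}) if content is not None else ""
    return headers, body


def header_bytes(headers):
    """Approximate size of the header block as sent on the wire."""
    return sum(len(f"{k}: {v}\r\n".encode("utf-8")) for k, v in headers.items())


headers, body = build_request(event="test", params={"param1": "this is some content"})
print(headers, repr(body))

# A moderately large param value already exceeds the cited limit.
big_headers, _ = build_request(params={"blob": "x" * 9000})
print(header_bytes(big_headers) > APACHE_HEADER_LIMIT)
```

With this scheme the payload of a param-only <send> lives entirely in the headers and the body stays empty.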

But this seems wrong to me on various levels:
- we are polluting the HTTP header namespace and even potentially colliding
- x-www-form-urlencoded is a good match for <param> and send.namelist, but we send them as HTTP headers
- we are abusing the HTTP headers to carry the payload of the message
- sending <content> as a single urlencoded KVP with a superfluous key seems ... weird
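
The collision concern can be sketched in a few lines of Python. RESERVED is a small, non-exhaustive sample of headers an HTTP client sets itself, and colliding_params is a hypothetical helper used only for illustration.

```python
# Headers any HTTP client sets itself (a small, non-exhaustive sample).
RESERVED = {"host", "content-type", "content-length", "connection"}


def colliding_params(param_names):
    """Return the <param>/namelist names that would clash with real headers."""
    # HTTP header names are compared case-insensitively.
    return sorted(n for n in param_names if n.lower() in RESERVED)


clashes = colliding_params(["param1", "Host", "content-length"])
print(clashes)
```

Any name in the result could not be sent as a header without overwriting, or being overwritten by, the transport's own value.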

Could someone with a more profound knowledge of HTTP have a look at these things please?
  Stefan

[1] http://tools.ietf.org/html/rfc1866#section-8.2.1
[2] http://stackoverflow.com/questions/686217/maximum-on-http-header-values

> - Jim
> 
> -----Original Message-----
> From: Stefan Radomski [mailto:radomski@tk.informatik.tu-darmstadt.de] 
> Sent: Tuesday, June 24, 2014 5:05 PM
> To: Jim Barnett
> Cc: www-voice@w3.org (www-voice@w3.org)
> Subject: x-www-form-urlencoded unsuited for basichttp [was IRP for uSCXML]
> 
> On Jun 24, 2014, at 21:35, Jim Barnett <1jhbarnett@gmail.com> wrote:
> 
>> On test520, are you saying that
>> 
>> 'this%20is%20some%20content'
>> 
>> is the correct form for application/x-www-form-urlencoded header and body?  If so, I'll have to change the tests to match that.  If I put the %20 in the inline content, it comes through in the .scxml file, but I want to make sure that this is the right thing to do before I publish the change.
>> 
> 
> Nahh, scratch that - major misconception of 'application/x-www-form-urlencoded' on my part! But the spec is indeed confusingly worded. A valid POST request for 'application/x-www-form-urlencoded'[1] data might look like this:
> 
> ---
> POST /path/foo HTTP/1.1
> Content-Type: application/x-www-form-urlencoded
> Content-Length: 41
> 
> content=this+is+some+content&test=bar+baz
> ---
> 
> Historically, the post request's content for x-www-form-urlencoded is already a set of key/value pairs, paired by '=' and separated by '&' with some escaping, which makes sense as it was used to send data submitted by HTML forms. For SCXML, it might not suffice to send only KVPs, consider:
> 
> <send event="foo" ..>
>  <param name="param1" expr="this is some content" />
> </send>
> 
> and
> 
> <send event="foo" ..>
>  <content>this is some content</content>
> </send>
> 
> 
> As per spec, <content> and namelist / <param> can not occur together in a <send>, but on the receiving side, we need to distinguish "event.data" from "event.data.param1", so content passed via <content> has to be treated differently. As far as I can see "application/x-www-form-urlencoded" requires KVPs and there is no way to send literal content without a key.
> 
> The most pragmatic solution I can see is to use verbatim, unencoded post content with a proper "Content-type" if it is passed via <content> and classical x-www-form-urlencoded post content if it's from a namelist or <param>. As for proper "Content-type" with <content>, I'd say we support at least "text/plain", "text/xml" and "text/javascript". And I'd put the SCXML event name in a custom HTTP post parameter "X-SCXML1.0-EventName=test".
> 
> In any case "application/x-www-form-urlencoded" as it is defined in [1] is unsuited because we cannot distinguish event.data from event.data.param1.
> 
> Any thoughts?
>  Stefan
> 
> [1] http://tools.ietf.org/html/rfc1866#section-8.2.1
> 
>> As for event names, section 3.12.1 states that they are case sensitive (so that an 'event' attribute on a transition of 'error' will match 'error.send' but not 'errOR.send').
>> 
>> - Jim
>> 
>> On 6/24/2014 2:21 PM, Stefan Radomski wrote:
>>> On Jun 24, 2014, at 16:58, Jim Barnett <1jhbarnett@gmail.com> wrote:
>>> 
>>>> For test350, does your transform by any chance give you any information on where the error is?  Mine doesn't, so I'm modifying things randomly, trying to find the problem without much luck.
>>> First it complained about the character encoding, after resaving as UTF-8 I got:
>>> Error on line 10 column 2 of test350.txml:
>>>  SXXP0003: Error reported by XML parser: The value of attribute "conf:systemVarExpr"
>>>  associated with an element type "null" must not contain the '<' character.
>>> Transformation failed: Run-time errors were reported
>>> 
>>>> For test513, the extension is defined in section 5.3 of the IR Test Plan, in the paragraph titled "Extensions Required".
>>> That's ok - we already implement _event.raw for basichttp but we'll skip on httpResponse for send - we already implement <fetch> as suggested by David. There is also an occurrence of _event.raw with the *SCXML I/O-processor* in the tests iirc, if you feel like it's important, I can look it up.
>>> 
>>>> For 519, 520, 531, and 534, in what way is the behavior underspecified?   We thought the definitions in C.2.1 and C.2.2 were sufficient to define the behavior.  Which statements are vague or underspecified?
>>> Well, what is specified for the basichttp ioproc:
>>> 1. uses post with an application/x-www-form-urlencoded body when sending
>>> 2. if the event attribute is given, it becomes a _scxmleventname post header parameter
>>> 3. all namelist and param values become post header parameter as well
>>> 4. content becomes the application/x-www-form-urlencoded body of the http post request
>>> 5. if a single instance of the parameter _scxmleventname is present when receiving it becomes the event's name
>>> 
>>> With our implementation, the complete, raw post request is available in _event.raw.
>>> 
>>> What is tested is:
>>> 
>>> 1. from test519:
>>> <send event="test" ...>
>>>  <param name="param1" expr="1"/>
>>> </send>
>>> 
>>> ought to end up as a string 'Varparam1=1' in _event.raw - we do have a 'param1=1' http post parameter as per 3. No idea where that 'Var' prefix is coming from.
>>> 
>>> 2. from test520:
>>> <send ...>
>>>  <content>this is some content</content>
>>> </send>
>>> 
>>> ought to lead to an event HTTP.POST and raw ought to contain 'this is some content'. We do have 'this%20is%20some%20content' as per 1&4. On a related note: are event names case-sensitive? In our implementation they are not, but I am not sure whether this is actually specified somewhere.
>>> 
>>> 3. from test531:
>>> <send ...>
>>>  <content>_scxmleventname=test</content>
>>> </send>
>>> 
>>> ought to lead to an event called "test", but content is to be application/x-www-form-urlencoded and point 5 from above refers to post parameters, not encoded content? Our implementation will happily take an event attribute with the send element, a namelist entry or a param to create a _scxmleventname post parameter, but not urlencoded content.
>>> 
>>> 4. from test534:
>>> <send event="test" ... />
>>> 
>>> ought to lead to an event containing 'Var_scxmleventname=test' in _event.raw but nowhere is that 'Var' prefix defined.
>>> 
>>> Feel free to correct any misconceptions - my HTTP encoding knowledge is somewhat rusty and I'd prefer to pass those tests if you insist that they actually test what is specified in C.2.1 / C.2.2.
>>> 
>>> Regards
>>> Stefan
>>> 
>>>> On 6/23/2014 7:46 PM, Stefan Radomski wrote:
>>>>> Hi there,
>>>>> 
>>>>> attached is the IRP report for uSCXML with the ecmascript datamodel. We pass most tests but:
>>>>> - the one where the XPath DM is hardcoded
>>>>> - test350 which fails to XSLT transform
>>>>> - test513 which tests for an unspecified http extension
>>>>> - test[519,520,531,534] which rely on an imho underspecified behavior of the basichttp ioprocessor
>>>>> - test579 with the recent addition of history transitions
>>>>> 
>>>>> We do implement parts of the XPath DM but prefer not to submit an implementation report for it.
>>>>> 
>>>>> Tests were:
>>>>> - performed on a Mac with JavaScriptCore for the ECMAScript datamodel implementation
>>>>> - validated on Linux with Google's v8 for the ECMAScript datamodel implementation
>>>>> - fetched at 06/23/2014 - 4:22pm
>>>>> - run with uSCXML commit id 3d3f6a693ac51bca9b77133783a0fb296abd7ff6
>>>>> 
>>>>> This submission is done to the best of our knowledge - if you cannot reproduce these tests, or there is a formal error with the submission please contact us.
>>>>> 
>>>>> Best regards
>>>>> Stefan
>>>>> 
>>>> --
>>>> Jim Barnett
>>>> Genesys
>> 
>> --
>> Jim Barnett
>> Genesys
> 
> 
> 

Received on Tuesday, 24 June 2014 22:20:37 UTC