Compatibility between HTML, XForms, and WSDL

HTML4.01 (and probably XHTML, though I could not find the text) specifies
that a GET URI is constructed as follows:
  action + '?' + url-encoded form parameters

thus, 
  <form action="http://q.example/what?foo">
    <input type="hidden" name="inference" value="owl-lite"/>
    <input name="q" value="SELECT%20%3"/>
  </form>

should call
  <http://q.example/what?foo?inference=owl-lite&q=SELECT%20%3>
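
A minimal sketch of that literal reading, in Python. The function name
html401_get_uri is mine, not from any spec, and the form data is passed
in already encoded so that the result matches the example above.

```python
def html401_get_uri(action: str, encoded_form_data: str) -> str:
    """Literal HTML 4.01 reading: the action, then '?', then the form data."""
    # No check for an existing '?' in the action: the spec text quoted
    # below does not mention one.
    return action + "?" + encoded_form_data


uri = html401_get_uri("http://q.example/what?foo",
                      "inference=owl-lite&q=SELECT%20%3")
print(uri)  # http://q.example/what?foo?inference=owl-lite&q=SELECT%20%3
```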


A cursory implementation survey showed that Lynx 2.8.4, Opera 7.5.4,
and Mozilla 1.7.5 all appear to chop off the '?' and everything after it:
  <http://q.example/what?inference=owl-lite&q=SELECT%20%3>
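
The observed behaviour can be approximated as follows. This is only my
reading of what the browsers appear to do, not their actual code; the
function name browser_style_get_uri is hypothetical, and the form data
is assumed to be already encoded.

```python
def browser_style_get_uri(action: str, encoded_form_data: str) -> str:
    """Drop any existing query from the action, then append the form data."""
    # partition() splits at the first '?'; everything after it is discarded.
    base, _sep, _old_query = action.partition("?")
    return base + "?" + encoded_form_data


uri = browser_style_get_uri("http://q.example/what?foo",
                            "inference=owl-lite&q=SELECT%20%3")
print(uri)  # http://q.example/what?inference=owl-lite&q=SELECT%20%3
```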


WSDL [WZ] follows the example of XForms [XF] and specifies that the
separator between the action and the parameters be a '?' if the action
does not already contain one, and a '&' otherwise:
  <http://q.example/what?foo&inference=owl-lite&q=SELECT%20%3>
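
A sketch of that separator rule as summarized in this message (the
specs themselves remain the authority on the exact separator). The
function name xforms_style_get_uri is hypothetical, and the form data
is assumed to be already encoded.

```python
def xforms_style_get_uri(action: str, encoded_form_data: str) -> str:
    """Keep the action's existing query and append the form data to it."""
    # Use '&' when the action already carries a '?', otherwise start
    # the query with '?'.
    separator = "&" if "?" in action else "?"
    return action + separator + encoded_form_data


uri = xforms_style_get_uri("http://q.example/what?foo",
                           "inference=owl-lite&q=SELECT%20%3")
print(uri)  # http://q.example/what?foo&inference=owl-lite&q=SELECT%20%3
```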


I propose an erratum to HTML to reflect either current practice or the
XForms way of doing it. I will propose that the SPARQL protocol [SP]
use the XForms approach as well.


RELEVANT SPECS:

RFC3986 Appendix A:
[[
URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
...
query         = *( pchar / "/" / "?" )
]]
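
Note that this grammar lets '?' appear inside the query itself, so a
URI with a doubled '?' is still syntactically valid; everything after
the first '?' is simply one query string. A quick check with Python's
urllib.parse, used here only as a convenient splitter and not as a
normative RFC parser:

```python
from urllib.parse import urlsplit

# The URI that the literal HTML 4.01 reading would produce.
parts = urlsplit("http://q.example/what?foo?inference=owl-lite&q=SELECT%20%3")

print(parts.path)   # /what
print(parts.query)  # foo?inference=owl-lite&q=SELECT%20%3
```

A server that splits the query on '&' would therefore see a first
parameter named "foo?inference", which is probably not what the form
author intended.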

URLENCODE <http://www.w3.org/TR/html4/interact/forms.html#form-content-type>
[[
application/x-www-form-urlencoded

This is the default content type. Forms submitted with this content
type must be encoded as follows:

   1. Control names and values are escaped. Space characters are
replaced by `+', and then reserved characters are escaped as described
in [RFC1738], section 2.2: Non-alphanumeric characters are replaced by
`%HH', a percent sign and two hexadecimal digits representing the
ASCII code of the character. Line breaks are represented as "CR LF"
pairs (i.e., `%0D%0A').

   2. The control names/values are listed in the order they appear in
the document. The name is separated from the value by `=' and
name/value pairs are separated from each other by `&'.
]]
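
For illustration, Python's urllib.parse.urlencode approximates these
two steps. It leaves a few characters such as '-', '.', '_' and '~'
unescaped, so it is not a byte-for-byte rendering of the HTML 4.01
text. The value "SELECT ?x" is a made-up query string, not one from
this message.

```python
from urllib.parse import urlencode

# Hypothetical control names and values, listed in document order.
controls = [("inference", "owl-lite"), ("q", "SELECT ?x")]

# Spaces become '+', '?' becomes %3F, pairs are joined with '&'.
form_data = urlencode(controls)
print(form_data)  # inference=owl-lite&q=SELECT+%3Fx
```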


HTML4.01 <http://www.w3.org/TR/html4/interact/forms.html#h-17.13.3.4>
[[
# If the method is "get" and the action is an HTTP URI, the user agent
takes the value of action, appends a `?' to it, then appends the form
data set, encoded using the "application/x-www-form-urlencoded"
content type. The user agent then traverses the link to this URI. In
this scenario, form data are restricted to ASCII codes.
]]

[WZ] http://dev.w3.org/cvsweb/~checkout~/2002/ws/desc/wsdl20/wsdl20-bindings.html?content-type=text/html;%20charset=utf-8#_http_x-www-form-urlencoded
[XF] http://www.w3.org/TR/2003/REC-xforms-20031014/slice11.html#serialize-urlencode
[SP] http://www.w3.org/TR/2005/WD-rdf-sparql-protocol-20050114/
-- 
-eric

office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
                        Shonan Fujisawa Campus, Keio University,
                        5322 Endo, Fujisawa, Kanagawa 252-8520
                        JAPAN
        +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell:   +81.90.6533.3882

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

Received on Thursday, 24 February 2005 03:33:58 UTC