www-form-urldecode and parameters with non-NCNames

I recently discovered that the www-form-urldecode step requires that
parameters are named with NCNames.
http://www.w3.org/TR/xproc/#c.www-form-urldecode

The "application/x-www-form-urlencoded" media type originates in the HTML
specification, where the parameter names are the names of "controls" (form
widget elements) in HTML markup, which are declared to be not NCNames but
merely CDATA.

I'd like to suggest that the next version of XProc relax that requirement,
so that the step can parse parameters with arbitrary names, including ones
which would not work as XProc parameters without some further processing.

I'd suggest adding a boolean option that allows non-NCNames to be parsed
without raising an error, analogous to the "failure-threshold" option in
p:exec and the "assert-valid" options in the various schema steps. For example:

<p:declare-step type="p:www-form-urldecode">
     <p:output port="result"/>
     <p:option name="value" required="true"/>                      <!-- string -->
     <p:option name="require-valid-names" select="'true'"/>        <!-- boolean -->
</p:declare-step>

I can understand the desire to be able to use URL-encoded parameters as
XProc parameters, but it is a shame that the step can't also handle
parameters whose names are not NCNames. This precludes using spaces in
names, or using URIs as names, and so on, and it means that XProc's
perfectly reasonable restriction on XProc parameter names gets extended to
the web browsers at the front end, where it has no business.
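
To make the restriction concrete, here is a minimal sketch in Python (not XProc) that decodes a form-urlencoded string and separates the names that would pass an NCName check from those that would not. The helper name, the sample field names, and the ASCII-only pattern are assumptions made for illustration; the real xs:NCName production admits many more Unicode characters than this approximation.

```python
import re
from urllib.parse import parse_qsl

# ASCII-only approximation of xs:NCName (assumption: the full production
# also allows a wide range of non-ASCII letters and combining characters).
ASCII_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def split_by_ncname(encoded):
    """Decode a form-urlencoded string and sort its pairs by name validity.

    Returns (ok, rejected): pairs whose names look like NCNames, and pairs
    whose names do not (spaces, colons, slashes, and so on).
    """
    ok, rejected = [], []
    for name, value in parse_qsl(encoded, keep_blank_values=True):
        target = ok if ASCII_NCNAME.match(name) else rejected
        target.append((name, value))
    return ok, rejected


# Hypothetical form submission: a plain name, a name with a space,
# and a URI used as a name.
sample = "title=Report&first+name=Ada&http%3A%2F%2Fpurl.org%2Fdc%2Fterms%2Fcreator=Ada"
ok, rejected = split_by_ncname(sample)
```

In this sample only "title" survives the check; "first name" and the URI-valued name are exactly the kind of control names HTML permits but the step currently refuses.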

Incidentally, I wrote a general "application/x-www-form-urlencoded" parser
myself (in XProc) and it was a pain in the arse. Decoding UTF-8 argh!
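
For readers wondering where the UTF-8 pain comes from: each %XX escape is a single byte, so a multi-byte character has to be reassembled from several escapes before it can be turned into text. The sketch below shows that idea in Python; it is an illustration under that assumption and not a description of the XProc parser mentioned above. The function name is hypothetical.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"


def form_decode(text):
    """Decode one form-urlencoded name or value.

    '+' becomes a space, '%XX' becomes one raw byte, and the collected
    bytes are decoded as UTF-8 only at the end, so that multi-byte
    sequences are complete before decoding.
    """
    raw = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "+":
            raw.append(0x20)
            i += 1
        elif ch == "%":
            pair = text[i + 1:i + 3]
            if len(pair) != 2 or any(c not in HEX_DIGITS for c in pair):
                raise ValueError("malformed percent escape at offset %d" % i)
            raw.append(int(pair, 16))
            i += 3
        else:
            # Unescaped characters are passed through as their UTF-8 bytes.
            raw.extend(ch.encode("utf-8"))
            i += 1
    # Raises UnicodeDecodeError if the escapes do not form valid UTF-8.
    return raw.decode("utf-8")
```

For instance, form_decode("caf%C3%A9+au+lait") yields "café au lait": the two escapes %C3 and %A9 together form the single character "é". In a language without a byte type, that reassembly has to be done arithmetically on code points, which is presumably where much of the effort goes.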

Received on Tuesday, 9 December 2014 14:00:21 UTC