- From: <bugzilla@jessica.w3.org>
- Date: Wed, 02 May 2012 20:41:21 +0000
- To: public-html-bugzilla@w3.org
https://www.w3.org/Bugs/Public/show_bug.cgi?id=16909

--- Comment #2 from Evan Jones <evanj@csail.mit.edu> 2012-05-02 20:41:21 UTC ---

Argh; whoops. Sorry for the bugzilla spam. I didn't realize that the "comment" thingy just filed a bugzilla bug.

HTML5 states: "Encode the (now mutated) form data set using the rules described by RFC 2388". However, it then modifies the rules:

"The parts of the generated multipart/form-data resource that correspond to non-file fields must not have a Content-Type header specified. Their names and values must be encoded using the character encoding selected above (field names in particular do not get converted to a 7-bit safe encoding as suggested in RFC 2388)."

http://www.whatwg.org/specs/web-apps/current-work/multipage/association-of-controls-and-forms.html#multipart-form-data

So the problem is: what are we supposed to do with field names? In particular, what if they contain "special" MIME characters (e.g. \r\n newlines, backslashes, double quotes, or semicolons)? Different browsers do different things, meaning that currently server code must detect the browser to do the right thing.

Example: <input name='bàz%22\"\' value="foo">

Firefox 13b:
Content-Disposition: form-data; name="bàz%22\\"\"

Webkit nightly:
Content-Disposition: form-data; name="bàz%22\%22\"

Firefox backslash-quotes double quotes, except it fails to quote backslashes. This means its header fails to parse according to the MIME specification (it sort of decodes as bàz%22\ with an extra trailing \").

Webkit %-escapes the double quotes, but does not %-escape the percent. Thus the above form control could be either name='bàz"\"\' or the desired name. Webkit has a bug open on this issue, asking for specification guidance:
https://bugs.webkit.org/show_bug.cgi?id=62107

HTML5 should specify exactly how field names are encoded. Some potential solutions:

1) Bless Firefox's backslash quoting rules (they are very weird but I think they are unambiguous?). This means Webkit POSTs will be decoded to the wrong field names, and POSTs to older servers may parse incorrectly if the name includes a \ (but that must already happen for Firefox?).

2) Bless Webkit's percent escaping rules (ideally also escaping %). Servers that strictly parse this format will fail to parse Firefox POSTs if the name includes a \, and will

3) Adopt RFC 6266's approach of having two name parameters when there are special characters: one with the existing escaping, and one with an unambiguously escaped version. Ideally, existing servers will parse the first name and not break unless the form value contains a special character. As servers are upgraded, they will be able to unambiguously parse the new header. See: http://tools.ietf.org/html/rfc6266

Aside: The *same* issue happens for uploaded file names. I started a mailing list thread to attempt to collect more information about this:
http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-May/035610.html
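
To make the ambiguity concrete, here is a rough Python sketch of the two behaviours described in this comment, plus a variant of option 2 that also escapes the percent sign. The function names are made up for illustration, and the escaping rules are inferred only from the two example headers quoted here, not from browser source code or any specification.

```python
import re


def firefox_style(name: str) -> str:
    """Backslash-escape double quotes only (backslashes are left alone),
    matching the Firefox 13b header quoted in this comment."""
    return name.replace('"', '\\"')


def webkit_style(name: str) -> str:
    """Percent-escape double quotes only (a literal % is left alone),
    matching the Webkit nightly header quoted in this comment."""
    return name.replace('"', '%22')


def percent_style_unambiguous(name: str) -> str:
    """Hypothetical variant of option 2: escape % first, then the quote,
    so that every % in the output introduces an escape."""
    return name.replace('%', '%25').replace('"', '%22')


def decode_percent_unambiguous(value: str) -> str:
    """Inverse of percent_style_unambiguous (also hypothetical)."""
    table = {'22': '"', '25': '%'}
    return re.sub(r'%(22|25)', lambda m: table[m.group(1)], value)


def header(encoded_name: str) -> str:
    """Wrap an already-escaped name in a Content-Disposition header line."""
    return 'Content-Disposition: form-data; name="' + encoded_name + '"'


# The field name from the example: bàz%22\"\
original = 'bàz%22\\"\\'
# A different field name with a real quote where the example has %22: bàz"\"\
other = 'bàz"\\"\\'

print(header(firefox_style(original)))
print(header(webkit_style(original)))

# Both names collapse to the same bytes under the Webkit-style rule.
print(webkit_style(original) == webkit_style(other))
# Escaping % as well keeps them distinct and reversible.
print(percent_style_unambiguous(original) != percent_style_unambiguous(other))
print(decode_percent_unambiguous(percent_style_unambiguous(original)) == original)
```

The first two printed lines should correspond to the Firefox and Webkit headers in the example. The last three checks illustrate why escaping only the double quote loses information, while escaping the percent sign first lets a server recover the name exactly.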
Received on Wednesday, 2 May 2012 20:41:25 UTC