[Bug 12819] How do you decode this format on the server? There seems to be no definition of the format, apart from the definition of how to encode it. Expecting every implementer to reverse this algorithm seems prone to mistakes.

http://www.w3.org/Bugs/Public/show_bug.cgi?id=12819

Ian 'Hixie' Hickson <ian@hixie.ch> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ian@hixie.ch

--- Comment #2 from Ian 'Hixie' Hickson <ian@hixie.ch> 2011-08-12 20:49:08 UTC ---
Wow, looks like nobody's ever registered application/x-www-form-urlencoded,
HTML4 doesn't define how to parse it, and there's no other documentation worth
anything on it either.

OK, I guess we should register the type in the IANA considerations section,
and then, in section 4.10.22.5 (URL-encoded form data), add a paragraph and
list at the end saying how to decode it. Should probably mention _charset_
there too. While I'm at it, maybe also add a similar section for
multipart/form-data (saying to see the RFC), and for text/plain (saying it's
ambiguous and can't be parsed).

So the parsing rules here should be:

 - split on "&" => list of name-value pairs
 - split each name-value pair on the first "=" (limit 1) => names, values
 - replace each "+" in names and values with 0x20 (space)
 - expand each %xx escape to the corresponding byte
 - look for a _charset_ name and, if found, treat its value as the encoding;
   otherwise use the encoding determined by magic
 - decode names and values per that encoding
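
As a rough illustration only, the steps above could be sketched in Python as
follows. The function name parse_urlencoded and its default_encoding parameter
are made up for this example; default_encoding stands in for "the encoding
determined by magic", and skipping empty segments and ignoring an unrecognised
_charset_ value are assumptions that the list above doesn't settle.

```python
import codecs
from urllib.parse import unquote_to_bytes


def parse_urlencoded(payload: bytes, default_encoding: str = "utf-8"):
    """Decode an application/x-www-form-urlencoded payload (illustrative sketch)."""
    pairs = []
    for chunk in payload.split(b"&"):            # cut on "&"
        if not chunk:
            continue                             # assumption: skip empty segments
        name, _, value = chunk.partition(b"=")   # cut on the first "=" only
        # Replace "+" before expanding %xx, so that %2B survives as a literal "+".
        name = unquote_to_bytes(name.replace(b"+", b" "))
        value = unquote_to_bytes(value.replace(b"+", b" "))
        pairs.append((name, value))

    encoding = default_encoding                  # stand-in for "determined by magic"
    for name, value in pairs:
        if name == b"_charset_":
            try:
                candidate = value.decode("ascii")
                codecs.lookup(candidate)         # raises LookupError if unknown
                encoding = candidate
            except (ValueError, LookupError):
                pass                             # assumption: ignore unusable labels
            break

    # Decode names and values per the chosen encoding.
    return [(n.decode(encoding, "replace"), v.decode(encoding, "replace"))
            for n, v in pairs]
```

For example, parse_urlencoded(b"q=hello+world&x=a=b") gives
[("q", "hello world"), ("x", "a=b")].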

Might want to mention the isindex exception? Maybe not.


Received on Friday, 12 August 2011 20:49:10 UTC