WF2: application/x-www-form-urlencoded encoding ill-defined

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 06 Sep 2006 03:47:10 +0200
To: public-appformats@w3.org
Message-ID: <po8sf2ph9g2u2o4iomndmdck0ur0gnn7rs@hive.bjoern.hoehrmann.de>

Dear Web Application Formats Working Group,

  http://www.w3.org/TR/2006/WD-web-forms-2-20060821/ section 5.3 item 4

  Control names and values are escaped. Space characters are replaced by
  "+" (U+002B), and other non-alphanumeric characters are encoded in the
  submission character encoding and each resulting byte is replaced by
  "%HH", a percent sign (U+0025) and two uppercase hexadecimal digits
  representing the value of the byte.

This text is unclear and incorrect. It does not define what
non-alphanumeric characters are (and under any plausible definition the
rule is wrong); the character encoding is applied to the whole string,
not just to non-alphanumeric characters; and %hh encoding is applied
based on what the bytes are, not what the characters were.

Consider the following cases:

  * the encoding is UTF-8 and the value is "_": implementations should
    not apply %hh encoding to it even though it is not alphanumeric

  * the encoding is UTF-7 and the value is "ö": the byte sequence would
    be +APY- and implementations should apply %hh escaping only to the
    +, not to the whole sequence or to none of it (depending on whether
    "ö" is considered alphanumeric)

Please change the draft in a way that properly reflects the above and
current implementations. I don't know the exact set of bytes that need
to have %hh encoding applied, but I suspect the set is similar to that
of characters considered reserved in the query string as per RFC 3986.
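
To make the intended order of operations concrete, here is a minimal
sketch in Python: encode the whole string first, then decide per byte
whether to escape. The function name and the SAFE_BYTES set are my own
assumptions for illustration; in particular, the safe set (ASCII
letters, digits, and "*-._") is a guess and not something the draft
defines.

```python
# Hypothetical safe set: ASCII letters, digits, and "*-._".
# This is an assumption for illustration, not taken from the draft.
SAFE_BYTES = frozenset(
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789*-._"
)


def form_urlencode_component(text: str, encoding: str = "utf-8") -> str:
    """Escape one control name or value (illustrative helper)."""
    # Step 1: apply the submission character encoding to the whole string.
    raw = text.encode(encoding)

    # Step 2: decide per byte, not per original character.
    out = []
    for byte in raw:
        if byte == 0x20:
            out.append("+")               # space byte becomes "+"
        elif byte in SAFE_BYTES:
            out.append(chr(byte))         # safe byte is passed through
        else:
            out.append(f"%{byte:02X}")    # everything else becomes %HH
    return "".join(out)


print(form_urlencode_component("_"))           # UTF-8 case from above
print(form_urlencode_component("ö", "utf-7"))  # UTF-7 case from above
```

Under these assumptions the "_" passes through unescaped, and for the
UTF-7 case only the leading + of +APY- is escaped, which is the
behaviour described in the two cases above.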

Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Weinh. Str. 22 · Telefon: +49(0)621/4309674 · http://www.bjoernsworld.de
68309 Mannheim · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Wednesday, 6 September 2006 01:54:07 UTC
