I am concerned about the definition of application/x-www-form-urlencoded. HTML 2.0 and HTML 4.01 both say:

    space characters are replaced by `+', and then reserved characters are escaped as described in RFC 1738: non-alphanumeric characters are replaced by `%HH'...

Which is it, reserved characters or non-alphanumeric characters?

Either way, the specified process is not reversible, because it performs %HH escaping *after* changing spaces to plus-signs. For example, the values "foo+bar" and "foo bar" map to the same thing: either "foo+bar" (if plus-sign is not escaped) or "foo%2Bbar" (if plus-sign is escaped).

As far as I know, browsers always violate the spec and do something reversible instead: they do the %HH escaping *before* changing spaces to plus-signs, and they include plus-sign in the set of characters to be escaped. That way, the server can distinguish between "foo%2Bbar" (which means "foo+bar") and "foo+bar" (which means "foo bar").

Am I correctly understanding the spec, that the specified encoding is non-reversible? Is my observation about browsers accurate, that in practice they always use a reversible encoding? Should this discrepancy be addressed in some W3C note?

The XForms draft resolves the reserved/non-alphanumeric question, but retains the non-reversibility:

    space characters are replaced by +, and then non-ASCII and reserved characters (as defined by [RFC 2396] as amended by subsequent documents in the IETF track) are escaped by replacing the character with one or more octets of the UTF-8 representation of the character, with each octet in turn replaced by %HH...

AMC

Received on Sunday, 21 September 2003 20:37:55 GMT
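
The ordering problem described in the message can be reproduced with a minimal sketch in Python. The function names `escape_as_specified` and `escape_as_browsers` and the `escape_plus` parameter are hypothetical, introduced only for illustration. The sketch assumes the standard library's `urllib.parse.quote` is an acceptable stand-in for "%HH escaping"; it is not a claim about how any particular browser is implemented.

```python
from urllib.parse import quote, unquote_plus


def escape_as_specified(value: str, escape_plus: bool = True) -> str:
    """Literal reading of the quoted text: spaces -> '+', THEN %HH escaping."""
    step1 = value.replace(" ", "+")
    # Assumption: the ambiguity about whether '+' is escaped is modelled
    # by the escape_plus flag (safe="" escapes it, safe="+" leaves it).
    return quote(step1, safe="" if escape_plus else "+")


def escape_as_browsers(value: str) -> str:
    """Order the post attributes to browsers: %HH escaping first
    (with '+' in the escaped set), THEN spaces -> '+'."""
    escaped = quote(value, safe="")      # ' ' -> %20, '+' -> %2B
    return escaped.replace("%20", "+")   # only escaped spaces become '+'


if __name__ == "__main__":
    for v in ("foo bar", "foo+bar"):
        print(repr(v),
              escape_as_specified(v),
              escape_as_specified(v, escape_plus=False),
              escape_as_browsers(v))
```

Under the specified order, both inputs collapse to the same string whichever way the plus-sign question is settled, so no decoder can tell them apart. Under the browser order, the two inputs stay distinct, and `urllib.parse.unquote_plus` recovers each original value.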