- From: NARUSE, Yui <naruse@airemix.jp>
- Date: Wed, 09 Sep 2009 23:33:39 +0900
Anne van Kesteren wrote:
> On Tue, 08 Sep 2009 21:40:22 +0200, NARUSE, Yui <naruse at airemix.jp> wrote:
>> First is about 4.10.16.4 URL-encoded form data.
>> http://www.whatwg.org/specs/web-apps/current-work/#application/x-www-form-urlencoded-encoding-algorithm
>>
>> In this algorithm at 6.2.1,
>> "SP, *, -, ., 0 .. 9, A .. Z, _, a .. z" is not escaped.
>> But many other specs which use application/x-www-form-urlencoded refers
>
> Which other specifications?

The following specifications. (Sorry, some of them refer to earlier RFCs.)

XForms 1.0
http://www.w3.org/TR/xforms/#serialize-urlencode
"then non-ASCII and reserved characters (as defined by [RFC 2396] as amended by subsequent documents in the IETF track) are escaped"
-> so RFC3986

HTML 4
http://www.w3.org/TR/html401/interact/forms.html#h-17.13.4.1
"reserved characters are escaped as described in [RFC1738]"

RFC1738
http://www.faqs.org/rfcs/rfc1738.html
  unreserved = alpha | digit | safe | extra
  safe       = "$" | "-" | "_" | "." | "+"
  extra      = "!" | "*" | "'" | "(" | ")" | ","

TAG Finding
"refer to section 2.1 of [RFC2396]."
http://www.w3.org/2001/tag/doc/whenToUseGet.html#i18n

RFC2396
http://www.faqs.org/rfcs/rfc2396.html
  unreserved = alphanum | mark
  mark       = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

WSDL 2.0
http://www.w3.org/TR/wsdl20-bindings/#_http_x-www-form-urlencoded
"Replacement values falling outside the range (ALPHA and DIGIT below are defined as per [IETF RFC 4234]): ALPHA | DIGIT | "-" | "." | "_" | "~" | "!" | "$" | "&" | "'" | "(" | ")" | "*" | "+" | "," | ";" | "=" | ":" | "@", MUST be percent-encoded."

>> URI's unreserved. And it in RFC3986 is
>>   unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
>> Why ~ is escaped and * is not escaped?
>
> What do browsers do?

IE8
QUERY_STRING: t=+%21%5C%22%5C%23%24%25%26%27%28%29*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F@ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
not escaped: *-.@_

Firefox 3.5
QUERY_STRING: t=+%21%5C%22%5C%23%24%25%26%27%28%29*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
not escaped: *-._

Chrome2
QUERY_STRING: t=+%21%5C%22%5C%23%24%25%26%27%28%29*%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
not escaped: *-._

Opera9
QUERY_STRING: t=+%21%5C%22%5C%23%24%25%26%27%28%29%2A%2B%2C-.%2F0123456789%3A%3B%3C%3D%3E%3F%40ABCDEFGHIJKLMNOPQRSTUVWXYZ%5B%5C%5C%5D%5E_%60abcdefghijklmnopqrstuvwxyz%7B%7C%7D%7E
not escaped: -._

Hmm, Firefox and Chrome follow this, IE adds @, Opera removes *.
If this spec takes the safer side, * could also be escaped.

>> Third is about Web addresses in HTML 5. (this spec is also this ML?)
>> http://www.w3.org/html/wg/href/draft
>
> You want public-iri at w3.org or public-html at w3.org for that draft.

Thanks, I'll send it.

-- 
NARUSE, Yui <naruse at airemix.jp>
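For readers who want to reproduce the comparison, here is a minimal Python sketch of the escaping rule discussed above. It is not taken from any of the specs or browsers. The names `form_urlencode`, `HTML5_DRAFT_SAFE`, `BROWSER_SAFE`, and `RFC3986_UNRESERVED` are hypothetical, the per-browser sets are simply read off the "not escaped" lines in the results above, and UTF-8 is assumed as the form encoding.

```python
import string

ALNUM = set(string.ascii_letters + string.digits)

# Characters left unescaped by step 6.2.1 of the draft as quoted above
# (SP is handled separately: it is emitted as "+").
HTML5_DRAFT_SAFE = ALNUM | set("*-._")

# RFC 3986 "unreserved" set, for comparison.
RFC3986_UNRESERVED = ALNUM | set("-._~")

# Unescaped sets inferred from the "not escaped" lines in the test results
# (an assumption based only on that single test input).
BROWSER_SAFE = {
    "IE8": HTML5_DRAFT_SAFE | {"@"},
    "Firefox 3.5": HTML5_DRAFT_SAFE,
    "Chrome2": HTML5_DRAFT_SAFE,
    "Opera9": HTML5_DRAFT_SAFE - {"*"},
}


def form_urlencode(value, safe=HTML5_DRAFT_SAFE, encoding="utf-8"):
    """Encode one form value: SP -> "+", bytes in `safe` as-is, others %XX."""
    out = []
    for byte in value.encode(encoding):
        ch = chr(byte)
        if ch == " ":
            out.append("+")
        elif byte < 0x80 and ch in safe:
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


if __name__ == "__main__":
    sample = "a b*~@"
    print("draft:", form_urlencode(sample))
    for name, safe in BROWSER_SAFE.items():
        print(name + ":", form_urlencode(sample, safe))
    # The two sets differ only in "*" and "~", which is the question raised above.
    print("draft only:", sorted(HTML5_DRAFT_SAFE - RFC3986_UNRESERVED))
    print("RFC 3986 only:", sorted(RFC3986_UNRESERVED - HTML5_DRAFT_SAFE))
```

With the sample value `a b*~@`, the draft set gives `a+b*%7E%40`, the IE8 set gives `a+b*%7E@`, and the Opera9 set gives `a+b%2A%7E%40`. Escaping `*` as well (the Opera9 set) is the "safer side" choice mentioned above, since a decoder accepts `%2A` whether or not it treats `*` as safe.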
Received on Wednesday, 9 September 2009 07:33:39 UTC