
Re: URLencoding.

From: Nir Dagan <nir@nirdagan.com>
Date: Thu, 06 Apr 2000 23:38:18 -0400
Message-Id: <4.2.2.20000406233201.00a38a80@nirdagan.com>
To: "Dave Bridger" <dbridger@inlink.com>, <www-html@w3.org>
You may prefer to check the latest URI syntax RFC, RFC 2396:
http://www.ietf.org/rfc/rfc2396.txt
It is very clear on the hex escaping issue, and it also makes
some changes relative to RFC1738.
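
As a rough illustration of the hex escaping rule, here is a minimal
Python sketch. It assumes the "unreserved" set from RFC 2396 section 2.3
(alphanumerics plus the marks - _ . ! ~ * ' ( ) ) and escapes every
other byte as %HH. The function name escape_rfc2396 and the choice of
UTF-8 for turning characters into bytes are my own assumptions; the RFC
does not prescribe a character encoding, and reserved characters that
act as delimiters in a real URI would need to be left alone by the caller.

```python
import string

# Unreserved characters per RFC 2396 section 2.3: alphanumerics plus "mark".
UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")


def escape_rfc2396(text, encoding="utf-8"):
    """Percent-escape every byte that is not an unreserved character.

    The encoding parameter is an assumption of this sketch, not part of
    the RFC.
    """
    out = []
    for byte in text.encode(encoding):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)  # safe to leave as-is
        else:
            out.append("%{:02X}".format(byte))  # two-digit hex escape
    return "".join(out)
```

With this sketch, escape_rfc2396("a b&c=d") gives "a%20b%26c%3Dd", while
a string made only of unreserved characters comes back unchanged.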

Regards,
Nir Dagan

At 10:24 PM 4/6/00 -0500, Dave Bridger wrote:
>I am attempting to determine exactly which special characters should be escaped
>to Hex and which should not be escaped during urlencoding. The HTML 4.01
>Specification is very unclear and RFC1738 does not help at all. The mailing list
>archive produces only a partial thread, which only partly helps to clarify the
>situation.
>
>A quick Web search indicates that others are also not clear about urlencoding.
>The prevailing practice seems to be to escape everything except alphanumerics,
>with space becoming +. For example, see the Java urlencoding class at:
>
>http://www.javasoft.com/products/jdk/1.0.2/api/java.net.URLEncoder.html
>
>Fortunately RFC1738 is permissive so the overencoding practice will not harm
>anything.
>
>Can anyone give me a definitive answer as to which characters need not be
>escaped?
>
>Perhaps Section 17.3.4 of the HTML Spec should be clarified.
>
>TIA
>--Dave
>
>
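
For comparison, the "overencoding" practice described in the quoted
message can be sketched in Python as follows. This is only an
illustration of that description (escape everything except
alphanumerics, with space becoming +), not the behaviour of any
particular library; the names overencode and round_trips and the use of
UTF-8 are assumptions of the sketch.

```python
import string
from urllib.parse import unquote_plus

ALPHANUMERIC = set(string.ascii_letters + string.digits)


def overencode(text, encoding="utf-8"):
    """Escape every byte except alphanumerics; a space becomes '+'."""
    out = []
    for byte in text.encode(encoding):
        ch = chr(byte)
        if ch in ALPHANUMERIC:
            out.append(ch)
        elif ch == " ":
            out.append("+")  # form-style convention for spaces
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


def round_trips(text):
    """Check that a standard form decoder recovers the original text."""
    return unquote_plus(overencode(text)) == text
```

Here overencode("a b-c") gives "a+b%2Dc". The round_trips helper shows
why escaping more than strictly necessary is harmless for a decoder
that unescapes every %HH sequence: the decoded value is the same.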

===================================
Nir Dagan
Assistant Professor of Economics
Brown University 
Providence, RI
USA

http://www.nirdagan.com
mailto:nir@nirdagan.com
tel:+1-401-863-2145
Received on Thursday, 6 April 2000 23:36:06 GMT
