URLencoding.

From: Dave Bridger <dbridger@inlink.com>
Date: Thu, 6 Apr 2000 22:24:18 -0500
Message-ID: <000301bfa040$d0e62cc0$0100a8c0@pent100.bridger>
To: <www-html@w3.org>
I am trying to determine exactly which special characters should be escaped
as hex (%XX) during urlencoding and which should not. The HTML 4.01
Specification is very unclear and RFC1738 does not help at all. The mailing list
archive turns up only a partial thread, which only partly helps to clarify the
situation.

A quick Web search indicates that others are also unclear about urlencoding.
The prevailing practice seems to be to escape everything except alphanumerics
and the space character, which becomes +. For example, see the Java URLEncoder
class at:

http://www.javasoft.com/products/jdk/1.0.2/api/java.net.URLEncoder.html
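
To make the practice concrete, here is a minimal sketch of that "escape
everything except alphanumerics" rule. It is an illustration only, not the
code of the Java class linked above. The function name overencode, the SAFE
set, and the choice of UTF-8 for non-ASCII input are all assumptions of this
sketch.

```python
import string

# Assumption: only ASCII letters and digits pass through unescaped.
SAFE = set(string.ascii_letters + string.digits)


def overencode(text: str) -> str:
    """Escape every byte except ASCII alphanumerics; a space becomes '+'."""
    out = []
    # Assumption: the text is converted to bytes as UTF-8 before escaping.
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in SAFE:
            out.append(ch)                       # letters and digits stay as-is
        elif ch == " ":
            out.append("+")                      # space is the one special case
        else:
            out.append("%{:02X}".format(byte))   # everything else becomes %XX
    return "".join(out)


print(overencode("name=Dave Bridger&x=a/b"))
# prints: name%3DDave+Bridger%26x%3Da%2Fb
```

Note that this escapes characters such as - and . that arguably do not need it,
which is exactly the overencoding in question.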

Fortunately, RFC1738 is permissive, so the overencoding practice will not harm
anything.
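
As a rough check of that point, the sketch below decodes the same value
written two ways: once with - and . left literal, and once with them escaped
as %2D and %2E. The sample strings are made up for illustration, and the
decoder used is Python's standard urllib.parse.unquote_plus, which is only
one decoder among many.

```python
from urllib.parse import unquote_plus

# Hypothetical sample value, written with and without the extra escapes.
minimal = "Dave+Bridger-1.0"        # '-' and '.' left literal
over = "Dave+Bridger%2D1%2E0"       # '-' and '.' escaped as %2D and %2E

decoded_minimal = unquote_plus(minimal)
decoded_over = unquote_plus(over)

# Both spellings should decode to the same text.
print(decoded_minimal == decoded_over)
print(decoded_over)
```

This only shows that escaping more than necessary is harmless for characters
with no reserved meaning. It does not say which characters are required to be
escaped.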

Can anyone give me a definitive answer as to which characters need not be
escaped?

Perhaps Section 17.13.4 (Form content types) of the HTML 4.01 Specification
should be clarified.

TIA
--Dave
Received on Thursday, 6 April 2000 23:23:32 GMT
