Re: URLencoding.

From: Walter Ian Kaye <walter@natural-innovations.com>
Date: Fri, 7 Apr 2000 00:50:12 -0700
Message-Id: <v04220800b5133fe6edb4@[63.193.119.97]>
To: www-html@w3.org

At 10:24p -0500 04/06/00, Dave Bridger didst inscribe upon an 
electronic papyrus:
>A quick Web search indicates that others are also not clear about urlencoding.
>The prevailing practice seems to be to escape everything except alphanumerics
>and space which becomes +.

That would be very bad and severely broken.

>For example, see the JAVA urlencoding class at:
>
>http://www.javasoft.com/products/jdk/1.0.2/api/java.net.URLEncoder.html
>
>Fortunately RFC1738 is permissive so the overencoding practice will not harm
>anything.

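It depends on where the encoding is being done. Escaping must happen 
within the components of a URL, not across the delimiters between 
them. For example, http://www.acme.com%2Findex.html would be broken, 
because the "/" that separates the host from the path has been escaped.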
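
To illustrate the point, here is a minimal sketch in Python, assuming the standard urllib.parse module. The path segments are made up for illustration; only the host comes from the example above.

```python
from urllib.parse import quote

host = "www.acme.com"
# Hypothetical path segments; the second one contains a literal "/".
segments = ["my docs", "a/b.html"]

# Escape each segment on its own (safe="" so a "/" inside a segment is
# escaped too), then join the results with unescaped "/" delimiters.
path = "/".join(quote(seg, safe="") for seg in segments)
good = "http://" + host + "/" + path

# Escaping across components also escapes the delimiter between host
# and path, which produces the broken form shown above.
bad = "http://" + quote(host + "/" + "index.html", safe="")

print(good)  # http://www.acme.com/my%20docs/a%2Fb.html
print(bad)   # http://www.acme.com%2Findex.html
```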

>Can anyone give me a definitive answer as to which characters need not be
>escaped?

Uhh... it depends. <g> There are characters which must always be 
escaped, and there are others which must only be escaped in certain 
places. It also depends on whether you're talking about CGI query 
strings, and on whether the URL appears within an HTML link or 
elsewhere (such as within email or in a database).
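
As a rough sketch of how the context changes the result, the same value can be escaped three ways in Python, assuming the standard urllib.parse and html modules. The search.cgi URL and its field names are hypothetical.

```python
from html import escape
from urllib.parse import quote, quote_plus

value = "fish & chips"

# Inside a path segment, a space becomes %20.
as_path = quote(value, safe="")

# Inside a form-style query string, a space becomes "+".
as_query = quote_plus(value)

# Hypothetical CGI URL with two fields separated by "&".
url = "http://www.acme.com/search.cgi?q=" + as_query + "&lang=en"

# Inside an HTML attribute, the "&" separator must itself be written
# as &amp;. In email or a database the URL would be stored as-is.
href = '<a href="' + escape(url, quote=True) + '">search</a>'

print(as_path)   # fish%20%26%20chips
print(as_query)  # fish+%26+chips
print(href)
```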

AOL Instant Messenger is overly conservative -- it doesn't allow 
semicolons in a URL, even though they're valid (and even recommended 
in RFC 1866 as an alternative to "&" for separating form fields).
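
A small sketch of why the semicolon is convenient, again assuming Python's standard library: a query string that uses ";" between fields needs no extra escaping inside an HTML link, and splitting it by hand is straightforward. The URL is the one from the signature below; the manual split is only an illustration, not how any particular CGI library does it.

```python
from html import escape
from urllib.parse import unquote_plus, urlsplit

url = "http://www.natural-innovations.com/keymail.cgi?key=wik;ref=www-html%20msg"

# With ";" as the field separator there is no "&" to rewrite as &amp;,
# so the URL can go into an HTML attribute unchanged.
unchanged_in_html = escape(url, quote=True) == url

# Split the query string on ";" and unescape each name and value.
query = urlsplit(url).query
fields = {}
for field in query.split(";"):
    name, _, value = field.partition("=")
    fields[unquote_plus(name)] = unquote_plus(value)

print(unchanged_in_html)  # True
print(fields)             # {'key': 'wik', 'ref': 'www-html msg'}
```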


-Walter
  http://www.natural-innovations.com/keymail.cgi?key=wik;ref=www-html%20msg
Received on Friday, 7 April 2000 03:51:06 GMT