Re: HTTP 1.0&1.1 URL safe characters conflict with HTML? from Larry Masinter on 1996-02-11 (ietf-http-wg@w3.org from January to March 1996)

From: Larry Masinter <masinter@parc.xerox.com>
Date: Sun, 11 Feb 1996 01:18:16 PST
To: dwm@shell.portal.com
Cc: fielding@avron.ICS.UCI.EDU, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <96Feb11.011828pst.2733@golden.parc.xerox.com>

+ should be 'reserved'. A character is 'reserved' if its
interpretation differs depending on whether or not it is escaped.
'reserved' characters may appear in URLs, and must be escaped only
when used within a context where they have a reserved meaning and would
otherwise be confused with that meaning. (This depends on the scheme;
for example, '@' has a reserved meaning in 'mailto:', but '/' does
not.)
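
As an illustration of why escaping a reserved character changes its
meaning, here is a minimal Python sketch using the standard
urllib.parse module. The helper names and the sample path are
hypothetical, and '/' is treated as reserved the way a hierarchical
scheme would treat it.

```python
from urllib.parse import unquote

def split_then_unescape(path):
    # Split on the reserved '/' first, then unescape each segment,
    # so an escaped %2F stays inside a single segment.
    return [unquote(segment) for segment in path.split("/")]

def unescape_then_split(path):
    # Unescape first: the escaped %2F becomes an ordinary '/'
    # and is then mistaken for a segment delimiter.
    return unquote(path).split("/")

sample = "a%2Fb/c"
print(split_then_unescape(sample))  # ['a/b', 'c']
print(unescape_then_split(sample))  # ['a', 'b', 'c']
```

The two results differ, which is the sense in which '/' does not have
the same interpretation escaped and unescaped.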

% is special, because it *is* the encoding character. It's labelled
'unsafe' because unsafe characters may not appear in URLs. You can say
'% may not appear in URLs', but of course with the proviso "except for
its use within an escape sequence".
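
A small Python sketch of that rule, assuming the standard
urllib.parse module and a made-up sample string: a literal '%' in the
data has to be written as the escape sequence %25, so that every '%'
left in the URL introduces an escape.

```python
from urllib.parse import quote, unquote

raw = "100% cotton"

# quote() escapes the literal '%' as %25 and the space as %20.
escaped = quote(raw)
print(escaped)  # 100%25%20cotton

# Unescaping restores the original data exactly.
print(unquote(escaped))  # 100% cotton
```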

HTTP is unusual in that the HTTP protocol is defined to take the part
of the URL after the hostpart and just send it, unmodified; this
behavior is different from that of gopher:, ftp:, mid:, etc. in that
most other URL schemes call for the constituent parts to be unescaped
before transmission.
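
The difference can be sketched in Python as follows. This is only an
illustration under stated assumptions: the function names, the example
URL, and the HTTP/1.0 request line format chosen here are mine, and
urllib.parse stands in for whatever URL parser a client or proxy
actually uses.

```python
from urllib.parse import urlsplit, unquote

def request_line_unmodified(url):
    # Send the part after the host exactly as written in the URL.
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return "GET " + target + " HTTP/1.0"

def request_line_overprocessed(url):
    # What an intermediary doing 'too much' processing might send:
    # the path is unescaped before being forwarded.
    parts = urlsplit(url)
    target = unquote(parts.path) or "/"
    if parts.query:
        target += "?" + parts.query
    return "GET " + target + " HTTP/1.0"

url = "http://example.com/a%2Fb/c?x=1"
print(request_line_unmodified(url))     # GET /a%2Fb/c?x=1 HTTP/1.0
print(request_line_overprocessed(url))  # GET /a/b/c?x=1 HTTP/1.0
```

The second request names a different resource path than the one the
URL author wrote.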

I recall having no end of problems with proxy servers that did 'too
much' processing on the URLs. I wonder if it is possible to reference
RFC 1738 and RFC 1808 rather than replicate but modify the information
contained therein.

Received on Sunday, 11 February 1996 01:22:35 UTC