Re: HTTP 1.0&1.1 URL safe characters conflict with HTML? from David W. Morris on 1996-02-13 (ietf-http-wg@w3.org from January to March 1996)

From: David W. Morris <dwm@shell.portal.com>
Date: Tue, 13 Feb 1996 00:48:12 -0800 (PST)
To: Larry Masinter <masinter@parc.xerox.com>
Cc: fielding@avron.ICS.UCI.EDU, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <Pine.SUN.3.90.960213002937.22538A-100000@jobe.shell.portal.com>

On Sun, 11 Feb 1996, Larry Masinter wrote:

> > I'm sorry, but the + is also an encoding character based on RFC 1866
> > and current practice. It encodes another unsafe character, the SPace.
> 
> > We could debate forever whether we need both, but we have both.
> > And I suspect that the + for SPace is an effective optimization since
> > many forms require spaces but few +es.
> 
> Every character encodes something. The character 'A' really is just an
> encoding of the octet 65, after all.
> 
> I'm not sure what your point is, though. I think HTTP servers and
> proxies need to be aware of '%xx' encoding, in that some URLs might
> have extra %xx encodings sent to them, etc.
> 
> On the other hand, HTTP servers may not need to know much about '+'
> except that some HTML user agents do some kind of processing which
> produces them.

A server is the composite of functionality which responds to HTTP
requests. A user-agent/client is the composite of functionality which
generates requests. My paraphrase, of course. If the + is not
escaped by the client, then the server may insert a blank where
it shouldn't when it decodes the request. If the client doesn't
escape %41 in the input and send it as %2541, then the server is
likely to interpret %41 as A. That makes + as likely to break
communication as % and hence unsafe.
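
To make the failure mode above concrete, here is a minimal sketch in
Python. It assumes the server decodes in the form-urlencoded style
described in RFC 1866 (+ becomes a space, %xx becomes an octet), which
the standard library's unquote_plus approximates. The names
client_encode and server_decode are hypothetical, invented only for
this illustration; they do not come from any HTTP implementation.

```python
from urllib.parse import quote_plus, unquote_plus


def client_encode(value: str, escape_specials: bool) -> str:
    """Encode user input for the wire (hypothetical client).

    With escape_specials=True, literal '+' and '%' are escaped first.
    With escape_specials=False, only spaces are turned into '+',
    which models a client that treats '+' and '%' as safe.
    """
    if escape_specials:
        return quote_plus(value)
    return value.replace(" ", "+")


def server_decode(wire: str) -> str:
    """Decode the way a form-processing server would (hypothetical)."""
    return unquote_plus(wire)


for text in ["1+1=2", "see %41"]:
    careless = server_decode(client_encode(text, escape_specials=False))
    careful = server_decode(client_encode(text, escape_specials=True))
    print(repr(text), "unescaped ->", repr(careless),
          "| escaped ->", repr(careful))
```

Under these assumptions, the unescaped "1+1=2" comes back with a blank
in place of the +, and the unescaped "see %41" comes back as "see A".
Only the client that escapes both characters gets its input back
unchanged.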

Dave

Received on Tuesday, 13 February 1996 00:55:41 UTC