Re: Comments on the HTTP/1.0 draft. from Chuck Shotton on 1994-11-29 (ietf-http-wg@w3.org from October to December 1994)

From: Chuck Shotton <cshotton@oac.hsc.uth.tmc.edu>
Date: Tue, 29 Nov 1994 14:58:07 -0600
To: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <ab013ccf0a0210044e22@[129.106.30.2]>

>>With regard to comment number 2, the encoding of object-body parts, there
>>is a non-trivial ambiguity in RFC 1630 regarding the encoding of spaces as
>>"+", and where this is allowed. For WWW clients that encode object-bodies
>>using the URL-encoding scheme, behavior is inconsistent. Some clients
>>encode specials in the object-body text using %xx hex encodings
>>exclusively. Others use %xx encodings for all specials except space, and
>>encode spaces as "+".
>
>I disagree strongly with this interpretation.  A + in search terms
>represents a keyword separator, and has nothing to do with a space,
>which is (of course) represented as %20.  The fact that some WWW
>clients choose to have a space be the device by which the user
>communicates keyword separations to the client is irrelevant; it could
>just as well be a tab, or a comma, or clicking in a different box.
>(The fact that some WWW clients don't allow any way for a keyword to
>contain a space reflects a lack of flexibility.)

Actually, we agree. "+" and space are NOT equivalent. The problem is that
Mosaic and its derivative works (including Netscape, derived from the
programmers rather than the source) all encode spaces as + in object-body
parts. The "+" token is very clearly intended to be a search term
separator, as specified in the URI RFC. That "+" is also how the original
Mosaic's data entry dialog for searches represents spaces is a coincidence.
As you say, Mosaic could have prompted repeatedly for single search terms,
concatenating them with "+".

However, we are talking about slightly different subjects. I am
specifically requesting a clarification on what it means to have
object-body content that uses "URL-Encoding", and whether or not the usage
of "+" as an encoding for spaces is acceptable in an object-body part. I
have always felt that it is incorrect to use "+" for ANYTHING but keyword
separators in the search term portion of a URL. "+" in an object-body that
is URL-encoded should be represented as %2B and spaces as %20. This would
avoid any confusion with CGIs that interpret + as space, though it would do
little to keep clients from emitting a bare "+" in the first place.
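
To make the ambiguity concrete, here is a minimal sketch of the two client
behaviors and the two ways a CGI might decode them. It assumes Python's
standard urllib.parse module as a stand-in for whatever a client or CGI
actually uses; the helper names and the sample body are hypothetical and
chosen only for illustration.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus


def encode_strict(text: str) -> str:
    """Hypothetical helper: %xx for every special (space -> %20, '+' -> %2B)."""
    return quote(text, safe="")


def encode_plus_for_space(text: str) -> str:
    """Hypothetical helper: same, except a space is emitted as '+'."""
    return quote_plus(text)


body = "C++ programming"  # sample object-body text, an assumption

strict = encode_strict(body)          # every special as %xx
mosaic = encode_plus_for_space(body)  # space as '+'

# A CGI that treats '+' literally and one that treats '+' as a space
# agree on the strict form but disagree on the '+'-for-space form.
strict_agrees = unquote(strict) == unquote_plus(strict) == body
mosaic_agrees = unquote(mosaic) == unquote_plus(mosaic)

print(strict, mosaic, strict_agrees, mosaic_agrees)
```

With the strict form, both decoders recover the original text. With the
"+"-for-space form, only a decoder that maps "+" back to a space does.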

In the grand scheme of things, this is a minor issue. But clarifying it can
make life a little easier for CGI authors and client implementors.

--_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
Chuck Shotton                             \
Assistant Director, Academic Computing     \   "Shut up and eat your
U. of Texas Health Science Center Houston   \    vegetables!!!"
cshotton@oac.hsc.uth.tmc.edu  (713) 794-5650 \
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-\-_-_-_-_-_-_-_-_-_-_-_-_-

Received on Tuesday, 29 November 1994 12:59:49 UTC