Re: partial URLs ? (was from William C. Cheng on 1995-12-21 (www-html@w3.org from December 1995)

From: William C. Cheng <william@cs.columbia.edu>
Date: Wed, 20 Dec 1995 19:33:15 -0500
To: mwm@contessa.phone.net (Mike Meyer)
Cc: www-html@w3.org
Message-Id: <199512210033.TAA01026@age.cs.columbia.edu>
| > I like Dan Connolly's response that a well-behaved Client should NOT
| > request any URL with ../ in it because it may get a 403 response.
| 
| I don't like that argument (and I didn't see it from Dan) - it's very
| Unix-centric, and doesn't generalize. After all, if you can't use some
| string in a URL because it MAY get a 403 response, then I can add a
| single line to my server config that would imply you shouldn't use any
| text string in a URL.

The point is that you CAN use the string in a URL, but the browser should
do some processing before sending it to a server (just as a browser
should replace "&amp;" with "&" before sending it to the server, as discussed
in another message).
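
To make the "&amp;" case concrete, here is a minimal sketch in Python. The
HREF value is made up for illustration, and html.unescape from the standard
library stands in for whatever entity decoding a real browser does.

```python
from html import unescape

# Hypothetical HREF value exactly as it appears in the HTML source.
href_in_markup = "/cgi-bin/query?x=1&amp;y=2"

# Decode character entity references before treating the value as a URL.
url_to_request = unescape(href_in_markup)
print(url_to_request)
```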

| What behavior did Dan (or you) recommend if I type in a URL with a
| "../" in it by hand? Not doing what the user asked you to to avoid
| vague security problems on someone else's machine is pretty clearly
| broken. Escaping the URL is acceptable, and might even produce the
| correct results.

Dan's message to www-html is included at the end.  I think he is
suggesting that a browser should process the "../" by collapsing it
against the base URL.  At the end of his message, it seems to me that he
is also suggesting that maybe this should go into the HTTP spec.
--
Bill Cheng // Guest at Columbia University Computer Science Department
william@CS.COLUMBIA.EDU      ...!{uunet|ucbvax}!cs.columbia.edu!william
WWW Home Page: <URL:http://www.cs.columbia.edu/~william>


(Sorry if you have seen this already.)
--------------------------> included message <--------------------------
Resent-Message-Id: <199512201539.KAA27847@www19.w3.org>
Message-Id: <m0tSQNg-0002S3C@beach.w3.org>
To: Jon Wallis <j.wallis@wlv.ac.uk>
Cc: BearHeart/Bill Weinman <BearHeart@bearnet.com>, www-html@w3.org
Cc: http-wg@cuckoo.hpl.hp.com
Subject: Re: partial URLs ? (was <p> ... </p>) 
In-Reply-To: Your message of "Wed, 20 Dec 1995 11:31:57 GMT."
             <m0tSMkY-000oANC@ccug.wlv.ac.uk> 
Mime-Version: 1.0
Content-Id: <6337.819473076.1@beach.w3.org>
Date: Wed, 20 Dec 1995 10:24:36 -0500
From: "Daniel W. Connolly" <connolly@beach.w3.org>
Resent-From: www-html@w3.org
X-Mailing-List: <www-html@w3.org> archive/latest/2018
X-Loop: www-html@w3.org
Sender: www-html-request@w3.org
Resent-Sender: www-html-request@w3.org
Precedence: list
Content-Type: text/plain; charset="us-ascii"
Content-Length: 2397

In message <m0tSMkY-000oANC@ccug.wlv.ac.uk>, Jon Wallis writes:
>At 13:19 19/12/95 -0600, BearHeart/Bill Weinman wrote:
>>
>>At 10:40 am 12/19/95 -0800, Walter Ian Kaye wrote:
>>><A HREF="index.html"><IMG SRC="../gifs/btnhome3.gif" ALT="[Home]"
>border=1></A>
>>><A HREF="../map.html"><IMG SRC="../gifs/btnmap3.gif" ALT="[Index]"
>>
>>>(I'm gonna be changing the form and cgi soon, btw, cuz Lynx doesn't like
>>>partial URLs -- tho' Netscape handles this form perfectly.)
>>
>>   The problem with the partial URLs may be the "../" references.
>>
>>   Some servers, and perhaps some browsers too, disallow them because 
>>they've been abused to get around security measures. 
>
>That really shouldn't be a problem if the system is set up right - but since
>so many systems are poorly set up in terms of security  I can believe it.

I think there are two issues that are getting confused here:
	(1) whether it's OK to use ../../ in an HREF or SRC attribute
	in an HTML document,
	(2) whether it's OK to _send_ ../../ in the path field of
	an HTTP request.

(1) is cool, (2) is not.

For example, if the document above was fetched from http://www.foo.com/a/b/c.html,
then to fetch the [Home] image, the client must combine the value of the SRC
attribute with the base URL as per RFC1808, yielding:

	http://www.foo.com/a/gifs/btnhome3.gif
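
As a rough illustration of that combining step, Python's
urllib.parse.urljoin can stand in for the client. Note the assumption:
urljoin follows the later generic URI rules rather than RFC1808 itself, but
for these simple relative references the results are the same.

```python
from urllib.parse import urljoin

# Base URL of the document that contains the relative references.
base_url = "http://www.foo.com/a/b/c.html"

# The three relative references from the HTML quoted above.
for reference in ("index.html", "../map.html", "../gifs/btnhome3.gif"):
    print(reference, "->", urljoin(base_url, reference))

home_image_url = urljoin(base_url, "../gifs/btnhome3.gif")
```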

To access the resource at that address, it makes a TCP connection to port 80
of www.foo.com, and sends:

	GET /a/gifs/btnhome3.gif HTTP/1.0
	Accept: image/*
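
A small sketch of how a client might derive that request from the resolved
URL. The build_request helper is hypothetical, it ignores query strings,
and it only builds the text; nothing is sent over the network.

```python
from urllib.parse import urlsplit

def build_request(url, accept="image/*"):
    """Return (host, port, request_text) for an HTTP/1.0 GET of url."""
    parts = urlsplit(url)
    port = parts.port or 80          # default HTTP port when none is given
    path = parts.path or "/"         # an empty path is requested as "/"
    request_text = f"GET {path} HTTP/1.0\r\nAccept: {accept}\r\n\r\n"
    return parts.hostname, port, request_text

host, port, request_text = build_request("http://www.foo.com/a/gifs/btnhome3.gif")
print(host, port)
print(request_text)
```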

What's _not_ cool is to try to sidestep the processing of .. on the client side;
that is, to just combine the base and SRC into:

	http://www.foo.com/a/b/../gifs/btnhome3.gif

(which is _not_ a well-formed HTTP url) and send:

	GET /a/b/../gifs/btnhome3.gif HTTP/1.0

This is illegal because it is a potential security risk. Consider a server
whose document root is /usr/local/etc/httpd/docs/ and a client who sends:

	GET /../../../../etc/passwd HTTP/1.0
	Accept: text/plain

a naive server implementation might just do:
	fopen("/usr/local/etc/httpd/docs//../../../../etc/passwd", "r")
and give away a bunch of sensitive info.
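
To see why the naive concatenation is dangerous, here is a minimal sketch
in Python. The helper names are hypothetical, and posixpath.normpath only
collapses ".." textually (it does not follow symlinks), so treat this as an
illustration of the idea rather than a complete defense.

```python
import posixpath

DOC_ROOT = "/usr/local/etc/httpd/docs/"

def naive_path(request_path):
    # What the naive server does: glue the request path onto the root.
    return DOC_ROOT + request_path

def is_inside_root(fs_path):
    # Collapse ".." segments, then check the result is still under the root.
    root = posixpath.normpath(DOC_ROOT)
    resolved = posixpath.normpath(fs_path)
    return resolved == root or resolved.startswith(root + "/")

bad = naive_path("/../../../../etc/passwd")
good = naive_path("/a/gifs/btnhome3.gif")
print(bad, is_inside_root(bad))
print(good, is_inside_root(good))
```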

Instead, any server that sees /../ in the HTTP path is supposed to
issue a 403 Forbidden response. (Is this in the HTTP specs somewhere?
YIKES! I can't find it in draft-ietf-http-v10-spec-02.txt!!!)
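
A sketch of the server-side check described above, assuming the server
simply refuses any request path with a ".." segment. The function name is
hypothetical, the 200 is only a placeholder for normal handling, and
percent-decoding first is an extra precaution not discussed in this thread.

```python
from urllib.parse import unquote

def status_for_path(request_path):
    """Return 403 for paths containing a '..' segment, else 200."""
    # Decode %2e%2e and similar so an escaped ".." is caught as well.
    segments = unquote(request_path).split("/")
    if ".." in segments:
        return 403  # Forbidden
    return 200      # placeholder: real handling would go on from here

print(status_for_path("/a/b/../gifs/btnhome3.gif"))
print(status_for_path("/a/gifs/btnhome3.gif"))
```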

HTTP-WG folks: this should be addressed in the HTTP 1.0 spec, no?

Dan
Received on Wednesday, 20 December 1995 19:34:23 UTC