
Re: partial URLs ? (was <p> ... </p>)

From: Arjun Ray <aray@pipeline.com>
Date: Wed, 20 Dec 1995 16:45:19 -0500 (EST)
To: "Daniel W. Connolly" <connolly@beach.w3.org>
Cc: Jon Wallis <j.wallis@wlv.ac.uk>, BearHeart/Bill Weinman <BearHeart@bearnet.com>, www-html@w3.org, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <Pine.3.89.9512201603.A3574-0100000@alpha>


On Wed, 20 Dec 1995, Daniel W. Connolly wrote:

> I think there are two issues that are getting confused here:
> 	(1) whether it's OK to use ../../ in an HREF or SRC attribute
> 	in an HTML document,
> 	(2) whether it's OK to _send_ ../../ in the path field of
> 	an HTTP request.
> 
> (1) is cool, (2) is not.

Yup. And that's what the specs should say, I suppose. There's some stuff
in RFC 1738 and the HTTP spec about URIs and "absolute paths". On that
ground, we could say that anything with ".."  in it is non-compliant if
included in the Request Line. (Probably needs elaboration, though.)
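
For what it's worth, here is a rough sketch of the client doing its share, so
that ".." never reaches the Request Line. It leans on Python's standard
urllib.parse; the base document name "index.html" and the function name are
made up for illustration, and none of this comes from the specs themselves.

```python
from urllib.parse import urljoin, urlsplit


def request_line_for(base_url: str, href: str) -> str:
    """Resolve href against base_url on the client, then build the Request Line."""
    # urljoin performs the relative-reference resolution, collapsing "..".
    absolute = urljoin(base_url, href)
    path = urlsplit(absolute).path or "/"
    return "GET " + path + " HTTP/1.0"


# Hypothetical base document under /a/b/ on the example host from this thread.
print(request_line_for("http://www.foo.com/a/b/index.html",
                       "../gifs/btnhome3.gif"))
```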

> What's _not_ cool is to try to sidestep the processing of .. on the
> client side; that is, to just combine the base and HREF into:
> 
> 	http://www.foo.com/a/b/../gifs/btnhome3.gif
> 
> (which is _not_ a well-formed HTTP url) and send:
> 
> 	GET /a/b/../gifs/btnhome3.gif HTTP/1.0
> 
> This is illegal because it is a potential security risk. Consider a server
> whose document root is /usr/local/etc/httpd/docs/ and a client who sends:
> 
> 	GET /../../../../etc/passwd HTTP/1.0
> 	Accept: text/plain
> 
> a naive server implementation might just do:
> 	fopen("/usr/local/etc/httpd/docs//../../../../etc/passwd")
> and give away a bunch of sensitive info.
> 
> Instead, any server that sees /../ in the HTTP path is supposed to
> issue a 403 Forbidden response. (Is this in the HTTP specs somewhere?
> YIKES! I can't find it in draft-ietf-http-v10-spec-02.txt!!!)

I think this is illegal simply because it's not a well-formed URL. The 
question, then, is what the server should do about it.

(1) The euphemism is "server tolerance of clients". The truth, of course, 
is buggy client software. As far as server tolerance goes, it could try 
to normalize the path. But even though RFC 1738 does allow for 
hierarchical interpretations of paths in some schemes (HTTP included), 
there's nothing to suggest that this path, while hierarchical, can *also* 
be assumed to be embedded in an encompassing hierarchy. That is, ".." as 
"parent directory/component" is a valid transformation only up to the 
"root". Even on UNIX (the inspiration) the parent of "/" is "/".

So, GET /../../../../etc/passwd 
== GET /etc/passwd
== GET /usr/local/etc/httpd/docs/etc/passwd
--> HTTP/1.0 404 Not Found

is a compliant outcome.
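
To make the chain above concrete, here is a minimal sketch of that kind of
normalization, assuming the server splits the path on "/" and treats the
parent of the root as the root. The function names are my own invention, and
it deliberately ignores %-escapes and trailing slashes, which a real server
would have to deal with before doing anything like this.

```python
def normalize_request_path(path: str) -> str:
    """Collapse "." and ".." segments; ".." at the root stays at the root."""
    if not path.startswith("/"):
        raise ValueError("request path must be absolute")
    stack = []
    for segment in path.split("/"):
        if segment in ("", "."):
            continue  # empty and "." segments do not change position
        if segment == "..":
            if stack:
                stack.pop()  # go up one level, but never above "/"
            continue
        stack.append(segment)
    return "/" + "/".join(stack)


def map_to_filesystem(doc_root: str, path: str) -> str:
    """Append the normalized path to the document root."""
    return doc_root.rstrip("/") + normalize_request_path(path)


print(map_to_filesystem("/usr/local/etc/httpd/docs/",
                        "/../../../../etc/passwd"))
```

Since the ".." segments are used up before the document root is prepended,
the result can't climb out of the docs tree, and the lookup just fails with
the ordinary 404.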

(2) Since the url is illegal to start with, a server could also return a
status code to indicate "Protocol Error" or some other indication of
permanent failure. Some 4xx codes appear to have such an interpretation,
but to keep in line with the FTP/SMTP/NNTP style of code
classifications, this should be a 5xx response. My favorite (taken from
nnrpd) would be

HTTP/1.0 500 What?

On the issue of security, the typical approach is *not* to clue an 
attacker in to the fact that a security breach was involved in the failure.
There's no need to give out such information, and requiring this kind of 
a reason in the spec would be a mistake, IMHO. "Syntax Error, You Dope" 
is just fine:-)
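
A sketch of option (2), for illustration only: the server checks the path for
a ".." segment and answers with the same uninformative status line whatever
the client was up to. The function names are hypothetical, and the reason
phrase is just my favorite from above, not anything the spec defines.

```python
from typing import Optional


def has_dot_dot_segment(path: str) -> bool:
    """True if any "/"-separated segment of the path is exactly ".."."""
    return ".." in path.split("/")


def reject_status_line(path: str) -> Optional[str]:
    """Return a generic failure status line, or None if the path looks sane."""
    if not path.startswith("/") or has_dot_dot_segment(path):
        # Same answer for every malformed path; no hint about security.
        return "HTTP/1.0 500 What?"
    return None


print(reject_status_line("/../../../../etc/passwd"))
print(reject_status_line("/a/gifs/btnhome3.gif"))
```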


Regards,

Arjun 
Received on Wednesday, 20 December 1995 13:50:34 EST
