Re: partial URLs ?

BearHeart/Bill Weinman (BearHeart@bearnet.com)
Wed, 20 Dec 1995 20:36:46 -0600


Date: Wed, 20 Dec 1995 20:36:46 -0600
Message-Id: <199512210236.UAA08041@primus.paranoia.com>
To: www-html@w3.org, http-wg@cuckoo.hpl.hp.com
From: BearHeart/Bill Weinman <BearHeart@bearnet.com>
Subject: Re: partial URLs ?


   Hi, Kids, 

   I wouldn't bother with this, except that since I sent the 
message that got this thread rolling I've seen over 150 messages 
in about 24 hrs. (I don't know the exact number, but it was over 
100 when I cleaned up this afternoon.) Most of them are duplicates 
or triplicates of the same message. (And not all are on this 
thread.)

   Part of the reason for that is that the thread is now being 
echoed to two lists that I subscribe to--but about a third of the 
traffic comes from the practice of replying to several individuals 
(invariably one of them is me) as well as to the lists. 

   Could we try to bring the distribution of this thread back down to 
just the lists? It would help my frazzled nerves a bit. 

   <now back to our regularly-scheduled argument>

John Franks wrote:
>As I recall the draft RFC for URL's specifies that certain characters
>(like space) are forbidden, certain (like '?') have special meaning
>and otherwise the "path" part of a URL is an opaque string (which, in
>particular, may have nothing to do with a path).  Neither '/' nor '.'
>are forbidden or have special meaning.  They do have special meaning
>*for some implementations* and no special meaning for others.
>Likewise the colon may have special meaning for some implementations
>and not for others.

   I think you're right that there is nothing about the "../" 
string that's in violation of URL-law. But then, I don't think 
a URL is a very exact science anyway <g>. 

>It would, of course, be quite reasonable for the HTTP spec to have
>a UNIX-centric warning to implementors that they should make this
>string illegal for their implementation (or risk the consequences).

   Yes, "/../" is a unixism, but the path part of a URL is inherently 
platform-specific. I see URLs with "\" in them for DOS-type hosts, and 
"\..\" is just as much of a problem--maybe more, because most DOSish 
OSes lack permission bits. The code I've seen that 403s these things 
checks for the "..", which seems to be a pretty universal string for 
"go up a level in the file system". Or do you know of an OS with more 
than 3 servers on the net that doesn't work that way? 

   (Side note: MS has implemented "..." and "...." in Win95 for 
referencing up two and three levels, respectively. I don't know about 
NT, but if it's not in there now it soon will be. A check for ".." 
would obviously catch these as well.)
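
   For the curious, here is a rough sketch in Python of the kind of 
check described above. It is not taken from any particular server; 
the function names (contains_dot_dot, status_for_path) are made up 
for illustration, and the 200 is only a stand-in for "carry on with 
normal handling". Decoding %-escapes before the check is my own 
assumption about what a careful implementation would want, since 
"%2e%2e" decodes to "..".

```python
from urllib.parse import unquote


def contains_dot_dot(path: str) -> bool:
    """Return True if the path contains ".." anywhere.

    This is a plain substring test, so it does not care whether the
    separator is "/" or "\\", and it also matches "..." and "....".
    """
    # Assumption: decode %-escapes first so "%2e%2e" is treated as "..".
    return ".." in unquote(path)


def status_for_path(path: str) -> int:
    """Hypothetical helper: 403 for suspicious paths, otherwise 200."""
    return 403 if contains_dot_dot(path) else 200


if __name__ == "__main__":
    for p in ["/docs/index.html", "/docs/../etc/passwd",
              "\\..\\autoexec.bat", "/a/.../b"]:
        print(p, status_for_path(p))
```

   Note that a substring test this blunt will also refuse a harmless 
name like "notes..txt". That is the price of keeping the check 
simple; a stricter version would split the path on the separators and 
look only at whole segments.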

   My bottom line here is that ".." in a path ought to be illegal 
in HTTP, perhaps with a note to that effect in the HTML spec. 


+----------------------------------------------------------------------+
 * BearHeart / Bill Weinman 
 * BearHeart@bearnet.com *            * http://www.bearnet.com/ *
 * Author of The CGI Book:    * http://www.bearnet.com/cgibook/ *
 * Trust everyone, but brand your cattle.