Re: URI path starting with "//" from Zhong Yu on 2013-02-01 (ietf-http-wg@w3.org from January to March 2013)

From: Zhong Yu <zhong.j.yu@gmail.com>
Date: Fri, 1 Feb 2013 17:47:50 -0600
To: Phillip Hallam-Baker <hallam@gmail.com>
Cc: Julian Reschke <julian.reschke@gmx.de>, "Roy T. Fielding" <fielding@gbiv.com>, Bjoern Hoehrmann <derhoermi@gmx.net>, HTTP Working Group <ietf-http-wg@w3.org>
Message-ID: <CACuKZqE3ff48HvwcEm+WEzGadKVp_1Cwp0CL7MqjETy5GKVgRA@mail.gmail.com>

On Fri, Feb 1, 2013 at 4:46 PM, Phillip Hallam-Baker <hallam@gmail.com> wrote:
>
>
> On Fri, Feb 1, 2013 at 5:37 PM, Julian Reschke <julian.reschke@gmx.de>
> wrote:
>>
>> On 2013-02-01 22:51, Phillip Hallam-Baker wrote:
>>>
>>>
>>>
>>> On Fri, Feb 1, 2013 at 4:08 PM, Roy T. Fielding <fielding@gbiv.com> wrote:
>>>
>>>     On Feb 1, 2013, at 12:34 PM, Julian Reschke wrote:
>>>
>>>      > On 2013-02-01 20:07, Bjoern Hoehrmann wrote:
>>>      >> * Julian Reschke wrote:
>>>      >>> On 2013-02-01 19:37, Zhong Yu wrote:
>>>      >>>> If user clicks a URL http://example.com//abc, the browser should send
>>>      >>>>
>>>      >>>>      GET //abc HTTP/1.1
>>>      >>>>      Host: example.com
>>>
>>>      >>>>
>>>      >>>> However the latest bis draft seems to forbid "origin-form" to start with "//"
>>>      >>>> ...
>>>      >>>
>>>      >>> Is this a valid URI?
>>>      >>
>>>      >> http://www.websitedev.de/temp/rfc3986-check.html.gz says yes. Per 3986:
>>>      >>
>>>      >>    URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
>>>      >>    hier-part     = "//" authority path-abempty
>>>      >>    ...
>>>      >>    path-abempty  = *( "/" segment )
>>>      >>    ...
>>>      >>    segment       = *pchar
>>>      >
>>>      > Indeed. This appears to be an edge-case, but still...
>>>
>>>     Back in the really really early days of the Web, // would
>>>     indicate a gateway (essentially, an open proxy).  TimBL said that
>>>     the original idea was for many more layers than that, e.g.
>>>
>>>         ////first///second//third/path
>>>
>>>     as a form of routing.  Needless to say, that did not catch on.
>>>
>>>      > Roy, do you recall whether there's a reason why we would want to
>>>     rule out a path starting with "//"?
>>>
>>>     No, it is an accident of the transition to new URI ABNF and
>>>     should be raised as an issue.  There are several different ways to
>>>     fix it, depending on how lenient we want to be with parsing.
>>>
>>>     ....Roy
>>>
>>>
>>> There was another reason, it made it possible to use ftp and http URLs
>>> interchangeably.
>>>
>>> I remember Robert Cailliau raising the issue at the leaving party for
>>> TimBL from CERN and he admitted he didn't really have much of a good
>>> answer.
>>>
>>>
>>> I seem to recall that //domain/path was already in use at the time in
>>> NFS and AFS like things and the web was following the same scheme.
>>>
>>> If you were using the Web browser prior to there being HTTP you would
>>> have been using URIs of the form //afs.cern.ch/path
>>>
>>>
>>> So it would be natural to make them into:
>>>
>>> file://afs.cern.ch/path
>>>
>>>
>>>
>>> The main reason for keeping it at the time was that it provided a clear
>>> visual distinction between a URL and a URN and the difference between
>>> them was that a URL was an identifier that was relative to a DNS host
>>> name and a URN was anything that was not a URL.
>>>
>>> I think that was actually a good idea. The three schemes that had DNS
>>> domain names in them, file, http and ftp, all had the same
>>> method://domain:port/stem pattern.
>>> ...
>>
>>
>> Yes.
>>
>> But that doesn't seem to have anything to do with what we discuss here: a
>> *path* starting with "//".
>>
>> Best regards, Julian
>
>
> Ugh, hate that idea.
>
> The reason for having the double slash on the host name was so that it was
> possible to distinguish URI fragments that were relative to a scheme "//",
> a host "/" or a directory (anything else).
>
> Could be wrong but I would be very surprised if existing code didn't barf up
> a path starting //. I certainly think it should do.

I've heard of web apps that use "//" paths for their "special" URLs,
like http://example.com//admin, and my experiments with such URLs
didn't show any problems. (Though one cannot write <a href="//path">
in HTML: that is a network-path reference, so "path" is taken as the
host rather than as a path.)
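
For illustration, here is a minimal sketch of both points using Python's
standard urllib.parse. The choice of Python and the helper name
origin_form are my own assumptions, not anything from the drafts, and
this only shows how one URI library behaves, not what servers or
browsers are required to do.

```python
from urllib.parse import urlsplit, urljoin


def origin_form(url: str) -> str:
    """Hypothetical helper: the request-target a client might send
    for an absolute http URL (path plus optional query)."""
    parts = urlsplit(url)
    target = parts.path or "/"   # an empty path is sent as "/"
    if parts.query:
        target += "?" + parts.query
    return target


# The absolute URL keeps its leading "//" in the path component.
absolute = urlsplit("http://example.com//abc")
host = absolute.netloc                 # "example.com"
target = origin_form("http://example.com//abc")  # "//abc"

# The same string read as a relative reference is a network-path
# reference: "abc" becomes the authority and the path is empty.
relative = urlsplit("//abc")

# This is why href="//admin" in HTML does not point at the "//admin"
# path on the current host.
resolved = urljoin("http://example.com/index.html", "//admin")

print("GET %s HTTP/1.1" % target)
print("Host: %s" % host)
print(repr(relative.netloc), repr(relative.path))
print(resolved)
```

So the "//abc" path survives parsing of the absolute URL, but a receiver
that re-parses the request-target "//abc" as a URI reference would see
an authority instead of a path, which may be where the ambiguity bites.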

Zhong Yu

Received on Friday, 1 February 2013 23:48:18 UTC