
Re: HTTP URI in the form of "http://example.com?query"

From: Zhong Yu <zhong.j.yu@gmail.com>
Date: Wed, 5 Jun 2013 11:06:24 -0500
Message-ID: <CACuKZqGeB-907EH5ns2g85_3=bZOmv19f3EJV1B36bM0=Jdd9w@mail.gmail.com>
To: Willy Tarreau <w@1wt.eu>
Cc: Julian Reschke <julian.reschke@gmx.de>, Roberto Peon <grmocg@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>
On Wed, Jun 5, 2013 at 12:36 AM, Willy Tarreau <w@1wt.eu> wrote:
> Hi Julian,
>
> On Tue, Jun 04, 2013 at 10:04:23AM +0200, Julian Reschke wrote:
>> On 2013-06-04 09:50, Roberto Peon wrote:
>> >A search for regular expression (or synonym) and url will bring up
>> >numerous examples which would be broken by this change.
>> >It is certainly not every one, but numerous, nonetheless.
>> >
>> >Here is one example:
>> >http://net.tutsplus.com/tutorials/other/8-regular-expressions-you-should-know/
>> >-=R
>> >...
>>
>> Yes, but what's the exact breakage except for one component not
>> processing that edge case? It's an edge case after all?
>
> Interesting case. At the very least it breaks haproxy's path extraction,
> which relies on RFC 2616. When you need to check the path from a
> request, haproxy does this:
>
>    1) skip the scheme and "://"
>    2) skip user:pass@host:port
>    3) look for the first "/"
>    4) return everything from the first "/" to the first "?" or end of
>       the string.
>
> So "http://example.com?query=foo/bar" will return "/bar" as the path of
> the request instead of an empty string or "/". BTW, is "/" supposed to
> be the abspath here, or just something empty? I'm asking because haproxy
> returns a pointer to the beginning of the string and a length. If the
> answer is "/", we don't have it in this request, so probably the best
> thing to do would be to "fix" the request by inserting the "/" before "?".
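
For illustration, the four steps quoted above can be sketched roughly as
follows. This is a Python approximation, not haproxy's actual C code, and
the function name naive_path is hypothetical. Steps 2 and 3 are merged,
since skipping the authority amounts to scanning for the first "/".

```python
def naive_path(uri: str) -> str:
    """Approximate the RFC 2616-style extraction described in the thread."""
    # 1) skip the scheme and "://"
    rest = uri.split("://", 1)[1] if "://" in uri else uri
    # 2) + 3) skip user:pass@host:port by looking for the first "/"
    slash = rest.find("/")
    if slash == -1:
        return ""  # no "/" anywhere after the scheme
    # 4) return everything from the first "/" to the first "?" or the end
    tail = rest[slash:]
    question = tail.find("?")
    return tail if question == -1 else tail[:question]


print(naive_path("http://example.com/a/b?x=1"))        # /a/b
print(naive_path("http://example.com?query=foo/bar"))  # /bar
```

The second call shows the problem: the "/" inside the query string is
taken as the start of the path, because nothing stops the scan at "?".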

It's probably a good idea to insert the "/".

http://tools.ietf.org/html/rfc3986#section-6.2.3

   In general, a URI that uses the generic syntax for authority with an
   empty path should be normalized to a path of "/".

http://tools.ietf.org/html/draft-ietf-httpbis-p1-messaging-22#section-2.7.3

   When not being used in
   absolute form as the request target of an OPTIONS request, an empty
   path component is equivalent to an absolute path of "/", so the
   normal form is to provide a path of "/" instead.
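
A minimal sketch of that normalization, assuming Python's standard
urllib.parse as the generic parser. The helper name normalize_empty_path
is made up for illustration, and the OPTIONS exception quoted above is
deliberately not handled here.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_empty_path(uri: str) -> str:
    """Insert "/" when a URI has an authority but an empty path."""
    parts = urlsplit(uri)
    # urlsplit ends the authority at the first "/", "?" or "#",
    # so "http://example.com?query" yields an empty path.
    if parts.netloc and parts.path == "":
        parts = parts._replace(path="/")
    return urlunsplit(parts)


print(normalize_empty_path("http://example.com?query=foo/bar"))
# http://example.com/?query=foo/bar
```

A URI that already has a path, such as "http://example.com/a?b", passes
through unchanged.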

I still think this problem deserves a warning note in the spec in the
interest of interoperability. It is apparently something that
implementers may overlook.

Zhong Yu


>
>> Methinks it's better to be consistent with (a) generic URI parsing and
>> (b) what important components already do (UAs, HTTP servers, etc.).
>
> I agree. Using a generic parser is important in that it avoids future
> incompatibilities, as in the example above, which was based on RFC 2616.
>
> Best regards,
> Willy
>
Received on Wednesday, 5 June 2013 16:06:54 UTC
