- From: James M Snell <jasnell@gmail.com>
- Date: Wed, 21 Aug 2013 08:18:37 -0700
- To: Zhong Yu <zhong.j.yu@gmail.com>
- Cc: "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
The connection with what I described in that blog post is that many implementers have, apparently, taken a number of liberal shortcuts in how they parse and accept HTTP methods. Node's parser, for instance, will immediately error out if it receives a method that doesn't start with an alpha character (which is reasonable). It also does not support the use of any non-alpha character other than "-" (dash). Their implementation also chooses not to support extension methods because the implementers feel there is no way of supporting an arbitrary unknown set of tokens in a performant way... and truthfully, because there is no upper bound on the length of the methods and because the value space is very large, they do have a point.

Now, I did say right up front that this is a fairly minor issue, and if it doesn't happen, so be it. But the restrictions I suggest ought to make it at least somewhat easier for implementations to generically handle extension methods in a performant manner, while at the same time more accurately reflecting the reality of what many implementers are already doing.

On Wed, Aug 21, 2013 at 8:07 AM, Zhong Yu <zhong.j.yu@gmail.com> wrote:
> On Tue, Aug 20, 2013 at 6:22 PM, James M Snell <jasnell@gmail.com> wrote:
>> HTTPbis currently defines the request method as a "token" of unbounded-length.
>>
>> Specifically:
>>
>>   tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
>>           "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
>>   token = 1*tchar
>>   method = token
>>
>> This definition is overly broad and does not reflect real world use
>> [http://tools.ietf.org/html/draft-ietf-httpbis-method-registrations-12].
>>
>> I propose that in HTTP/2 we tighten this definition up significantly
>> and place an upper bound on the length a request method ought to be:
>>
>>   UPPER = %x41-5A
>>   method = UPPER *20( UPPER / "_" / "-" )
>>
>> This is obviously a strictly limited subset of what's allowed by the
>> current definition. It limits the length of method names to no more
>> than 20 characters, requires that methods be all uppercase, requires
>> that methods always start with a letter and limits non-letter
>> characters to the dash and underscore. The rule would be that all
>> *newly registered* HTTP methods MUST conform to the new rule but
>> implementations MAY choose to support the old definition if necessary
>> for backwards compatibility.
>>
>> It's a fairly minor issue, yes, but tightening this up ought to make
>> it easier for developers to create parsers that are both efficient
>> *and* compliant [http://www.chmod777self.com/2013/08/sigh.html]
>
> I don't see how the bug mentioned in the blog has anything to do with
> what you are proposing. It looks like node.js is accepting any "GE<*>"
> as "GET" where <*> can be any octet. Maybe node.js was assuming that
> the request has been validated by an upstream parser?
>
> Zhong Yu
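
To make the difference between the two grammars quoted above concrete, here is a minimal Python sketch of both checks. It is only an illustration, not code from Node or from the HTTPbis drafts, and the names `is_token_method` and `is_strict_method` are made up for this example. One detail worth noting: as written, the ABNF `UPPER *20( ... )` allows one leading letter plus up to 20 further characters, so 21 in total, while the prose says "no more than 20"; the sketch follows the ABNF literally.

```python
import re

# Current HTTPbis rule: method = token = 1*tchar (no upper bound on length).
TCHAR = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]"
TOKEN_RE = re.compile(rf"{TCHAR}+")

# Proposed rule: method = UPPER *20( UPPER / "_" / "-" )
# One uppercase letter, then at most 20 more of: uppercase letter, "_" or "-".
STRICT_METHOD_RE = re.compile(r"[A-Z][A-Z_-]{0,20}")


def is_token_method(method: str) -> bool:
    """True if `method` matches the existing, unbounded token grammar."""
    return TOKEN_RE.fullmatch(method) is not None


def is_strict_method(method: str) -> bool:
    """True if `method` matches the proposed, bounded grammar."""
    return STRICT_METHOD_RE.fullmatch(method) is not None


if __name__ == "__main__":
    for candidate in ["GET", "M-SEARCH", "get", "-GET", "!weird~", "A" * 22]:
        print(repr(candidate), is_token_method(candidate), is_strict_method(candidate))
```

Because the strict form has a fixed maximum length and a small alphabet, a parser can read the method into a fixed-size buffer and reject anything longer without scanning further, which is the efficiency argument made in the message.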
Received on Wednesday, 21 August 2013 15:19:32 UTC