Re: Cost analysis: (was: Getting to Consensus: CONTINUATION-related issues)

On Jul 19, 2014, at 8:39 PM, David Krauss <potswa@gmail.com> wrote:

> 
> On 2014-07-20, at 1:28 AM, Jason Greene <jason.greene@redhat.com> wrote:
> 
>> On Jul 19, 2014, at 2:31 AM, David Krauss <potswa@gmail.com> wrote:
>> 
>>> Enforced decoding is not a burden on the receiver as long as it implements streaming. An ideal receiver (the QOI essentially required by the current spec) can keep on receiving and forwarding/discarding beyond its limit without committing extra memory or undue CPU.
>> 
>> I argue the opposite is true.
>> If you look at a comparison of say a client that sends 1MB of compressed headers, with one intermediary, but with a 16KB frame limit:
>> 
>> The streaming discard approach has the highest overall cost in computation time for all parties. It also introduces latency since all other streams must wait until the stream has completed. Finally it consumes unnecessary network bandwidth. 
> 
> You’re optimizing a failure mode by eliminating functionality (and making it a more common case). The assumption is that the client knows more about the acceptability of its request than the network protocol or the proxy. If the data gets through to the origin, it will probably do something useful.
> 
> It’s not a good assumption that lots of headers are just obnoxious and therefore an attack. Header size limit declaration is not realistically going to mitigate DoS.

That’s certainly not my point, so maybe I just haven’t explained it all that well. The whole benefit of A is actually simplifying the common case, which is headers < 16KB. That is why I like that option. The limit is cooperative, so obviously it can’t prevent a DoS attack. It does, however, help identify a bad actor, which can be useful in DoS detection code.
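
To make the detection point concrete, here’s a rough Python sketch of what I have in mind. The limit value, the tracker, and the scoring hook are all made up for illustration; none of this is from the draft:

    # Cooperative limit: it cannot stop an attacker, but a peer that blows
    # past a limit we advertised is a useful "bad actor" signal.
    ADVERTISED_HEADER_LIMIT = 16 * 1024  # whatever we announced to the peer

    class HeaderLimitTracker:
        def __init__(self, limit=ADVERTISED_HEADER_LIMIT):
            self.limit = limit
            self.bytes_per_stream = {}

        def within_limit(self, stream_id, fragment_len):
            # Count HEADERS/CONTINUATION payload bytes for this stream.
            total = self.bytes_per_stream.get(stream_id, 0) + fragment_len
            self.bytes_per_stream[stream_id] = total
            return total <= self.limit

    def on_header_fragment(peer_score, tracker, stream_id, fragment_len):
        # Hypothetical hook into DoS-detection code: exceeding a limit the
        # peer itself saw us advertise counts as a violation.
        if not tracker.within_limit(stream_id, fragment_len):
            peer_score["violations"] = peer_score.get("violations", 0) + 1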

The main reason I have been behind so many proposals involving a length is that my primary concern with h2-13 is that continuations encourage HOL blocking in proxies even from compliant actors, and a length gives us a way to prevent that. There have been other proposals that would also solve the problem (namely allowing interleaving of continuations), but they were rejected.
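
Roughly, the difference for a proxy looks like this (toy Python with frames as plain dicts; the "declared_length" field stands in for whatever a length-based proposal would carry, and none of this is real proxy code):

    def relay_h2_13(frames):
        # h2-13 style: a HEADERS frame without END_HEADERS must be followed
        # by its CONTINUATIONs with nothing interleaved, so the proxy has to
        # consume the entire block (even one it intends to discard) before
        # anything else can flow on the connection: the HOL block.
        relayed, in_block, block_bytes = [], False, 0
        for f in frames:
            if in_block:
                block_bytes += len(f["payload"])  # no way to act early
                if f.get("end_headers"):
                    in_block = False
                continue
            if f["type"] == "HEADERS" and not f.get("end_headers"):
                in_block, block_bytes = True, len(f["payload"])
                continue
            relayed.append(f)  # everything here waited behind the block
        return relayed

    def relay_with_length(frames, limit):
        # With an up-front length the proxy can refuse the stream at once
        # and keep multiplexing the rest.
        relayed, rejected = [], set()
        for f in frames:
            if f["type"] == "HEADERS" and f.get("declared_length", 0) > limit:
                rejected.add(f["stream"])
                relayed.append({"type": "RST_STREAM", "stream": f["stream"]})
            elif not (f["type"] == "CONTINUATION" and f["stream"] in rejected):
                relayed.append(f)
        return relayed
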
> 
>>> If we can agree that GOAWAY on excessive headers is good enough for simple implementations,
>> 
>> Dropping the connection is somewhat tolerable for a client to origin topology. However it negatively impacts user experience. It’s problematic when you have intermediaries since a dropped connection potentially affects more traffic than that initiated by the user.
> 
> An unfulfilled request will already negatively impact UX, no?

Well, I mean that it impacts other requests. So if you have some bad bit of JS code that triggers a 413, now you get delays loading pages as well.

> 
> Yes, it would suck for intermediaries, but essentially I’m requiring that client-side proxies not be such simple implementations. For reverse proxies, there’s no real difference between a server GOAWAY and a proxy GOAWAY, so it’s feasible again.

Yeah, I agree the first hop doing a GOAWAY is fine, in that it only impacts the one client.

> 
>>> and streaming is reasonable to implement for any application that really doesn’t want to send GOAWAY, then the hard limit should remain at the receiver, with voluntary self-limiting by senders.
>> 
>> Voluntary self-limiting does indeed help the problem because an intermediary can prevent relaying and the subsequent GOAWAY.
> 
> The user-agent should self-limit by default, according to whether the application might call for large requests. Intermediaries should not attempt to constrain the client-server contract. Again, you’re trying to optimize a failure mode.

I see this argument made a lot in the various discussions: that somehow these proposals are favoring the 0.2%. They are actually optimizing for the 99.8% that can be negatively impacted by the 0.2%. This is a common goal in multiplexing protocol design: establishing some basic level of fairness.

In any case, intermediaries already do constrain what can be sent over them, the exception of course being proxies that are little more than TCP tunnels.
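
Concretely, that existing constraint looks something like this in an HTTP-aware intermediary (again a toy Python sketch; the limit value and helper names are illustrative, not from any real proxy):

    MAX_HEADER_LIST_BYTES = 32 * 1024  # typical proxy defaults are in this range

    def header_list_size(headers):
        # Size of the decoded header list: name plus value for each field.
        return sum(len(name) + len(value) for name, value in headers)

    def forward_or_reject(headers, forward, reject):
        # forward/reject are caller-supplied callbacks in this sketch.
        if header_list_size(headers) > MAX_HEADER_LIST_BYTES:
            return reject(431, "Request Header Fields Too Large")
        return forward(headers)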

> 
>>> A client application may know better that its particular server supports a higher limit. The best outcome requires sending the headers anyway and just seeing whether someone complains.
>> 
>> I don’t follow your argument here. A receiver is always going to be the one to know what its limits are unless it reports incorrect values, which would be a bug.
> 
> Most clients know their origin servers pretty well. The client and server applications are both part of the same whole. Limits are reported by intermediaries in the general case, and cannot be accurate.
> 
>> Well, there is the gigantic kerberos ticket use case, and those are certainly proxyable today. It’s hard to see how large headers are only appropriate across a single hop vs multiple hops.
> 
> Except all the proxies with header size limits. I didn’t say it’s impossible, anyway, I just said no evidence has been presented.

Fair enough. It’s an interesting question how prevalent it actually is. I vaguely recall Mike Bishop mentioning that IIS is used with proxying and Kerberos tickets.

> 
>>> Limiting compressed data is nonsense. Users and applications can’t really reason about compressed sizes.
>> 
>> Sure they can:
>> https://github.com/http2/http2-spec/wiki/ContinuationProposals#dealing-with-compressed-limits
> 
> I don’t think our consensus is going to include rollback, sorry.

I don’t think it will either. Can you blame me for trying, though? Just think of how nice and clean the spec would be with simple header frames… /me wakes up from the dream :)

> 
> An application is a webpage, not a browser or an intermediary. A user in this context is a webmaster, not Mozilla.
> 
>> You can’t really determine which app to send the request to until the headers are processed, and partial processing isn’t reliable since we don’t have ordering rules on common selectable data. So the limit makes the most sense at a higher level than the application.
> 
> The application is usually at the “top of the stack” and the most abstract so it’s the highest level.
> 
> There’s already some mumblings about requiring routing information at the start of a request. This is pretty well required to make proxying/routing work at all, so I wouldn’t worry about particular special cases like this. Large requests with routing at the end are more likely to result in 431 than streamable large requests. Even if it doesn’t make it into the standard, it will be a de-facto requirement.

Yeah, that’s a good point.
> 
>> This is quite different than an upload which involves passing the request to the application before the upload data is fully consumed, and the application is in control of that processing.
>> 
>> Anyway, just to be clear, I am fine with both approaches. I am not arguing against the B proposal. I just wanted to address some of the concerns with the client impact of A.
>> 
>> --
>> Jason T. Greene
>> WildFly Lead / JBoss EAP Platform Architect
>> JBoss, a division of Red Hat
> 

--
Jason T. Greene
WildFly Lead / JBoss EAP Platform Architect
JBoss, a division of Red Hat
