- From: 陈智昌 <willchan@chromium.org>
- Date: Tue, 28 May 2013 16:25:39 -0700
- To: Roberto Peon <grmocg@gmail.com>
- Cc: James M Snell <jasnell@gmail.com>, HTTP Working Group <ietf-http-wg@w3.org>, Patrick McManus <mcmanus@ducksong.com>
- Message-ID: <CAA4WUYhOnocH7nxX=ZmzH8jyygF_JAaYzTezCWFXP1XdTUEgKg@mail.gmail.com>
Just to be clear, I don't feel too strongly here. I do want to address a point as I feel my previous point was lost.

On Tue, May 28, 2013 at 1:12 PM, Roberto Peon <grmocg@gmail.com> wrote:

> responses inline
>
> On Tue, May 28, 2013 at 12:16 PM, William Chan (陈智昌) <willchan@chromium.org> wrote:
>
>> On Tue, May 28, 2013 at 11:50 AM, James M Snell <jasnell@gmail.com> wrote:
>>
>>> On Tue, May 28, 2013 at 11:41 AM, Roberto Peon <grmocg@gmail.com> wrote:
>>> > As a reverse proxy, I've seen properties for which 4k writes/reads
>>> > were too small and induced latency increases.
>>> >
>>>
>>> I haven't played with this part too much yet but this is my general
>>> suspicion also.
>>>
>>
>> Can you guys clarify this in more detail? Specifically, where the latency
>> comes from. I have ideas, but I'd rather have an authoritative explanation.
>>
>
> It always comes down to the cost of the context switches (i.e. syscalls)
> and the locking that must be done in the lower layers of the IO stack.
>

Thanks for the clarification, I suspected it was the write()/read() cost, which I assume is what you mean by syscall.

>>> > Admittedly, frame size doesn't have to be the same as read/write size,
>>> > but it certainly does encourage that implementation (which is, I think,
>>> > the point of smaller max frame size that you proposed).
>>>
>>
>> You're right that it does encourage that implementation. Just like a
>> larger length encourages just naively breaking up frames into that max
>> frame size and thus hurt responsiveness. Which one is likelier to cause
>> worse overall "performance" (I know this is vague, since people care about
>> different aspects of perf)? What we want to do is have the most reasonable
>> default behavior, with the ability for performant implementations to tune
>> without unreasonable difficulty. I believe we're mostly focusing here on
>> optimizing the naive implementations, not the highly tuned implementations.
>>
>
> Remember that I'm the one who proposed the smaller max frame size in the
> first place (now a fair while ago)? :)
>

I don't believe I've said anything that would imply I forgot that :)

> My sweet-spot number was 16k, as I knew that I could saturate a 10G nic
> with 16k frames/writes and have enough CPU left over to do some actual
> work. The amount of overhead goes up more than linearly with the decrease
> in frame size thanks to contention, etc.
>

I think you miss my point. Please correct me if I'm wrong, but I think you're saying that for your server, 16k was the right choice for write()s. write() sizes don't need to be tied to actual frame size, but of course that's what a naive implementation would do. And again, I think we should pick a max frame size that results in reasonable behavior for naive implementations/deployments. And I think the highly performant implementations will want to write their code in a way that decouples frame size from write() size, and will pick the optimal write() size given the tradeoffs.

>>> >
>>> > I propose we keep the 16 bit frame size and instead allow the (now
>>> > negotiated setting of) max frame size to default to 12 bits worth,
>>> > with that going upwards or downwards when a settings frame arrives
>>> > from the other side indicating its max receive size. HK
>>> >
>>>
>>> Honestly, I'd prefer to do away with frame size negotiation altogether
>>> because of the potential for path mtu style issues. Keeping the 16-bit
>>> size for now with strong encouragement (SHOULD, perhaps?) for keeping
>>> sizes around 12-bit lengths for the most common cases seems like the
>>> right approach.
>>>
>>> -- James
>>>
>>
>
> Unlike TCP/IP, max frame size is a point-to-point thing, as the primitive
> we mux is streams, not frames. Frames are the way we accomplish the muxing.
> Why would there be any path MTU like thing?
>
> -=R
>
>>
>>> > This would give the best chance that the code would be written in such
>>> > a way as to adapt with the times as they change.
>>> > -=R
>>> >
>>> > On May 28, 2013 10:01 AM, "William Chan (陈智昌)" <willchan@chromium.org>
>>> > wrote:
>>> >>
>>> >> Can you clarify what you mean by a documented performance metric for
>>> >> non-browser use cases? I don't think Patrick said anything browser
>>> >> specific. He provided some serialization latency numbers and noted
>>> >> that they are high enough to impact responsiveness. And then he
>>> >> provided numbers on overhead.
>>> >>
>>> >> I, for one, find the responsiveness argument compelling for browsers.
>>> >> I'm not completely sure 0.2% is low enough overhead for everyone, but
>>> >> I wouldn't complain about it. And in absence of complaints, I guess
>>> >> I'd support moving forward with only 12 bits for length.
>>> >>
>>> >>
>>> >> On Tue, May 28, 2013 at 9:22 AM, James M Snell <jasnell@gmail.com>
>>> >> wrote:
>>> >>>
>>> >>> Currently, my only challenge with this is that, so far, we have not
>>> >>> seen any documented performance metrics for non-browser based uses.
>>> >>> That said, I don't really have the time currently to put together a
>>> >>> comprehensive set of such metrics so it wouldn't be polite of me to
>>> >>> insist on them ;-) ... perhaps for now we ought to keep the 16-bit
>>> >>> size but include a recommendation about not exceeding 12-bits, then
>>> >>> see what more implementation experience does for us.
>>> >>>
>>> >>> On Tue, May 28, 2013 at 7:20 AM, Patrick McManus
>>> >>> <mcmanus@ducksong.com> wrote:
>>> >>> > Hi All,
>>> >>> >
>>> >>> > I've been looking at a lot of spdy frames lately, and I've noticed
>>> >>> > what I consider a common implementation problem that I think a good
>>> >>> > http/2 spec could help with. I'm commonly seeing frames large enough
>>> >>> > to interfere with effective prioritization.
>>> >>> > I've seen this from at least 3 different servers.
>>> >>> >
>>> >>> > The HTTP/2 draft has a max frame size of 16 bits, which is a huge
>>> >>> > improvement from spdy's 24. I propose we reduce it further to 12
>>> >>> > (i.e. 4096 bytes).
>>> >>> >
>>> >>> > The muxxed approach of multiple streams onto one connection done in
>>> >>> > HTTP/2 has great advantages, but the one downside of it is that it
>>> >>> > creates head of line blocking problems between those streams
>>> >>> > dictated by frame granularity. With small frames this is pretty
>>> >>> > manageable, with extremely large ones we've recreated the same head
>>> >>> > of line problems that HTTP/1 pipelines have. The server needs to be
>>> >>> > able to respond quickly to higher priority events (including
>>> >>> > cancellations) and once it has written a frame header to the wire
>>> >>> > it is committed to the entire frame for however long it takes to
>>> >>> > serialize it. IMO the shorter that time, the better.
>>> >>> >
>>> >>> > Our spec can help implementations do the right thing here by
>>> >>> > limiting the max frame size to 12 bits.
>>> >>> >
>>> >>> > It takes 500msec to serialize 64KB at 1Mbit/sec... 125msec at
>>> >>> > 4Mbit/sec. Those are some pretty notable task-switch times.
>>> >>> > Dropping the frame to 4096 cuts them to 32msec and 8msec. That's
>>> >>> > much more responsive, at the cost of 120 extra bytes of transfer
>>> >>> > (< 1msec at 1Mbit/sec).
>>> >>> >
>>> >>> > In general - the smaller the better as long as the overhead doesn't
>>> >>> > get to be too large. At 8 in 4096 (~.2%) I think that's acceptable.
>>> >>> > It's roughly the same overhead as a VLAN tag.
>>> >>> >
>>> >>> > Obviously this makes a continuation bit for control frames
>>> >>> > absolutely mandatory, but I think we're already in that spot with
>>> >>> > 16 bit frame lengths.
>>> >>> >
>>> >>> > -Patrick
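
For anyone who wants to check the serialization figures quoted from Patrick's message, here is a minimal sketch of the arithmetic. It assumes an 8-byte frame header (inferred from the "8 in 4096" overhead figure), 64KB = 65536 bytes, and 1 Mbit = 10^6 bits. The function names are made up for illustration and do not come from any spec or library.

```python
# Rough arithmetic behind the frame-size numbers quoted in this thread.
# Assumptions (not stated by any spec here): 8-byte frame header,
# 64KB = 65536 bytes, 1 Mbit/sec = 1_000_000 bits/sec.

FRAME_HEADER_BYTES = 8


def serialization_ms(frame_bytes, link_bits_per_sec):
    """Time in milliseconds to put one frame of frame_bytes on the wire."""
    return frame_bytes * 8 / link_bits_per_sec * 1000


def extra_header_bytes(total_bytes, small_frame, large_frame):
    """Extra header bytes paid when sending total_bytes in small frames
    instead of large frames (frame counts are rounded up)."""
    small_count = -(-total_bytes // small_frame)  # ceiling division
    large_count = -(-total_bytes // large_frame)
    return (small_count - large_count) * FRAME_HEADER_BYTES


def header_overhead(frame_bytes):
    """Header bytes as a fraction of frame size, as in '8 in 4096'."""
    return FRAME_HEADER_BYTES / frame_bytes


if __name__ == "__main__":
    for frame in (65536, 4096):
        for mbit in (1, 4):
            ms = serialization_ms(frame, mbit * 1_000_000)
            print(f"{frame:6d} bytes at {mbit} Mbit/sec: {ms:.1f} msec")
    print("extra header bytes per 64KB:", extra_header_bytes(65536, 4096, 65536))
    print(f"header overhead at 4096: {header_overhead(4096):.2%}")
```

Under these assumptions the 64KB frame takes about 524 msec at 1Mbit/sec and about 131 msec at 4Mbit/sec, which the message rounds to 500 and 125. The 4096-byte frame comes out near 32.8 and 8.2 msec, and the 15 additional headers account for the 120 extra bytes.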
Received on Tuesday, 28 May 2013 23:26:10 UTC