- From: Roberto Peon <grmocg@gmail.com>
- Date: Tue, 22 Jan 2013 13:54:18 -0800
- To: Willy Tarreau <w@1wt.eu>
- Cc: "William Chan (?????????)" <willchan@chromium.org>, James M Snell <jasnell@gmail.com>, Nico Williams <nico@cryptonector.com>, "ietf-http-wg@w3.org" <ietf-http-wg@w3.org>
- Message-ID: <CAP+FsNcm_VBOsbptkLoOQXfgM-xAfYiZuqZusDm2YkoiszUfxA@mail.gmail.com>
The thing that isn't in delta, etc. already is the idea of 'rooting' the
path space with the single request (which I like, but... it is subject to
the CRIME exploit if path-prefix grouping is done automatically by the
browser, instead of being defined by the content developer). IF we take
your proposal for eliminating much of the common path prefix and ensure
that it isn't subject to CRIME, that is a winner in any scheme.

-=R

On Tue, Jan 22, 2013 at 1:51 PM, Roberto Peon <grmocg@gmail.com> wrote:

> You've described server push + stateful compression (delta) pretty closely
> there ('cause that is what we get when we combine them, without requiring
> web page writers to change how they write their pages...)! :)
>
> With server push, you can do one request, many responses (except that you
> can also cache, cancel and prioritize them, unlike bundling or inlining
> or pipelining, which has nasty head-of-line blocking and infinite buffering
> requirements... bleh).
> With the delta compressor, you can define 'header groups', which allow you
> to do exactly what you just described. The default implementation as it
> exists today just guesses at the groupings by examining the hostname, but
> that is a very naive approach -- splitting based on cookies and other
> repeated fields makes the most sense.
>
> The biggest hurdle, at least in my opinion, to usage of the new features
> is how much effort the content writers have to put in to change their
> content (basically never happens), or to change their knowledge of best
> practices (also difficult :( ). The best solution (again in my opinion) is
> one where the optimizations can be done automatically (while not
> necessarily perfectly, close enough :) ), thus freeing ourselves from
> both categories...
>
> -=R
>
>
> On Tue, Jan 22, 2013 at 1:27 PM, Willy Tarreau <w@1wt.eu> wrote:
>
>> Hi William,
>>
>> On Tue, Jan 22, 2013 at 12:33:37PM -0800, William Chan (陈智昌) wrote:
>> > From the SPDY whitepaper
>> > (http://www.chromium.org/spdy/spdy-whitepaper), we note that:
>> > "Header compression resulted in an ~88% reduction in the size of
>> > request headers and an ~85% reduction in the size of response headers.
>> > On the lower-bandwidth DSL link, in which the upload link is only 375
>> > Kbps, request header compression in particular, led to significant
>> > page load time improvements for certain sites (i.e. those that issued
>> > large number of resource requests). We found a reduction of 45 - 1142
>> > ms in page load time simply due to header compression."
>> >
>> > That result was using gzip compression, but I don't really think
>> > there's a huge difference in PLT between stateful compression
>> > algorithms. That you use stateful compression at all is the biggest
>> > win, since, as Mark already noted, big chunks of the headers are
>> > repeated opaque blobs. And I think the wins will only be greater on
>> > bandwidth-constrained devices like mobile. I think this brings us back
>> > to the question: at what point do the wins of stateful compression
>> > outweigh the costs? Are implementers satisfied with the rough order of
>> > costs of stateful compression algorithms like the delta encoding or
>> > simple compression?
>>
>> I agree that most of the header overhead is from repeated headers.
>> In fact, most of the requests we see for large pages with 100 objects
>> contain many similar headers.
>> I could be wrong, but I think that browsers
>> are aware of the fact that they're fetching many objects at once in
>> most situations (e.g. images in an inline catalogue).
>>
>> Thus maybe we should think a different way: initially the web was
>> designed to retrieve one object at a time and it made sense to have
>> one request, one response. Now we have much more content and we
>> want many objects at once to load a page. Why not define that as the
>> standard way to load pages and bring in the ability to load *groups*
>> of objects?
>>
>> We could then send a request for several objects at once, all using
>> the same (encoded) headers, plus maybe additional per-object headers.
>> The smallest group is one object and works like today. But when you
>> need 10 images, 3 CSS and 2 JS, maybe it makes sense to send only 1, 2
>> or 3 requests. We would also probably find it useful to define
>> a base for common objects.
>>
>> We could then see requests like this:
>>
>>   group 1
>>     header fields ...
>>     base http://static.example.com/images/articles/20130122/
>>     req1: GET corner-left.jpg
>>     req2: GET corner-right.jpg
>>     req3: GET center-banner.jpg
>>     req4: GET company-logo.png
>>
>> etc...
>>
>> Another big benefit I'm seeing there is that it's easy to switch from 1.1
>> to/from this encoding. Also, intermediaries and servers will do much less
>> request processing because they don't have to revalidate all headers each
>> time. The Host header would only be validated/rewritten once per group,
>> cookies would be matched once per group, etc.
>>
>> It would be processed exactly like pipelining, with responses delivered
>> in the same order as the requests. Intermediaries could even split that
>> into multiple streams to forward some of the requests to some servers and
>> the other ones to other servers. Having the header fields and base URI
>> before the requests makes that easy because once they're passed, you can
>> read all requests as they come without the need for additional buffering.
>>
>> When you have an ETag or a date for an object, its I-M-S/I-N-M values
>> would be passed along with the request and not with the group.
>>
>> I think this should often be more efficient than brute compression and
>> still probably compatible with it.
>>
>> What do you think?
>>
>> Willy
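To make the grouped-request example quoted above concrete, here is a minimal
sketch in Python. It is not part of the original thread; the header values,
field layout and byte accounting are purely illustrative assumptions. It
expands a group (shared headers plus a base URI, sent once) back into
ordinary per-object requests and roughly compares the header bytes against
repeating the full header block for every object:

```python
# Illustrative sketch only: not a real SPDY/HTTP encoding, just a toy model
# of the grouped-request idea (shared headers and base URI sent once, then
# one short entry per requested object).

SHARED_HEADERS = {
    "host": "static.example.com",
    "user-agent": "ExampleBrowser/1.0",                     # hypothetical
    "accept": "image/*",
    "cookie": "session=0123456789abcdef0123456789abcdef",   # repeated blob
}
BASE = "http://static.example.com/images/articles/20130122/"
OBJECTS = [
    "corner-left.jpg",
    "corner-right.jpg",
    "center-banner.jpg",
    "company-logo.png",
]


def expand_group(shared, base, objects):
    """Expand one group into ordinary per-object GET requests."""
    return [("GET", base + name, dict(shared)) for name in objects]


def header_block_size(headers):
    """Rough size of a 1.1-style header block: 'name: value\r\n' per field."""
    return sum(len(k) + 2 + len(v) + 2 for k, v in headers.items())


# Today: every request repeats the full header block and the full URI.
repeated = sum(
    len("GET ") + len(url) + header_block_size(hdrs)
    for _, url, hdrs in expand_group(SHARED_HEADERS, BASE, OBJECTS)
)

# Grouped: shared headers and base URI appear once, short names afterwards.
grouped = header_block_size(SHARED_HEADERS) + len(BASE) + sum(
    len("req: GET ") + len(name) for name in OBJECTS
)

print(f"repeated headers: ~{repeated} bytes")
print(f"grouped encoding: ~{grouped} bytes")
for method, url, _ in expand_group(SHARED_HEADERS, BASE, OBJECTS):
    print(method, url)
```

The exact numbers depend entirely on the header set, but the shape of the
saving is the same one the SPDY measurements quoted above attribute to
stateful compression: the repeated opaque blobs cross the wire once instead
of once per object.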
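Roberto's caveat at the top of the message, that automatic path-prefix
grouping is subject to the CRIME exploit, comes down to a compression-length
oracle. A minimal sketch of that oracle, assuming a toy payload in which an
attacker-influenced path shares a compression context with a secret cookie
(none of these values come from the thread):

```python
# Illustrative sketch of a CRIME-style length oracle: when attacker-chosen
# bytes (the path) and secret bytes (the cookie) share one compression
# context, a guess that matches the secret compresses to fewer bytes,
# leaking the secret through the observable message length.
import zlib

SECRET_COOKIE = "cookie: session=s3cretvalue"  # hypothetical secret


def compressed_len(attacker_path: str) -> int:
    payload = f"GET {attacker_path} HTTP/1.1\r\n{SECRET_COOKIE}\r\n".encode()
    return len(zlib.compress(payload, 9))


# The matching guess is back-referenced against the cookie and compresses
# better; real attacks refine the guess incrementally with careful padding.
for guess in ("session=qwertyuiopz", "session=s3cretvalue"):
    print(guess, "->", compressed_len(f"/track?{guess}"), "bytes")
```

That is the distinction the message draws between grouping defined by the
content developer and grouping inferred automatically by the browser from
path prefixes.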
Received on Tuesday, 22 January 2013 21:54:46 UTC