Re: Performance implications of Bundling and Minification on HTTP/1.1

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Tue, 26 Jun 2012 00:00:18 +0200
To: Henrik Frystyk Nielsen <henrikn@microsoft.com>
Cc: HTTP Working Group <ietf-http-wg@w3.org>, Howard Dierking <howard@microsoft.com>
Message-ID: <4dihu7hp68rq3rh1v9jkflbj3ed47fce8b@hive.bjoern.hoehrmann.de>
* Henrik Frystyk Nielsen wrote:
>[1] http://blogs.msdn.com/b/henrikn/archive/2012/06/17/performance-implications-of-bundling-and-minification-on-http.aspx

"We only looked at CSS and JS coming from the same DNS domain (i.e. for
digg.com we looked at anything under *.digg.com." As far as I can tell,
http://digg.com/ does not load any scripts from digg.com; most
digg-specific scripts seem to come from *.diggstatic.com instead. The
same goes for many of the other examples: http://www.bbc.co.uk/ loads
from *.bbcimg.co.uk and *.static.bbci.co.uk, http://www.huffingtonpost.com/
loads from *.huffpost.com (and a single one from *.huffingtonpost.com),
and so on. At least for the BBC I am quite sure this has not changed in
recent weeks, so it seems a different measure was used than the above.
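
To make the mismatch concrete, here is a minimal sketch of what a literal
"anything under *.digg.com" rule would count. The helper name and the
script URLs are made up for illustration; only the domain names come
from the observations above, and this is not the blog post's actual
methodology.

```python
from urllib.parse import urlsplit


def host_under(url: str, domain: str) -> bool:
    """Return True if the URL's host is `domain` or a subdomain of it."""
    host = (urlsplit(url).hostname or "").lower()
    domain = domain.lower()
    # Require a dot boundary so "diggstatic.com" does not match "digg.com".
    return host == domain or host.endswith("." + domain)


# Hypothetical script URLs (paths and subdomains are invented).
scripts = [
    "http://cdn1.diggstatic.com/js/example.js",
    "http://digg.com/js/example.js",
    "http://static.digg.com/js/example.js",
]

# Only the last two would be counted under a "*.digg.com" rule.
counted = [u for u in scripts if host_under(u, "digg.com")]
```

Under such a rule, scripts served from *.diggstatic.com would be left
out entirely, which is why the published numbers are hard to match.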

I am not sure why the bundles are several kilobytes bigger than the sum
of the individual sizes; it would seem to take about one byte per file
to concatenate them (e.g., bloomberg.com has "js size (kb)" 408 and "js
bundle size (kb)" 410, which would make for at least a kilobyte of
difference even allowing for rounding). If headers were counted, the
size should go down, as the bundle comes with less header overhead.
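
The arithmetic behind that expectation can be sketched as follows. The
file sizes and the per-response header size are invented numbers, chosen
only so that the bodies sum to 408,000 bytes; the one-byte separator is
the assumption stated above.

```python
def bundle_body_size(file_sizes, separator_bytes=1):
    """Body size of a bundle made by joining files with a separator."""
    if not file_sizes:
        return 0
    return sum(file_sizes) + separator_bytes * (len(file_sizes) - 1)


def total_with_headers(body_sizes, header_bytes_per_response):
    """Bytes transferred when every body travels in its own response."""
    return sum(body_sizes) + header_bytes_per_response * len(body_sizes)


files = [120_000, 200_000, 88_000]  # hypothetical script sizes in bytes
HEADER = 300                        # hypothetical response header size

separate = total_with_headers(files, HEADER)
bundled = total_with_headers([bundle_body_size(files)], HEADER)
```

With these numbers the bundle body grows by two bytes, while the total
including headers shrinks, so a bundle that is two kilobytes larger
needs some other explanation.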

"In addition to removing white space, minification typically shortens
variable names and other identifiers and removes pieces that are not
used." I do not think changing variable names is "typical" for
minification. I am not sure what to look at due to the first issue
above, but the minification savings seem unreasonably high. As far as
comment removal and white space normalization go, the sites seem to
mostly use minified scripts already. They might not try to shorten
variable names, but the referenced blog posting explaining minification
doesn't really mention that as an optimization either. For CSS the
results are closer, but for, say, http://www.bbc.co.uk/ I get more like
13% minification savings than the 26% in the blog posting (results vary
with browsers and other things; there are, for instance, "conditional
comments" that hide some references to external content from some
browsers).

So overall the trend seems right, but this isn't very reproducible. It's
also somewhat disappointing to see that, for instance, digg.com loads an
old version of jQuery and various jQuery extensions ("jScrollPane",
"AJAX Upload", "In-Field Label", and so on) unminified and with full
comments when loading the front page. Last year I wrote a tool,
http://bjoern.hoehrmann.de/pngwolf/, that strips several kilobytes off
google.com, but I guess collectively we are not quite at the point
where we'd care about that.

(My general impression is that these kinds of optimizations are
underdeveloped because they are quite hard to automate. With HTTP
compression, early on there were many bugs in browsers and
intermediaries, and in the caches in both, and there were concerns about
drive space costs and CPU consumption. Nowadays, to pick the example of
PNG optimization, there is a lack of "cooperation": tools that
outperform my `pngwolf` tend to be scripts that use multiple
optimization tools. `pngwolf` selects the best scanline filters;
kzip/pngout has the best Deflate implementation as far as size is
concerned; various tools compete in the area of selecting between
palette images and RGB/A images for lossless recompression; there are
tools to optimize Huffman tables that work very well but happen to be
unfree and closed source, so I can't link them into `pngwolf`; and so
on. And, more importantly, the tools go largely unused. Google's
mod_pagespeed, for instance, includes OptiPNG but runs it at settings
that favour compression throughput over compression ratio, mainly, I
suspect, because amortizing the cost in the form of some cache is hard
to implement, with proper cache invalidation and whatever else might be
needed. In my case of `pngwolf`, I was even surprised how trivial it was
to come up with a scanline filter selection that outperforms the only
known one, the one offered in the specification, and how even expensive
commercial tools fail to use even trivial heuristics. Overall, there is
a lack of tools to tell when your site is much slower/bigger/... than
necessary, a problem that starts with HTML and CSS being hard to parse
properly.)
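
For readers unfamiliar with scanline filter selection, below is a
minimal sketch of the baseline heuristic, assuming "the one offered in
the specification" refers to the PNG specification's suggestion of
trying all five filter types per scanline and keeping the one with the
smallest sum of absolute filtered values (bytes read as signed). The
function names are made up for illustration, and this is not how
`pngwolf` itself chooses filters.

```python
def paeth(a, b, c):
    """Paeth predictor from the PNG spec (a=left, b=above, c=upper-left)."""
    p = a + b - c
    pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
    if pa <= pb and pa <= pc:
        return a
    return b if pb <= pc else c


def filter_scanline(ftype, cur, prev, bpp):
    """Apply PNG filter type 0..4 (None, Sub, Up, Average, Paeth)."""
    out = bytearray(len(cur))
    for i, x in enumerate(cur):
        a = cur[i - bpp] if i >= bpp else 0   # byte to the left
        b = prev[i]                           # byte above
        c = prev[i - bpp] if i >= bpp else 0  # byte above and to the left
        pred = (0, a, b, (a + b) // 2, paeth(a, b, c))[ftype]
        out[i] = (x - pred) & 0xFF
    return bytes(out)


def cost(filtered):
    """Sum of absolute values, treating each byte as a signed value."""
    return sum(v if v < 128 else 256 - v for v in filtered)


def choose_filter(cur, prev, bpp):
    """Pick the filter type with the lowest cost (ties go to the lower type)."""
    return min(range(5), key=lambda f: cost(filter_scanline(f, cur, prev, bpp)))
```

For a first scanline holding a smooth horizontal gradient, with `prev`
all zeros as the specification prescribes for the first row, this picks
Sub; for a scanline identical to the one above it, it picks Up. The
heuristic only looks at one scanline at a time and never at the
resulting Deflate output, which leaves room for better selections.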
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Monday, 25 June 2012 22:00:46 GMT
