Notes on `pngwolf` and PNG optimization (draft) from Bjoern Hoehrmann on 2011-03-16 (www-archive@w3.org from March 2011)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 16 Mar 2011 04:42:35 +0100
To: www-archive@w3.org
Message-ID: <7v50o6tsgk3lgeskn93g0ceku0voa81h2b@hive.bjoern.hoehrmann.de>
Hi,

  These are some results of an evaluation of `pngwolf` with respect to
its ability to reduce the file size of PNG images. As input I used Cuty-
Capt, <http://cutycapt.sf.net/>, to visit all "Alexa 1000" web sites and
store all the resources it fetched. Images in the set were roughly
distributed as follows (based on the Content-Type header with junk
omitted):

   5214 image/gif
   4609 image/jpeg
   3404 image/png

The high number of GIf images seemed surprising, so the first thing I
did was turning the GIFs into PNGs and ran a couple of PNG optimiters
over them and compared the file sizes, filtering out images that were
not actually GIFs and also animated GIF images as they don't compare. I
ran into a couple of difficulties with the optimizers, checking a few
samples they seemed to sometimes make bad decisions about how to store,
as in the PNG color type, certain images. Anyway, some of the data I've
collected in this process is available at

  http://www.websitedev.de/temp/gif-vs-png-2.txt.gz

Note that there is a bug in there (it's obvious) where I failed to re-
move a JPEG EXIF image labeled as image/gif from the lot. The columns
are the SHA-1 sum of the data, the size of the PNG, the size of the GIF,
their ratio, and which format won. We can learn from this that PNG does
not do very well with very very small images, but beats GIF all the time
for larger ones. If it weren't for the optimization problems I'd say, if
you have a non-animated GIF that's larger than 200 bytes, you should use
PNG instead.

I then turned attention to the resources labeled image/png. I used the
ones that do not contain "plte" which is the name of palette data chunks
and further filtered by file size, only images larger than 1 KB were se-
lected, and duplicates were removed aswell. Further 17 were removed be-
cause they weren't PNGs, then 40 because they are interlaced, and then a
final 28 that crashed `pngwolf` (it does not catch out of memory errors,
and they generated those by being malformed).

What `pngwolf` does at the moment is simply coalescing the IDAT chunks
in an image, then inflates the image data, changes the scanlines of the
image a bit, compresses it using zlib's Deflate implementation using the
level 5 setting, again and again, improving the filter selection as it
goes, and then saves the image data using 7-Zip's Deflate encoder.

So I ran `pngwolf` over all the images (only the PNGs from the second
set, not the converted GIFs from the first set I wrote about) and had it
collect some data. One thing `pngwolf` does is computing a heuristic se-
lection of filters before it launches a genetic algorithm to find better
combinations. This requires applying all possible filters to each scan-
line, combuting the absolute value of each signed byte, computing the
sum of all the bytes in a scanline, and then picking the filter that has
the lowest sum independently for each scanline.

So as the initial population for the genetic algorithm it uses:

  * All filters use the "none" filter
  * All filters use the "up" filter
  * ...
  * The heuristically derived filter
  * The filters in the original image

Plus a couple of random filters. As noted earlier, to approximate the
quality of a filter sequence the filter sequence is applied to the image
data and then the filtered data is compressed using zlib with the level
five setting. This allows me to explain the first couple of columns in
the table I've attached.

  * The first column contains the SHA-1 sum of the complete image file

  * The second column contains the number of bytes in the deflated IDAT
    data; multiple IDAT chunks are coalesced to determine this number.

  * The next five columns apply `zlib -5` to various filter sequences.
    In order they are the filter sequence in the original image; then
    the heuristically derived filter sequence; then applying the "None"
    filter to all scanlines, the "Paeth" filter to all scanlines, and
    then the best filter sequence `pngwolf` could come up as I'll note
    them in a moment.

The underlying idea is that if filter sequence X compresses better than
filter sequence Y using `zlib -5`, X will also compress better than Y
using some other `zlib` setting or some other Deflate encoder. This is
not always the case, but should be fairly close.

One thing that I've noticed in doing this is that many images on the web
use the heuristic filter even though applying the None filter to all the
scanlines actually makes the images compress better. Further, applying
the None filter to all scanlines, applying the Paeth filter to all, or
using the heuristic performs best (among the simple filters); applying
the Avg filter to all scanlines never performs better than applying any
of the other simple filter sequences, and all-Sub and all-Up are almost
all of the time worse than the other options. So I've included the worst
options from the table.

The next column is what `pngwolf` actually generates. It applies 7-Zip's
Deflate encoder using the equivalent of

  % 7za -mx=9 -mpass=15 -mfb=258 ...

to the image data. This is the maximum setting of the "use higher values
for better compression" settings that are runtime-configurable and that
are documented. I note that these do not always perform best, sometimes
a lower -mx setting compresses better than this extreme setting.

The `pngwolf` settings I've used where `--max-time=120` which means it
will not try much longer than 120 seconds to find better sequences, and
`--max-stagnate-time=30` which means it will abort its search after the
30 seconds specified in the option (whichever happens sooner).

So this column compared to the second specifies how well `pngwolf` does
in a more or less default setting. Then I wondered how other tools may
be able to do better. Generally speaking, only `pngout` can compete on
the minimum size scale against `advpng` (which uses 7-zip aswell) in the
casual tests I've made before this. It is said to have pretty much the
best Deflate implementation, if that is true, it should be able to make
the files even smaller than `pngwolf` can (without changing other things
about the image, like the color type).

The next column specifies the gain, if any, from applying `pngout` after
`pngwolf`. I've discarded entries where `pngout` performed worse than my
`pngwolf` utility, or where it changed the color type (for better or for
worse as `pngwolf` does not compete in that space).

It's still running, but two thirds through the trial it seems clear that
running `pngout` after `pngwolf` rarely does anything. As it is a closed
source tool it's not clear why, what I did find is that performance of
`pngout` somewhat depends on the input, it may for instance be unable to
compress its own output further if you used non-default settings on the
first pass.

To check the impact any changes to the filters `pngout` make I ran the
`pngout` output again through `pngwolf`. It's possible for instance that
it has a better method to pick better filters. `pngwolf` would then pick
those up and generate better output if `zlib -5` is a good estimator and
if the 7-Zip deflate encoder is better than `pngout`'s. That is what the
next two columns indicate giving size and ratio between the two.

Turns out this is not the case, there are number of cases where `pngout`
performs better even if applying `pngwolf` before and after `pngout`. So
either the estimator is bad or `pngout` has a genuinely better Deflate
implementation (as it is said to have, at least for some cases).

The next, blue-ish, column compares the best `pngwolf` output to the o-
riginal image as a percentage. This allows to quickly verify that it is
a generally applicable PNG optimizer. It is very rare that it produces
output that is bigger than the input (note that unlike other tools it
does not remove any chunks from the image other than redundant IDATs).

Ah, I forgot to mention, in the second `pngwolf` pass, I've changed the
settings a bit, in the second run it uses `--max-time=0` and `max-stag-
nate-time=10` so it continues searching until there is a period of ten
seconds where it is unable to improve the filter sequence.

Anyway, as running `pngout` after `pngwolf` provided little additonal
insight, I decided to run `pngout` directly on the original files, weed
out the ones where it changes the color mode, and then compare how it
performs without involving `pngwolf` at all. This allows to compare the
two as "stand-alone" optimizers.

This is still running and I've not yet checked preliminary results. What
I've found before in few-sample tests is that `pngwolf` can optimize a
number of images further than `pngout` can. The main two cases I've used
are an image on my recently re-designed homepage,

  http://bjoern.hoehrmann.de/hd.png

and a sprite map from Google who, I'm told, are "fanatic" about speed.
The `hd.png` of my making is optimized using everything I knew how to
throw at it (960753c562fcee4ad3f16cb6ec2f18028525944d was the sha1 at
the time of writing this). Google's sprite map was

  http://www.google.com/images/srpr/nav_logo37.png

which I was able to optimize a good big using `advpng` (like 3% when it
had sha1 969be47e9e272d3454983948033cc0a92dc133f7). Both images, after
all the optimizations, could be compressed further with `pngwolf` which
I took as confirmation of my approach during early testing.

Well, they are now at `b` in the sha1 order, so I will not that you can
get `pngwolf` at https://github.com/hoehrmann/pngwolf and put "draft" in
the subject (which I might revise completely when making a non-draft!),
and back this up on my collaborative blog.

Writing you some other day (brace for an Excel HTML export!),
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Wednesday, 16 March 2011 03:49:52 UTC