WebP versus RIFF from Bjoern Hoehrmann on 2011-03-01 (www-archive@w3.org from March 2011)

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Tue, 01 Mar 2011 18:41:30 +0100
To: www-archive@w3.org
Message-ID: <8r9qm6lp39uom07jkmm41ku57o53no1ov4@hive.bjoern.hoehrmann.de>
Hi,

  I tried to make an ad-hoc "WebP" encoder by wrapping a raw VP8
bitstream into a RIFF container according to the documentation available
at <http://code.google.com/speed/webp/docs/riff_container.html>. For
this I used FFmpeg, as in

  % ffmpeg -i input.png -f rawvideo -vcodec libvpx output.vp8

And then I made a Perl script that prints out

  pack "a4Va4a4V", "RIFF", 12+$size, "WEBP", "VP8 ", $size;

where $size is the size of the .vp8 file. Further, if $size is odd,
I append a 0x00 byte at the end. For my test image FFmpeg did create a
VP8 file with an odd number of bytes.
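
For readers who do not use Perl, the same wrapping can be sketched in
Python. This is only an illustration of the steps described above, not
the original script: the function name `wrap_vp8_in_riff` is made up
here, and the sketch mirrors the message in counting the pad byte in
neither size field.

```python
import struct

def wrap_vp8_in_riff(vp8: bytes) -> bytes:
    """Wrap a raw VP8 bitstream as described in the message (hypothetical helper)."""
    size = len(vp8)
    # Same layout as the Perl template "a4Va4a4V": four-byte tags and
    # little-endian unsigned 32-bit sizes, with no alignment padding.
    header = struct.pack("<4sI4s4sI", b"RIFF", 12 + size, b"WEBP", b"VP8 ", size)
    # A single zero byte follows the payload when its length is odd;
    # the size fields above do not count it.
    pad = b"\x00" if size % 2 else b""
    return header + vp8 + pad
```

For a three-byte payload this yields 20 header bytes, the payload, and
one pad byte, with the "VP8 " chunk size still recorded as 3.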

According to "Multimedia Programming Interface and Data Specifications
1.0" issued by Microsoft and IBM in 1991:

  The basic building block of a RIFF file is called a chunk. Using C
  syntax, a chunk can be defined as follows:

  typedef unsigned long DWORD;
  typedef unsigned char BYTE;
  
  typedef DWORD FOURCC;  // Four-character code
  typedef FOURCC CKID;   // Four-character-code chunk identifier
  typedef DWORD CKSIZE;  // 32-bit unsigned size value
  
  typedef struct {         // Chunk structure
    CKID   ckID;           // Chunk type identifier
    CKSIZE ckSize;         // Chunk size field (size of ckData)
    BYTE   ckData[ckSize]; // Chunk data
  } CK;

  ...

  ckID
    A four-character code that identifies the representation of the
    chunk data data. A program reading a RIFF file can skip over any
    chunk whose chunk ID it doesn't recognize; it simply skips the
    number of bytes specified by ckSize plus the pad byte, if present.

  ckSize
    A 32-bit unsigned value identifying the size of ckData. This size
    value does not include the size of the ckID or ckSize fields or the
    pad byte at the end of ckData.

  ckData
    Binary data of fixed or variable size. The start of ckData is
    word-aligned with respect to the start of the RIFF file. If the
    chunk size is an odd number of bytes, a pad byte with value zero is
    written after ckData. Word aligning improves access speed (for
    chunks resident in memory) and maintains compatibility with EA IFF.
    The ckSize value does not include the pad byte.
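
Read literally, the quoted definitions suggest a chunk walker along the
following lines. This is a minimal Python sketch under the assumption of
little-endian size fields (as in "RIFF" files); the name `iter_chunks`
is invented for illustration and is not taken from any library.

```python
import struct

def iter_chunks(buf: bytes, offset: int = 0):
    """Yield (ckID, ckData) pairs from a sequence of chunks (illustrative only)."""
    while offset + 8 <= len(buf):
        # ckID is a four-character code, ckSize a 32-bit unsigned value.
        ck_id, ck_size = struct.unpack_from("<4sI", buf, offset)
        start = offset + 8
        # ckSize covers ckData only, so the slice excludes any pad byte.
        yield ck_id, buf[start:start + ck_size]
        # Skip ckSize bytes plus the pad byte, which is present when ckSize is odd.
        offset = start + ck_size + (ck_size & 1)
```

A reader built this way can step over chunks it does not recognise and
still hand back odd-sized chunk data unchanged.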

Excluding the padding from the chunk size is necessary to ensure that
you can encode arbitrary chunk data. If you include the pad byte in the
chunk size, it becomes impossible to tell whether the pad byte is a pad
byte or part of the data. So obviously my code does not include padding
in the size fields.
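
To make the ambiguity concrete, here is a small Python sketch that
encodes one chunk under both conventions. The helper `encode_chunk`, its
`count_pad` flag, and the chunk ID "test" are assumptions made purely
for this demonstration.

```python
import struct

def encode_chunk(ck_id: bytes, data: bytes, count_pad: bool) -> bytes:
    """Encode a single chunk; count_pad selects whether ckSize includes the pad byte."""
    pad = b"\x00" * (len(data) & 1)
    size = len(data) + (len(pad) if count_pad else 0)
    return struct.pack("<4sI", ck_id, size) + data + pad

three_bytes = b"abc"       # odd-sized data, needs a pad byte
four_bytes = b"abc\x00"    # even-sized data that happens to end in zero

# Pad byte excluded from ckSize: the two encodings differ (sizes 3 and 4).
spec_distinct = (encode_chunk(b"test", three_bytes, False)
                 != encode_chunk(b"test", four_bytes, False))

# Pad byte included in ckSize: both encode to the same bytes, so a reader
# cannot tell whether the final zero is padding or data.
padded_collide = (encode_chunk(b"test", three_bytes, True)
                  == encode_chunk(b"test", four_bytes, True))
```

Both flags come out true: only the convention that excludes the pad byte
keeps the two payloads distinguishable.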

I then tried to decode the result using `libwebp-0.1`, which failed
without much of an error message. After some debugging it turned out that
Google thinks that "The RIFF specification requires that all chunks are
even-sized" and as a result their `libwebp-0.1` cannot handle odd-sized
VP8 streams, despite them being obviously allowed and necessary.

So, if you consider the original RIFF specification authoritative, the
padding that Google's tools need cannot be the proper RIFF padding, but
must be considered specific to the "VP8 " data encoding. I am not sure
what the VP8 specification says about the trailing 0x00 byte in a VP8
bitstream, but I would suspect it's an incomplete frame header...

regards,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
Received on Tuesday, 1 March 2011 17:48:36 UTC