- From: Aryeh Gregor <Simetrical+w3c@gmail.com>
- Date: Mon, 6 Sep 2010 15:19:08 -0400
On Mon, Sep 6, 2010 at 4:14 AM, Philip Jägenstedt <philipj at opera.com> wrote:
> The Ogg page begins with the 4 bytes "OggS", which is what Opera (GStreamer)
> checks for. For additional safety, one could also check for the trailing
> version indicator, which ought to be a NULL byte for current Ogg. [1] [2]

"OggS\0" as the first five bytes seems safe to check for. It's rather short, I guess because it's repeated on every page, but five bytes is long enough that it should occur by chance only negligibly often, in either text or binary files.

> For WebM, the first 4 bytes are the EBML header: the bytes 0x1A, 0x45, 0xDF,
> 0xA3. [3] The EBML DocType in the header must be "webm". Since parsing the
> EBML header is a little bit complicated, Opera (GStreamer) simply checks for
> the string "webm" somewhere in the header. I've heard rumors that WebM files
> are allowed to contain arbitrary garbage before the EBML header, but this is
> something we happily ignore, i.e., such files would fail to play in Opera,
> regardless of MIME type. I haven't encountered any such files yet, and think
> that browsers should not support this "feature".
>
> [1] http://www.xiph.org/ogg/doc/framing.html#page_header
> [2] http://www.xiph.org/ogg/doc/rfc3533.txt
> [3] http://ebml.sourceforge.net/specs/

It looks like you could check for 0x1a 0x45 0xdf 0xa3 as the first four bytes, followed by 0x42 0x82 0x84 "webm" somewhere in the first 255 bytes or whatever. (0x42 0x82 is the DocType marker, and 0x84 is the length, encoded UTF-8 style: 1 for a one-byte length, 0000100 for the actual length of 4.) That seems very safe. If WebM allows degenerate stuff that makes sniffing hard, we can just prohibit it in the WebM spec, I assume.
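
For concreteness, here is a rough Python sketch of the two checks as I read them above. It is only an illustration under some assumptions, not what any browser actually does: the function name `sniff_media_type`, the 255-byte window constant, and the returned type strings are all made up for the example, and it deliberately rejects files with anything before the EBML header.

```python
from typing import Optional

# Byte patterns taken from the discussion above.
OGG_MAGIC = b"OggS\x00"              # capture pattern plus version byte 0
EBML_MAGIC = b"\x1a\x45\xdf\xa3"     # EBML header ID
# DocType element ID (0x42 0x82), length byte 0x84 (marker bit 1 + 7-bit
# length 4), then the four-byte DocType value "webm".
WEBM_DOCTYPE = b"\x42\x82\x84webm"

# Assumption: "the first 255 bytes or whatever" is taken literally here.
SNIFF_WINDOW = 255


def sniff_media_type(data: bytes) -> Optional[str]:
    """Return a guessed media type for the leading bytes, or None.

    Hypothetical helper: the returned strings are illustrative labels.
    """
    if data.startswith(OGG_MAGIC):
        return "application/ogg"
    # The DocType pattern must lie entirely within the sniffing window
    # and come after the four-byte EBML header ID.
    if data.startswith(EBML_MAGIC) and WEBM_DOCTYPE in data[len(EBML_MAGIC):SNIFF_WINDOW]:
        return "video/webm"
    return None
```

Note that a Matroska file that is not WebM also starts with the EBML header ID, so the DocType check is what tells the two apart in this sketch.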
Received on Monday, 6 September 2010 12:19:08 UTC