
RE: approaches to recording

From: Jim Barnett <Jim.Barnett@genesyslab.com>
Date: Wed, 17 Oct 2012 05:21:48 -0700
Message-ID: <E17CAD772E76C742B645BD4DC602CD8106DB5BBC@NAHALD.us.int.genesyslab.com>
To: "Timothy B. Terriberry" <tterriberry@mozilla.com>, <public-media-capture@w3.org>
Cc: <public-media-capture@w3.org>
Are there encodings that can handle dynamically adding tracks that do _not_ support streaming encoding? Streaming encoding may well belong at the Track level, anyway. I think that its primary purpose is to support media processing (including speech recognition), and for those uses you want to get pure audio, pure video, etc.

- Jim

-----Original Message-----
From: Timothy B. Terriberry [mailto:tterriberry@mozilla.com] 
Sent: Tuesday, October 16, 2012 9:23 PM
To: public-media-capture@w3.org
Cc: public-media-capture@w3.org
Subject: Re: approaches to recording

Robert O'Callahan wrote:
> One option would be to have two APIs to initialize the recorder: one 
> which takes a MediaStream, and one which takes a list of 
> MediaStreamTracks. The former has the possibility of new tracks being 
> added dynamically, and the latter does not. The UA could reject 
> requests if the format doesn't support the specified tracks, or if the 
> format doesn't support addition of tracks and the author passed in a MediaStream.

I think it would be helpful for the group to outline some of the capabilities of the formats that exist today.

The only container I'm aware of that _might_ be able to support dynamically adding tracks _and_ supports "streaming encoding" (by which I mean that the browser can hand you pieces of the data as it encodes them, and the application can reconstruct a valid file from them by concatenating them all afterwards) is MPEG Transport Streams. Support for MPEG TS, especially if you want it to do anything even slightly complicated, is universally bad (it is the only format that media developers can all agree is more difficult to deal with than Ogg).
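
To make the "streaming encoding" property above concrete, here is a minimal Python sketch. The packet framing (a 4-byte big-endian length prefix) and the names `encode_packets` and `parse_file` are hypothetical and do not correspond to any real container. The only point is that each chunk is self-delimiting, so concatenating the chunks in order gives a parseable file without going back to patch anything written earlier.

```python
import struct
from typing import Iterable, Iterator, List


def encode_packets(payloads: Iterable[bytes]) -> Iterator[bytes]:
    """Yield one self-delimiting chunk per payload as soon as it is ready."""
    for payload in payloads:
        # Toy framing (an assumption, not a real format): length prefix + data.
        yield struct.pack(">I", len(payload)) + payload


def parse_file(data: bytes) -> List[bytes]:
    """Recover the payloads from a file built by concatenating chunks."""
    packets = []
    offset = 0
    while offset < len(data):
        (size,) = struct.unpack_from(">I", data, offset)
        offset += 4
        packets.append(data[offset:offset + size])
        offset += size
    return packets


# The application receives chunks one at a time and simply joins them later.
chunks = list(encode_packets([b"audio-0", b"video-0", b"audio-1"]))
reconstructed = b"".join(chunks)
recovered = parse_file(reconstructed)
```

Nothing in a chunk depends on the total length of the file, which is what allows the encoder to hand data out before recording has finished.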

Both Ogg and Matroska (WebM) can change the track list, but only by ending all currently active tracks and starting a new segment of the file with a completely new set of tracks. However, even though this is technically possible in the file format, almost no software actually supports it, except perhaps in the limited case where the number, types, and formats of the tracks in each segment do not actually change.

There are three other general-purpose containers I'm aware of that actually support streaming encoding: ASF, Matroska, and Ogg. ASF does not get used very much anymore, and I don't really know much about it, so I will ignore it. Although Matroska does support streaming encoding, it cannot include an index (because you can't store the index at the beginning and putting it at the end requires a pointer at the beginning to say where it is, which you also can't include without knowing how big the file will be in advance). I'm not aware of any software that can seek in Matroska files without an index except by linearly scanning through the entire file (this is, e.g., what Chrome does; Firefox refuses to seek in them entirely), with the sole exception of the Haali DirectShow filters available on Windows. Ogg files, by comparison, normally do not contain an index, so seeking in Ogg files produced this way is supported relatively well in comparison.

RIFF (WAV) can support streaming encoding, but only by lying about the size of the DATA chunk. Support for reading files of this type is hit-or-miss. There is no standard for it, so different applications lie in different ways, and even software we at Xiph have deployed and maintained for over a decade still occasionally runs into files which confound its attempt to handle this (as recently as a month or two ago). 
I am not aware of any attempt to support streaming encoding for RIFF in AVI (i.e., video), though some software can support playing (but not seeking in) AVI files without an index.
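
As an illustration of "lying about the size of the DATA chunk", the following Python sketch builds a plain 44-byte PCM WAV header before any audio is known. The placeholder value `0xFFFFFFFF` is only one of the conventions seen in practice (other writers use 0), and the function name `streaming_wav_header` is hypothetical. Since there is no standard for this, a reader cannot rely on either value.

```python
import struct

# Assumption: one placeholder convention among several; some writers use 0.
UNKNOWN_SIZE = 0xFFFFFFFF


def streaming_wav_header(sample_rate: int, channels: int, bits_per_sample: int) -> bytes:
    """Build a 44-byte PCM WAV header whose size fields are placeholders."""
    block_align = channels * bits_per_sample // 8
    byte_rate = sample_rate * block_align
    return (
        # RIFF chunk: total size is unknown while streaming.
        b"RIFF" + struct.pack("<I", UNKNOWN_SIZE) + b"WAVE"
        # fmt chunk: 16 bytes of PCM (format tag 1) parameters.
        + b"fmt " + struct.pack("<IHHIIHH", 16, 1, channels, sample_rate,
                                byte_rate, block_align, bits_per_sample)
        # data chunk: size is unknown too, so the header "lies".
        + b"data" + struct.pack("<I", UNKNOWN_SIZE)
    )


header = streaming_wav_header(sample_rate=48000, channels=2, bits_per_sample=16)
```

The samples would then be appended after this header for as long as the recording runs. A reader has to notice the placeholder and treat the data chunk as extending to the end of the stream.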

QuickTime (MP4) is completely non-streamable. The index is the _only_ thing in the file that tells the decoder where the packet boundaries are, and you can't write it out until you've seen all of the packets. 
HTTP live streaming is starting to use a "segmented MP4" format which chops data into separate MP4 files each 10 seconds long or so, but player support for files like that is extremely limited, the latency it introduces is quite significant, and using smaller chunks would add excessive overhead.
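
The dependence on the index described above can be sketched with a toy model in Python. This is not the real MP4 box layout: `mux_with_trailing_index` and `split_with_index` are hypothetical names, and the list of sizes merely stands in for the sample-size table. It shows why the index cannot be written until every packet has been seen, and why the payload alone is not decodable.

```python
from typing import List, Tuple


def mux_with_trailing_index(packets: List[bytes]) -> Tuple[bytes, List[int]]:
    """Store packets back to back; the size table needs every packet first."""
    payload = b"".join(packets)          # no per-packet framing in the data
    index = [len(p) for p in packets]    # toy stand-in for a sample-size table
    return payload, index


def split_with_index(payload: bytes, index: List[int]) -> List[bytes]:
    """Recover packet boundaries, which only the index records."""
    out = []
    offset = 0
    for size in index:
        out.append(payload[offset:offset + size])
        offset += size
    return out


payload, index = mux_with_trailing_index([b"abc", b"de", b"fghi"])
packets = split_with_index(payload, index)
```

Given only `payload`, there is no way to tell where one packet ends and the next begins, which is the sense in which the format is non-streamable.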

Received on Wednesday, 17 October 2012 12:23:09 UTC
