RE: MediaStream Recording : record() method

There is an outstanding bug for WebRTC to have a TCP fallback:  see https://www.w3.org/Bugs/Public/show_bug.cgi?id=20818.  It has not been closed yet, so I assume that a TCP-based PC is still on the table for WebRTC.  This should be sufficient for lossless real-time data streaming in my opinion, and satisfy the speech recognition use case.

The timeframe for when we'll see TCP-based PC implementations is an unknown (at least to me).

From: Greg Billock [mailto:gbillock@google.com]
Sent: Wednesday, August 28, 2013 10:40 AM
To: Robert O'Callahan
Cc: public-media-capture@w3.org
Subject: Re: MediaStream Recording : record() method



On Mon, Aug 26, 2013 at 6:46 PM, Robert O'Callahan <robert@ocallahan.org> wrote:
On Tue, Aug 27, 2013 at 12:36 PM, Greg Billock <gbillock@google.com> wrote:
First, I think the recording API should be tuned for latent behavior, so that the recording product can be stored to long-term storage. That is, an app shouldn't expect to be able to get real-time recording from the API. For use cases where that's important, we have PeerConnection.

Depends on what you mean by "real-time recording". Using MediaRecorder to losslessly stream data as close to real time as possible is a valid use-case --- our WebRTC people have told me that lossless PeerConnection is not going to happen anytime soon, if ever.

Real-time to me means "adapting lossiness to ensure minimum latency" -- that is, what PeerConnection does, and why it may well never support uniform compression methods.

On an unloaded high-end machine, I agree that MediaRecorder ought to produce very consistent and relatively low-latency callbacks, so the stream will end up being pretty close to real-time. Our plan in Chrome is to buffer through long-term storage, so that the pressures of the recording process are handled consistently and the constraints on the app using the API are less stringent. That approach is optimized for non-real-time recording, which I think is fine, as that's the sweet spot for the API. The advice would be "if you're using MediaRecorder to essentially write your own PeerConnection, you're doing it wrong." On the other hand, if you have a good system and a good connection, you should be able to, for example, upload a recorded stream to cloud storage in quasi-real-time with good responsiveness.
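
For illustration, a minimal sketch of that quasi-real-time upload pattern (the storage URL and mimeType are assumptions, and the UA may deliver chunks later or larger than requested):

// Sketch only: upload each recorded chunk as it arrives.
async function recordAndUpload(stream: MediaStream): Promise<MediaRecorder> {
  const recorder = new MediaRecorder(stream, { mimeType: "video/webm" });
  let chunkIndex = 0;

  recorder.ondataavailable = (event: BlobEvent) => {
    if (event.data.size === 0) return;
    // Each Blob is roughly one timeslice worth of data; push it to storage
    // (hypothetical endpoint) without waiting for the recording to finish.
    fetch(`https://storage.example.com/upload?chunk=${chunkIndex++}`, {
      method: "PUT",
      body: event.data,
    }).catch((err) => console.error("chunk upload failed", err));
  };

  // Request a dataavailable event roughly every second; under load the UA
  // may deliver fewer, larger chunks, so this is a hint, not a guarantee.
  recorder.start(1000);
  return recorder;
}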




This means that not all timeslice values will be honored. We could look at language like the setTimeout() spec uses -- where timeout values set inside nested setTimeout calls won't be honored if they are less than 4ms, and are instead clamped up to 4ms.

Here we could use something more UA-specific: either say that the timeslice value is to be regarded as a hint, or just say up front that sufficiently low values of |timeslice| won't be honored and some UA-specific minimum will be used instead. It'd be good to mention in the spec that apps should not attempt to rely on the API for real-time behavior.

Concrete edit:

3. If the timeSlice argument has been provided, then once timeSlice milliseconds of data have been collected, raise a dataavailable event containing the Blob of collected data, and start gathering a new Blob of data. Otherwise (if timeSlice has not been provided), continue gathering data into the original Blob.

could become:

3. If the timeSlice argument has been provided, then once at least timeSlice milliseconds of data (or the minimum time slice imposed by the user agent, whichever is greater) have been collected, raise a dataavailable event containing the Blob of collected data, and start gathering a new Blob of data. Otherwise (if timeSlice has not been provided), continue gathering data into the original Blob. Callers should not rely on the exactness of the timeSlice value, especially if the timeSlice value is small. Callers should consider timeSlice a minimum value.

That sounds reasonable to me.
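
A caller-side sketch of that "minimum, not exact" reading (handleChunk and the 100 ms request are just illustrative) -- measure the actual delivery interval rather than assuming the requested one:

// Sketch: treat timeSlice as a lower bound and measure real delivery times.
function handleChunk(data: Blob, actualSliceMs: number): void {
  console.log(`got ${data.size} bytes after ${actualSliceMs.toFixed(0)} ms`);
}

async function startRecording(): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
  const recorder = new MediaRecorder(stream);
  const requestedSliceMs = 100; // may well be below the UA's own minimum
  let lastDelivery = performance.now();

  recorder.ondataavailable = (event: BlobEvent) => {
    const now = performance.now();
    // Chunks can arrive later (and larger) than requested, so buffer based on
    // the measured interval instead of assuming one Blob per 100 ms.
    handleChunk(event.data, now - lastDelivery);
    lastDelivery = now;
  };

  recorder.start(requestedSliceMs);
}
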
Shouldn't this be a "should" requirement? The UA should maintain fidelity to the inputs, but suppose an app configures the recorder to fold multiple superimposed audio tracks into a format that doesn't support such superposition. That seems like a pretty nice use case (web-based audio editing) that apps would want to deliberately support.

The original wording about Tracks makes it sound like playback would produce either the same JS objects that went in, or ones with the same bits. I don't think that's the intended interpretation. A better interpretation is that the recorder should represent all recorded tracks in the output (not, for example, just the first track in a set).

Concrete edit:

"The UA should record the MediaStream in such a way that all compatible Tracks in the original are represented at playback time. The UA should do so in the highest fidelity to the input track composition which it can, given recording options and output format."

That sounds reasonable to me too.
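
As a rough sketch of the multi-track case under that wording (the Web Audio oscillator is just an illustrative second track):

// Sketch: record a stream carrying two audio tracks. Under the proposed
// "should" language, the UA represents both as faithfully as the chosen
// output format allows; a single-track container may force them to be mixed.
async function recordTwoTracks(): Promise<MediaRecorder> {
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });

  // Second audio track synthesized with Web Audio.
  const ctx = new AudioContext();
  const osc = ctx.createOscillator();
  const dest = ctx.createMediaStreamDestination();
  osc.connect(dest);
  osc.start();

  const combined = new MediaStream([
    ...mic.getAudioTracks(),
    ...dest.stream.getAudioTracks(),
  ]);

  const recorder = new MediaRecorder(combined);
  recorder.ondataavailable = (event: BlobEvent) => {
    // How both tracks appear in event.data depends on the container and the
    // fidelity the UA can achieve, per the proposed requirement above.
    console.log("recorded blob:", event.data.size, "bytes");
  };
  recorder.start();
  return recorder;
}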

Rob
--
