RE: approaches to recording from Mandyam, Giridhar on 2012-10-10 (public-media-capture@w3.org from October 2012)

From: Mandyam, Giridhar <mandyam@quicinc.com>
Date: Wed, 10 Oct 2012 16:35:45 +0000
To: Jim Barnett <Jim.Barnett@genesyslab.com>, "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <CAC8DBE4E9704C41BCB290C2F3CC921A162FB8A7@nasanexd01h.na.qualcomm.com>
For Proposal 1, having multiple record methods seems inefficient, and track-specific recording is not how mobile OSes approach the problem (see http://developer.android.com/reference/android/media/MediaRecorder.html and http://developer.apple.com/library/ios/#documentation/AudioVideo/Conceptual/AVFoundationPG/Articles/04_MediaCapture.html#//apple_ref/doc/uid/TP40010188-CH5-SW2). If my impressions are incorrect, please let me know.

According to http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html#mc9 and http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html#mc10, the requirement is that the UA support recording of multiple devices into both a single recording and separate recordings. If the MediaStream recording API is defined in such a way as to allow a track-specific setting of the video and audio sources, then it seems that you can satisfy the recording of a single track and of multiple tracks with the same method.

Maybe something like (deriving from your proposal):

partial interface MediaStream : EventTarget {
    // MediaStreamTrackList: http://dev.w3.org/2011/webrtc/editor/getusermedia.html#idl-def-MediaStreamTrackList
    void        record (MediaStreamTrackList audioTracks, MediaStreamTrackList videoTracks, optional timeSliceType timeSlice);
    void        stopRecording ();
    readonly attribute Boolean      recording;
    readonly attribute EventHandler onrecording;
    readonly attribute EventHandler onstoprecording;
    readonly attribute EventHandler ondataavailable;
    Formats     getRecordingOptions ();
    void        setRecordingOptions (Formats ChosenFormats);
    void        requestData ();
};

Although the MediaStreamTrackList returned from gUM currently is read only, I assume that you can create a new MediaStreamTrackList that is a subset of what is returned.

I realize that this isn't elegant, given that the tracks passed to the record API in this case should be a subset of the tracks associated with the MediaStream. Otherwise we could be hanging a record method on a MediaStream that records tracks that are not associated with that MediaStream. Nevertheless, if this sticks in people's craw, then we may have to pursue something closer to Proposal 2.
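
To make the subset constraint concrete, here is a rough TypeScript sketch of how a record(audioTracks, videoTracks) call might reject tracks that do not belong to the stream. It is only an illustration: SketchStream, Track, and the error text are hypothetical names, plain arrays stand in for MediaStreamTrackList, and throwing is just one possible way to signal the failure.

```typescript
// Hypothetical minimal types; none of these names come from the proposal.
type Kind = "audio" | "video";

interface Track {
  readonly id: string;
  readonly kind: Kind;
}

class SketchStream {
  private readonly tracks: readonly Track[];
  private isRecording = false;
  private recorded: Track[] = [];

  constructor(tracks: readonly Track[]) {
    this.tracks = tracks;
  }

  get recording(): boolean {
    return this.isRecording;
  }

  // Mirrors record(audioTracks, videoTracks): every requested track must
  // already belong to this stream, otherwise the call is rejected.
  record(audioTracks: readonly Track[], videoTracks: readonly Track[]): void {
    const requested = [...audioTracks, ...videoTracks];
    for (const t of requested) {
      if (!this.tracks.some((own) => own.id === t.id)) {
        throw new Error(`track ${t.id} does not belong to this stream`);
      }
    }
    this.recorded = requested;
    this.isRecording = true;
  }

  // Returns the tracks that were being recorded (assumption for the demo;
  // the proposal's stopRecording returns void).
  stopRecording(): Track[] {
    this.isRecording = false;
    return this.recorded;
  }
}

const mic: Track = { id: "a1", kind: "audio" };
const frontCam: Track = { id: "v1", kind: "video" };
const rearCam: Track = { id: "v2", kind: "video" };
const foreign: Track = { id: "v9", kind: "video" };

const capture = new SketchStream([mic, frontCam, rearCam]);

// Single-track and multi-track recording go through the same method.
capture.record([mic], []);
const single = capture.stopRecording();
capture.record([mic], [frontCam, rearCam]);
const combined = capture.stopRecording();

// A track from outside the stream is refused.
let rejected = false;
try {
  capture.record([], [foreign]);
} catch {
  rejected = true;
}
```

The same method covers both the single-track and the multi-track case, and the membership check is what keeps a stream from recording tracks it does not own.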

If we go with Proposal 2, the justification for a track-specific record function seems harder to make in my opinion.  The recorder object should be able to take arbitrary video and audio sources.  The sources could be one track, or multiple tracks.  The MediaStreamTrackList object could be leveraged as well.

-Giri Mandyam, Qualcomm Innovation Center

From: Jim Barnett [mailto:Jim.Barnett@genesyslab.com]
Sent: Wednesday, October 10, 2012 8:28 AM
To: public-media-capture@w3.org
Subject: approaches to recording

The upshot of yesterday's discussion is that there is interest in two different approaches to recording, so I'd like  to start a discussion of them.  If we can reach consensus on one of them, we can start to write things up in more detail.

Proposal 1.   Place a recording API, similar to the one in the current proposal, on both MediaStream and MediaStreamTrack.  The Track-level API would work as in the current proposal and would be for media processing and applications that require detailed control.  The MediaStream API would record all tracks in the stream into a single  file and would be the API for at least simple recording use cases.  It would be up to the UA/container format to determine the error cases.  For example, if the application adds a  track to a MediaStream while recording is going on, the action might or might not succeed, depending on whether the container format could handle the extra track.

The advantages of this approach are: 1) it's simple to define; 2) it makes the simple cases (single video stream) easy; 3) the application has to be prepared to deal with errors (e.g. lack of disk space) during recording anyway. The disadvantage of this approach is that it makes behavior in complicated cases unpredictable. Some UAs might be able to handle the addition and removal of tracks during recording while others might not.
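
The unpredictability can be illustrated with a small TypeScript sketch in which two UAs differ only in whether their container format tolerates track changes during recording. Everything here is an assumption made for illustration: StreamSketch, ContainerSketch, and the boolean return from addTrack are hypothetical, and a real UA might raise an error or fire an event instead.

```typescript
// Hypothetical names throughout; nothing here is a real browser API.
interface ContainerSketch {
  readonly label: string;
  readonly allowsTrackChanges: boolean; // can tracks be added mid-recording?
}

class StreamSketch {
  private readonly trackIds: string[];
  private readonly container: ContainerSketch;
  private active = false;

  constructor(trackIds: string[], container: ContainerSketch) {
    this.trackIds = [...trackIds];
    this.container = container;
  }

  record(): void {
    this.active = true;
  }

  // Returns the ids of the tracks that ended up in the recording.
  stopRecording(): string[] {
    this.active = false;
    return [...this.trackIds];
  }

  // Returns false when the change is refused; the application must check.
  addTrack(id: string): boolean {
    if (this.active && !this.container.allowsTrackChanges) {
      return false;
    }
    this.trackIds.push(id);
    return true;
  }
}

const flexibleUA = new StreamSketch(["audio-1", "video-1"], {
  label: "flexible",
  allowsTrackChanges: true,
});
const rigidUA = new StreamSketch(["audio-1", "video-1"], {
  label: "rigid",
  allowsTrackChanges: false,
});

flexibleUA.record();
rigidUA.record();

// The same application call succeeds on one UA and is refused on the other.
const addedOnFlexible = flexibleUA.addTrack("video-2");
const addedOnRigid = rigidUA.addTrack("video-2");

const flexibleResult = flexibleUA.stopRecording();
const rigidResult = rigidUA.stopRecording();
```

Identical application code yields a three-track recording on one UA and a two-track recording on the other, which is the behavior the application would have to be written to tolerate.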

Proposal 2. Create a separate Recorder object that would serve as a destination/sink for  MediaStream.  (I gather that there's already been a proposal along these lines - I'd appreciate it if someone would send me a copy.)  The configuration of this object would determine what the MediaStream was allowed to do.  For example, the application would create a Recorder object designed to handle a certain number of audio and video tracks.  If the platform could not support that number, the error would be raised when the Recorder was created/configured, not once recording was underway.  If creation of the Recorder succeeds but the MediaStream attempts to exceed the configured number of audio or video tracks, either an error would be raised or the extra Tracks would be ignored.
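
As a rough TypeScript sketch of this behavior, assuming a Recorder configured with maximum audio and video track counts: creation fails if the platform cannot support the requested counts, and a stream that exceeds them is either rejected or has its extra tracks dropped. RecorderSketch, RecorderLimits, PLATFORM_MAX, and the onExcess setting are hypothetical names chosen for the example, not part of any existing proposal.

```typescript
// Hypothetical types for illustration only.
type SourceKind = "audio" | "video";

interface SourceTrack {
  readonly id: string;
  readonly kind: SourceKind;
}

interface RecorderLimits {
  readonly maxAudioTracks: number;
  readonly maxVideoTracks: number;
  readonly onExcess: "error" | "ignore";
}

// Stand-in for whatever the platform can actually record at once (assumed values).
const PLATFORM_MAX = { audio: 2, video: 1 };

class RecorderSketch {
  private readonly limits: RecorderLimits;

  constructor(limits: RecorderLimits) {
    // Fail at creation time, before any recording is underway.
    if (
      limits.maxAudioTracks > PLATFORM_MAX.audio ||
      limits.maxVideoTracks > PLATFORM_MAX.video
    ) {
      throw new Error("platform cannot support the requested track counts");
    }
    this.limits = limits;
  }

  // Returns the tracks that would be recorded from the given stream.
  attach(tracks: readonly SourceTrack[]): SourceTrack[] {
    const audio = tracks.filter((t) => t.kind === "audio");
    const video = tracks.filter((t) => t.kind === "video");
    const excess =
      audio.length > this.limits.maxAudioTracks ||
      video.length > this.limits.maxVideoTracks;
    if (excess && this.limits.onExcess === "error") {
      throw new Error("stream exceeds the configured track counts");
    }
    // Under the "ignore" policy, tracks beyond the configured counts are dropped.
    return [
      ...audio.slice(0, this.limits.maxAudioTracks),
      ...video.slice(0, this.limits.maxVideoTracks),
    ];
  }
}

const twoCameras: SourceTrack[] = [
  { id: "a1", kind: "audio" },
  { id: "v1", kind: "video" },
  { id: "v2", kind: "video" },
];

// Asking for more video tracks than the platform supports fails up front.
let creationFailed = false;
try {
  new RecorderSketch({ maxAudioTracks: 1, maxVideoTracks: 4, onExcess: "error" });
} catch {
  creationFailed = true;
}

// Same stream, two policies for the extra video track.
const lenient = new RecorderSketch({ maxAudioTracks: 1, maxVideoTracks: 1, onExcess: "ignore" });
const keptIds = lenient.attach(twoCameras).map((t) => t.id);

const strictRecorder = new RecorderSketch({ maxAudioTracks: 1, maxVideoTracks: 1, onExcess: "error" });
let attachFailed = false;
try {
  strictRecorder.attach(twoCameras);
} catch {
  attachFailed = true;
}
```

Whether excess tracks raise an error or are silently ignored is left open above; the sketch simply makes it a configuration choice so both outcomes can be seen side by side.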

It would be easy to add other features to the Recorder. For example, most recordings are stored as files. The Recorder object could be configured with a URL and automatically write the recording to that URL once it was finished. (I suppose that it could also be configured to stream the recording to the URL as it was created.) There could also be a configuration item specifying whether multiple audio tracks were to be merged or recorded separately. Most of these options should probably be provided as a Dictionary at construction time since we would not want them changed while recording was going on.
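
One way to picture construction-time options is the TypeScript sketch below, which copies and freezes the dictionary so that later changes by the application have no effect on the Recorder. ConfiguredRecorder, RecorderOptionsSketch, the url and mergeAudioTracks members, and the example URL are all hypothetical; the sketch only carries the URL and does not write or stream anything.

```typescript
// Hypothetical option names; the actual dictionary members are undecided.
interface RecorderOptionsSketch {
  url?: string;              // where the finished recording would be written
  mergeAudioTracks: boolean; // merge audio tracks, or record them separately
}

class ConfiguredRecorder {
  readonly options: Readonly<RecorderOptionsSketch>;

  constructor(options: RecorderOptionsSketch) {
    // Copy, then freeze, so later edits to the caller's dictionary have no effect.
    this.options = Object.freeze({ ...options });
  }

  // Number of audio outputs: one merged track, or one per input track.
  audioOutputCount(audioTrackCount: number): number {
    if (audioTrackCount === 0) {
      return 0;
    }
    return this.options.mergeAudioTracks ? 1 : audioTrackCount;
  }
}

const callerOptions: RecorderOptionsSketch = {
  url: "https://example.invalid/upload", // placeholder destination
  mergeAudioTracks: true,
};
const configured = new ConfiguredRecorder(callerOptions);

// The application changes its own dictionary after construction...
callerOptions.mergeAudioTracks = false;

// ...but the Recorder keeps the configuration it was created with.
const mergedCount = configured.audioOutputCount(3);

const separate = new ConfiguredRecorder({ mergeAudioTracks: false });
const separateCount = separate.audioOutputCount(3);
```

Taking a frozen copy at construction is one simple way to guarantee that nothing changes while recording is going on; exposing read-only attributes would be another.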

There would need to be a distinct interface, probably similar to the existing proposal, to handle media capture and media processing at the Track level.

One objection to this approach has come from Travis, who notes that it allows overlapping/simultaneous recording of a stream, for which he sees no good use cases.  (On the other hand, it does address another of Microsoft's concerns, in that the UA will know before recording starts how many tracks there will be.)

Any other thoughts and comments?


- Jim
Received on Wednesday, 10 October 2012 16:36:17 UTC