RE: approaches to recording from Jim Barnett on 2012-10-10 (public-media-capture@w3.org from October 2012)

From: Jim Barnett <Jim.Barnett@genesyslab.com>
Date: Wed, 10 Oct 2012 15:50:11 -0700
To: "Travis Leithead" <travis.leithead@microsoft.com>, "SULLIVAN, BRYAN L" <bs3131@att.com>
Cc: <public-media-capture@w3.org>
Message-ID: <E17CAD772E76C742B645BD4DC602CD8106CE02C9@NAHALD.us.int.genesyslab.com>
I just want to observe that lossless streaming is what we (= the contact
center and speech industry) want for talking to a speech recognition
system. It would be ideal if PeerConnection supported it. Failing
that, it would be nice if the Recorder supported it, but in a pinch we
figure that we can use the track-level API to deliver buffers of speech
data and let the JS code set up the TCP/IP connection.

 

-          Jim

 

From: Travis Leithead [mailto:travis.leithead@microsoft.com] 
Sent: Wednesday, October 10, 2012 5:36 PM
To: SULLIVAN, BRYAN L; Jim Barnett
Cc: public-media-capture@w3.org
Subject: RE: approaches to recording

 

That's what RTCPeerConnection is for, right? Or are you wanting
lossless streaming?

 

From: SULLIVAN, BRYAN L [mailto:bs3131@att.com] 
Sent: Wednesday, October 10, 2012 1:08 PM
To: Jim Barnett 
Cc: public-media-capture@w3.org
Subject: RE: approaches to recording

 

Jim,

 

Other than local recording, the most interesting part of this to me is
the ability to stream the content (pre-mixed, or as a multiplexed set of
track streams) to an external resource (URI) for recording or
processing. Realtime streaming is needed for external realtime / minimal
delay processing of the captured content. Thus Proposal 2 seems better
suited and more flexible.

 

Thanks,

Bryan Sullivan

 

	From: Jim Barnett [mailto:Jim.Barnett@genesyslab.com] 
	Sent: Wednesday, October 10, 2012 8:28 AM
	To: public-media-capture@w3.org
	Subject: approaches to recording

	 

	The upshot of yesterday's discussion is that there is interest
in two different approaches to recording, so I'd like to start a
discussion of them. If we can reach consensus on one of them, we can
start to write things up in more detail.

	 

	Proposal 1.   Place a recording API, similar to the one in the
current proposal, on both MediaStream and MediaStreamTrack.  The
Track-level API would work as in the current proposal and would be for
media processing and applications that require detailed control.  The
MediaStream API would record all tracks in the stream into a single
file and would be the API for at least simple recording use cases.  It
would be up to the UA/container format to determine the error cases.
For example, if the application adds a track to a MediaStream while
recording is going on, the action might or might not succeed, depending
on whether the container format could handle the extra track.

	 

	The advantages of this approach are: 1) it's simple to define,
2) it makes the simple cases (single video stream) easy, and 3) the
application has to be prepared to deal with errors (e.g. lack of disk
space) during recording anyway. The disadvantage of this approach is
that it makes behavior in complicated cases unpredictable. Some UAs
might be able to handle the addition and removal of tracks during
recording while others might not.

	 

	Proposal 2.   Create a separate Recorder object that would serve
as a destination/sink for a MediaStream. (I gather that there's already
been a proposal along these lines - I'd appreciate it if someone would
send me a copy.)  The configuration of this object would determine what
the MediaStream was allowed to do.  For example, the application would
create a Recorder object designed to handle a certain number of audio
and video tracks.  If the platform could not support that number, the
error would be raised when the Recorder was created/configured, not once
recording was underway.  If creation of the Recorder succeeds but the
MediaStream attempts to exceed the configured number of audio or video
tracks, either an error would be raised or the extra Tracks would be
ignored. 
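
	To make the error-timing point concrete, here is a minimal sketch of
how such a Recorder might behave. Everything in it is an assumption made
for illustration: the names Recorder, RecorderConfig, onExtraTrack and
PLATFORM_LIMITS are hypothetical and do not come from any existing
proposal, and the platform limits are hard-coded stand-ins.

```typescript
// Hypothetical sketch of Proposal 2; none of these names come from a spec.
type TrackKind = "audio" | "video";

interface Track {
  kind: TrackKind;
  id: string;
}

interface RecorderConfig {
  maxAudioTracks: number;
  maxVideoTracks: number;
  // Policy when the stream exceeds the configured counts (assumed option).
  onExtraTrack: "error" | "ignore";
}

// Assumed platform capability, hard-coded for illustration only.
const PLATFORM_LIMITS = { audio: 2, video: 1 };

class Recorder {
  private config: RecorderConfig;
  private accepted: Track[] = [];

  constructor(config: RecorderConfig) {
    // Fail at creation/configuration time, not once recording is underway.
    if (
      config.maxAudioTracks > PLATFORM_LIMITS.audio ||
      config.maxVideoTracks > PLATFORM_LIMITS.video
    ) {
      throw new Error("platform cannot record the requested number of tracks");
    }
    this.config = config;
  }

  // Returns true if the track will be recorded, false if it was ignored.
  addTrack(track: Track): boolean {
    const limit =
      track.kind === "audio"
        ? this.config.maxAudioTracks
        : this.config.maxVideoTracks;
    const current = this.accepted.filter((t) => t.kind === track.kind).length;
    if (current >= limit) {
      if (this.config.onExtraTrack === "error") {
        throw new Error("too many " + track.kind + " tracks for this Recorder");
      }
      return false; // extra track silently ignored
    }
    this.accepted.push(track);
    return true;
  }

  get recordedTrackIds(): string[] {
    return this.accepted.map((t) => t.id);
  }
}
```

	With this shape, asking for five audio tracks fails in the
constructor, while a second audio track added to a one-audio-track
Recorder is either rejected or dropped depending on the chosen policy.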

	 

	It would be easy to add other features to the Recorder. For
example, most recordings are stored as files. The Recorder object could
be configured with a URL and automatically write the recording to that
URL once it was finished. (I suppose that it could also be configured
to stream the recording to the URL as it was created.) There could
also be a configuration item specifying whether multiple audio tracks
were to be merged or recorded separately. Most of these options
should probably be provided as a Dictionary at construction time since
we would not want them changed while recording was going on.
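
	As a rough illustration of the construction-time Dictionary idea,
the options might look something like the sketch below. The field names
(url, delivery, mergeAudioTracks), the default values and the helper
makeRecorderOptions are all hypothetical, chosen only to mirror the
features listed above.

```typescript
// Hypothetical options dictionary; field names and defaults are assumptions.
interface RecorderOptions {
  // Destination for the finished or streamed recording, if any.
  url?: string;
  // Write the file once recording finishes, or stream it as it is created.
  delivery: "on-finish" | "streaming";
  // Merge multiple audio tracks into one, or record them separately.
  mergeAudioTracks: boolean;
}

const DEFAULTS: RecorderOptions = {
  delivery: "on-finish",
  mergeAudioTracks: true,
};

function makeRecorderOptions(
  init: Partial<RecorderOptions> = {}
): Readonly<RecorderOptions> {
  const options: RecorderOptions = { ...DEFAULTS, ...init };
  if (options.delivery === "streaming" && options.url === undefined) {
    throw new Error("streaming delivery needs a destination URL");
  }
  // Freeze the dictionary so it cannot change while recording is going on.
  return Object.freeze(options);
}
```

	Freezing the dictionary is one simple way to get the "not changed
while recording" property; a real API could equally copy the values into
the Recorder at construction time.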

	 

	There would need to be a distinct interface, probably similar to
the existing proposal, to handle media capture and media processing at
the Track level.  

	 

	One objection to this approach has come from Travis, who notes
that it allows overlapping/simultaneous recording of a stream, for which
he sees no good use cases.  (On the other hand, it does address another
of Microsoft's concerns, in that the UA will know before recording
starts how many tracks there will be.)

	 

	Any other thoughts and comments?

	 

	-          Jim
Received on Wednesday, 10 October 2012 22:51:24 UTC