RE: approaches to recording from Young, Milan on 2012-11-12 (public-media-capture@w3.org from November 2012)

From: Young, Milan <Milan.Young@nuance.com>
Date: Mon, 12 Nov 2012 18:21:04 +0000
To: "Cullen Jennings (fluffy)" <fluffy@cisco.com>, "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <B236B24082A4094A85003E8FFB8DDC3C1A4BF68E@SOM-EXCH04.nuance.com>
Since my last message, I have learned that the TURN protocol builds a tunnel between the browser and the TURN server.  A TCP tunnel would thus ensure reliable delivery on that leg even if the browser is natively transmitting media via RTP.

But for us to count this as a practical solution, we'd need to know that browsers intend to support TCP as a transport to the TURN server.  If the expected stacks will only support UDP transport, for example, then we shouldn't count this proposal as addressing our "remote transcription" use case.

I'd appreciate hearing from vendors on this issue.
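
To make the question concrete, here is a minimal sketch of the kind of configuration I have in mind. It assumes a configuration dictionary shaped like the one WebRTC uses for RTCPeerConnection (an `iceServers` list plus a relay-only `iceTransportPolicy`) and a TURN URI carrying a `transport=tcp` parameter. The helper name `tcpRelayOnlyConfig`, the host `turn.example.org`, and the credentials are all hypothetical, and whether a given browser honors these settings is exactly what I am asking vendors.

```typescript
// Sketch only: these shapes mirror the WebRTC configuration dictionary but are
// declared locally so the snippet compiles without browser (DOM) typings.
interface IceServer {
  urls: string[];
  username?: string;
  credential?: string;
}

interface RelayOnlyConfig {
  iceServers: IceServer[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical helper: builds a configuration whose only ICE server is a TURN
// relay reached over TCP, with direct (non-relayed) candidates disallowed.
function tcpRelayOnlyConfig(
  host: string,
  port: number,
  username: string,
  credential: string
): RelayOnlyConfig {
  return {
    iceServers: [
      {
        // "?transport=tcp" asks the client to talk to the relay over TCP.
        urls: [`turn:${host}:${port}?transport=tcp`],
        username,
        credential,
      },
    ],
    // "relay" restricts ICE to relayed candidates, so media must traverse TURN.
    iceTransportPolicy: "relay",
  };
}

// Placeholder host and credentials, for illustration only.
const config = tcpRelayOnlyConfig("turn.example.org", 3478, "user", "secret");
console.log(JSON.stringify(config, null, 2));
```

In a browser that supports it, the resulting object would be passed to the RTCPeerConnection constructor. Note that this only makes the browser-to-relay leg reliable; what the relay does on the far side is a separate matter.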

Thank you  



> -----Original Message-----
> From: Young, Milan [mailto:Milan.Young@nuance.com]
> Sent: Monday, October 29, 2012 1:25 AM
> To: Cullen Jennings (fluffy); public-media-capture@w3.org
> Subject: RE: approaches to recording
> 
> Cullen, I would appreciate your elaboration on this suggestion.  In particular, I
> understand once data reaches the TURN server it might retransmit with any
> protocol, but I don't understand how the data reaches the server in a reliable
> manner.
> 
> Are you saying the TURN server can somehow demand retransmission of lost
> RTP packets originating from the browser client?  Can the server force the
> client to submit RTP-like data chunks over TCP?
> 
> Thank you
> 
> 
> > -----Original Message-----
> > From: Cullen Jennings (fluffy) [mailto:fluffy@cisco.com]
> > Sent: Saturday, October 27, 2012 11:50 PM
> > To: public-media-capture@w3.org
> > Subject: Re: approaches to recording
> >
> >
> > If the only way to reach your voice recognition engine is via a TURN
> > TCP relay, then PeerConnection ends up being reliable ... I have seen
> > this hack deployed with SIP devices. I'm not saying I like it, but it does work.
> >
> >
> > On Oct 11, 2012, at 10:02 , Stefan Hakansson LK
> > <stefan.lk.hakansson@ericsson.com> wrote:
> >
> > > Just a thought:
> > >
> > > - Would not the simplest solution (at least from an API perspective)
> > > be to add a reliable mode to MediaStreams when added to
> > > PeerConnection? Then MTI codecs etc. would already be discussed.
> > > > Datachannels in PeerConnection can already be set up in reliable
> > > and unreliable mode, so it would fit quite nicely. (I don't know how
> > > much extra work it would be on the protocol/ietf side though.)
> > >
> > > Stefan
> > >
> > >
> > > On 10/11/2012 08:15 AM, Young, Milan wrote:
> > >> Yes, the speech industry prefers reliable transports, but an even
> > >> more essential request is encoded audio.  It wasn't clear to me
> > >> whether the "configurations" mentioned below would support the
> > >> ability to specify a codec.  If that ability is planned, what are
> > >> the thoughts in aligning our MTI recommendations with WebRTC?
> > >>
> > >> -Milan
> > >>
> > >> *From:*Jim Barnett [mailto:Jim.Barnett@genesyslab.com]
> > >> *Sent:* Wednesday, October 10, 2012 3:50 PM
> > >> *To:* Travis Leithead; SULLIVAN, BRYAN L
> > >> *Cc:* public-media-capture@w3.org
> > >> *Subject:* RE: approaches to recording
> > >>
> > >> I just want to observe that lossless streaming is what we (= the
> > >> contact center and speech industry) want for  talking to a speech
> > >> recognition system.  It would be ideal if PeerConnection supported
> > >> it.  Failing that, it would be nice if the Recorder supported it,
> > >> but in a pinch we figure that we can use the track-level API to
> > >> deliver buffers of speech data and let the JS code set up the
> > >> TCP/IP connection.
> > >>
> > >> -Jim
> > >>
> > >> *From:*Travis Leithead [mailto:travis.leithead@microsoft.com]
> > >> *Sent:* Wednesday, October 10, 2012 5:36 PM
> > >> *To:* SULLIVAN, BRYAN L; Jim Barnett
> > >> *Cc:* public-media-capture@w3.org
> > >> *Subject:* RE: approaches to recording
> > >>
> > >> That's what RTCPeerConnection is for, right? Or are you wanting
> > >> loss-less streaming?
> > >>
> > >> *From:*SULLIVAN, BRYAN L [mailto:bs3131@att.com]
> > >> *Sent:* Wednesday, October 10, 2012 1:08 PM
> > >> *To:* Jim Barnett
> > >> *Cc:* public-media-capture@w3.org
> > >> *Subject:* RE: approaches to recording
> > >>
> > >> Jim,
> > >>
> > >> Other than local recording, the most interesting part of this to me
> > >> is the ability to stream the content (pre-mixed, or as a
> > >> multiplexed set of track streams) to an external resource (URI) for
> > >> recording or processing. Realtime streaming is needed for external
> > >> realtime / minimal delay processing of the captured content. Thus
> > >> the proposal 2 seems more suited and flexible.
> > >>
> > >> Thanks,
> > >>
> > >> Bryan Sullivan
> > >>
> > >>    *From:*Jim Barnett [mailto:Jim.Barnett@genesyslab.com]
> > >>    *Sent:* Wednesday, October 10, 2012 8:28 AM
> > >>    *To:* public-media-capture@w3.org
> > >>    *Subject:* approaches to recording
> > >>
> > >>    The upshot of yesterday's discussion is that there is interest in
> > >>    two different approaches to recording, so I'd like  to start a
> > >>    discussion of them.  If we can reach consensus on one of them, we
> > >>    can start to write things up in more detail.
> > >>
> > >>    /_Proposal 1._/   Place a recording API, similar to the one in the
> > >>    current proposal, on both MediaStream and MediaStreamTrack.  The
> > >>    Track-level API would work as in the current proposal and would be
> > >>    for media processing and applications that require detailed
> > >>    control.  The MediaStream API would record all tracks in the stream
> > >>    into a single  file and would be the API for at least simple
> > >>    recording use cases.  It would be up to the UA/container format to
> > >>    determine the error cases.  For example, if the application adds a
> > >>    track to a MediaStream while recording is going on, the action might
> > >>    or might not succeed, depending on whether the container format
> > >>    could handle the extra track.
> > >>
> > >>    The advantages of this approach are: 1) it's simple to define; 2)
> > >>    it makes the simple cases (single video stream) easy; 3) the
> > >>    application has to be prepared to deal with errors (e.g. lack of
> > >>    disk space) during recording anyway.  The disadvantage of this
> > >>    approach is that it makes behavior in complicated cases
> > >>    unpredictable.  Some UAs might be able to handle the addition and
> > >>    removal of tracks during recording while others might not.
> > >>
> > >>    /_Proposal 2._/   Create a separate Recorder object that would serve
> > >>    as a destination/sink for  MediaStream.  (I gather that there's
> > >>    already been a proposal along these lines - I'd appreciate it if
> > >>    someone would send me a copy.) The configuration of this object
> > >>    would determine what the MediaStream was allowed to do.  For
> > >>    example, the application would create a Recorder object designed to
> > >>    handle a certain number of audio and video tracks.  If the platform
> > >>    could not support that number, the error would be raised when the
> > >>    Recorder was created/configured, not once recording was underway.
> > >>    If creation of the Recorder succeeds but the MediaStream attempts to
> > >>    exceed the configured number of audio or video tracks, either an
> > >>    error would be raised or the extra Tracks would be ignored.
> > >>
> > >>    It would be easy to add other features to the Recorder.  For
> > >>    example, most recordings are stored as files.  The Recorder object
> > >>    could be configured with a URL and automatically write the
> > >>    recording to that URL once it was finished.  (I suppose that it
> > >>    could also be configured to stream the recording to the URL as it
> > >>    was created.)  There could  also be a configuration item specifying
> > >>    whether multiple audio tracks were to be merged or recorded
> > >>    separately.    Most of these options should  probably be provided as
> > >>    a Dictionary at construction time since we would not want them
> > >>    changed while recording was going on.
> > >>
> > >>    There would need to be a distinct interface, probably similar to the
> > >>    existing proposal, to handle media capture and media processing at
> > >>    the Track level.
> > >>
> > >>    One objection to this approach has come from Travis, who notes that
> > >>    it allows overlapping/simultaneous recording of a stream, for which
> > >>    he sees no good use cases.  (On the other hand, it does address
> > >>    another of Microsoft's concerns, in that the UA will know before
> > >>    recording starts how many tracks there will be.)
> > >>
> > >>    Any other thoughts and comments?
> > >>
> > >>    -Jim
> > >>
> > >
> > >
> >
>
Received on Monday, 12 November 2012 18:21:50 UTC