Lossless modes (Re: approaches to recording) from Harald Alvestrand on 2012-10-15 (public-media-capture@w3.org from October 2012)

From: Harald Alvestrand <harald@alvestrand.no>
Date: Mon, 15 Oct 2012 11:22:33 +0200
To: Jim Barnett <Jim.Barnett@genesyslab.com>
CC: public-media-capture@w3.org
Message-ID: <507BD5D9.1080602@alvestrand.no>

On 10/12/2012 05:39 PM, Jim Barnett wrote:
>
> Harald,
>
> The lack of real-time delivery is not normally an issue for speech 
> recognition systems, because they run many times faster than real 
> time, and can catch up quickly once the data is available.  So if the 
> delays are short enough, the user will not perceive them.  And if the 
> delays are longer, well... then speech recognition will take a long 
> time.  People are used to stuff being slow on the internet, aren't they?
>

Changing the subject line, because this is a very different topic:

The problem with "just" adding "lossless mode" to a MediaStream 
attachment to a PeerConnection is that it requires replacing the whole 
protocol stack underneath that transport - the idea of 
somewhat-unreliable, but always-reasonably-fast, transmission is deeply 
embedded into the RTP/UDP protocol suite.

I don't even want to propose that the IETF take on defining a 
corresponding protocol suite at this time. It's MUCH simpler (seen from 
my side, as the one running back and forth between W3C and IETF) to 
define a local recording format that doesn't lose any bits, but also 
has no fast-delivery expectations.
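
To make the "doesn't lose any bits, but has no fast-delivery expectations" idea concrete, here is a rough sketch of what the application side could look like. Everything in it is an assumption for illustration: `LosslessRecorderQueue`, `Chunk` and `SendFn` are hypothetical names, not part of any W3C or IETF API, and the transport is reduced to a callback that reports success or failure.

```typescript
// Hypothetical sketch: none of these names come from a real recording API.
// A recorded chunk carries a sequence number so the receiver can check ordering.
type Chunk = { seq: number; data: Uint8Array };

// Stand-in for any reliable-but-slow transport attempt (for example an upload
// over TCP). Returns true if the chunk was accepted, false if the attempt failed.
type SendFn = (chunk: Chunk) => boolean;

class LosslessRecorderQueue {
  private pending: Chunk[] = [];
  private nextSeq = 0;

  // Store a chunk locally. Nothing is ever dropped, however far delivery lags.
  record(data: Uint8Array): void {
    this.pending.push({ seq: this.nextSeq++, data });
  }

  // Try to deliver pending chunks in order. Stops at the first failure so that
  // ordering is preserved; the failed chunk stays queued for the next flush.
  // Returns the number of chunks delivered by this call.
  flush(send: SendFn): number {
    let delivered = 0;
    while (this.pending.length > 0) {
      const head = this.pending[0];
      if (head === undefined || !send(head)) break;
      this.pending.shift();
      delivered++;
    }
    return delivered;
  }

  // Number of chunks recorded but not yet delivered.
  get backlog(): number {
    return this.pending.length;
  }
}
```

The point of the sketch is the trade-off: a failed send only grows the backlog and delays delivery, which is exactly what a real-time RTP/UDP path cannot allow.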

What does the current contact center and speech industry do when faced 
with SIP telephone systems?

> -Jim
>
> *From:* Harald Alvestrand [mailto:harald@alvestrand.no]
> *Sent:* Friday, October 12, 2012 11:35 AM
> *To:* public-media-capture@w3.org
> *Subject:* Re: approaches to recording
>
> On 10/11/2012 12:50 AM, Jim Barnett wrote:
>
>     I just want to observe that lossless streaming is what we (= the
>     contact center and speech industry) want for talking to a speech
>     recognition system.  It would be ideal if PeerConnection supported
>     it.  Failing that, it would be nice if the Recorder supported it,
>     but in a pinch we figure that we can use the track-level API to
>     deliver buffers of speech data and let the JS code set up the
>     TCP/IP connection.
>
> Of course lossless streaming (truly guaranteed delivery) implies 
> non-real-time streaming (or, more formally, having to deal with the 
> possibility that delivery will be delayed beyond real-time), given 
> that the Internet is a lossy medium.
>
> To another thread: Yes, having the constructor for the recorder take a 
> MIME type parameter would imply that you set the codec to be used. I 
> think we all agree that the data coming out of a recording interface 
> is encoded.
>
>            Harald
>
>

Received on Monday, 15 October 2012 09:23:03 UTC