- From: Jim Barnett <Jim.Barnett@genesyslab.com>
- Date: Mon, 15 Oct 2012 06:24:12 -0700
- To: "Harald Alvestrand" <harald@alvestrand.no>
- Cc: <public-media-capture@w3.org>
- Message-ID: <E17CAD772E76C742B645BD4DC602CD8106CE0522@NAHALD.us.int.genesyslab.com>
Harald,

Yes, if this complicates the stack, it won't be worth the effort. We will use an asynch API that delivers buffers of data (of configurable size) as they are available. That's somewhat different from the/a common recording case, where you just want the whole Blob when it's done (or maybe you want it written out to file without the JS code ever seeing it.) This asynch API is the same one we'd use for real-time media processing as well (for example, drawing a box around the bouncing ball). There seems to be some disagreement about whether this API is part of the recording API or a separate one.

- Jim

From: Harald Alvestrand [mailto:harald@alvestrand.no]
Sent: Monday, October 15, 2012 5:23 AM
To: Jim Barnett
Cc: public-media-capture@w3.org
Subject: Lossless modes (Re: approaches to recording)

On 10/12/2012 05:39 PM, Jim Barnett wrote:

Harald,

The lack of real-time delivery is not normally an issue for speech recognition systems, because they run many times faster than real time, and can catch up quickly once the data is available. So if the delays are short enough, the user will not perceive them. And if the delays are longer, well... then speech recognition will take a long time. People are used to stuff being slow on the internet, aren't they?

Changing the subject, because this is a very different subject: The problem with "just" adding "lossless mode" to a MediaStream attachment to a PeerConnection is that it requires replacing the whole protocol stack underneath that transport - the idea of somewhat-unreliable, but always-reasonably-fast, transmission is deeply embedded into the RTP/UDP protocol suite. I don't even want to propose that the IETF takes on defining a corresponding protocol suite at this time.

It's MUCH simpler (seen from my side as running-back-and-forth-between-W3C-and-IETF) to define a local recording format that doesn't lose any bits, but also has no fast-delivery expectations.
What does the current contact center and speech industry do when faced with SIP telephone systems?

- Jim

From: Harald Alvestrand [mailto:harald@alvestrand.no]
Sent: Friday, October 12, 2012 11:35 AM
To: public-media-capture@w3.org
Subject: Re: approaches to recording

On 10/11/2012 12:50 AM, Jim Barnett wrote:

I just want to observe that lossless streaming is what we (= the contact center and speech industry) want for talking to a speech recognition system. It would be ideal if PeerConnection supported it. Failing that, it would be nice if the Recorder supported it, but in a pinch we figure that we can use the track-level API to deliver buffers of speech data and let the JS code set up the TCP/IP connection.

Of course lossless streaming (truly guaranteed delivery) implies non-real-time streaming (or, more formally, having to deal with the possibility that delivery will be delayed beyond real-time), given that the Internet is a lossy medium.

To another thread: Yes, having the constructor for the recorder take a MIME type parameter would imply that you set the codec to be used. I think we all agree that the data coming out of a recording interface is encoded.

Harald
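
For readers of the archive, the "buffers of data (of configurable size) as they are available" idea discussed in this thread can be sketched roughly as below. This is only an illustration: the class name BufferedDelivery, its methods, and the callback type are all hypothetical, and nothing here is taken from a spec or from what the thread's authors proposed. It shows encoded bytes being accumulated and handed to a callback in fixed-size chunks, with a final flush so that no bytes are lost at the end of a recording.

```typescript
// Hypothetical sketch: none of these names come from the thread or any spec.
type BufferCallback = (chunk: Uint8Array) => void;

class BufferedDelivery {
  private pending: number[] = [];
  private chunkSize: number;
  private onBuffer: BufferCallback;

  // chunkSize is the configurable buffer size, in bytes.
  constructor(chunkSize: number, onBuffer: BufferCallback) {
    if (chunkSize <= 0) {
      throw new Error("chunkSize must be positive");
    }
    this.chunkSize = chunkSize;
    this.onBuffer = onBuffer;
  }

  // Called whenever the (assumed) source produces more encoded data.
  push(data: Uint8Array): void {
    for (let i = 0; i < data.length; i++) {
      this.pending.push(data[i]);
    }
    // Deliver as many full-size buffers as are currently available.
    while (this.pending.length >= this.chunkSize) {
      this.onBuffer(Uint8Array.from(this.pending.splice(0, this.chunkSize)));
    }
  }

  // Called at end of recording: deliver the remainder, even if short.
  flush(): void {
    if (this.pending.length > 0) {
      this.onBuffer(Uint8Array.from(this.pending.splice(0)));
    }
  }
}
```

A consumer that wants the "whole Blob when it's done" case could simply collect every chunk in the callback and concatenate after flush(), while a consumer talking to a speech recognizer could forward each chunk over its own reliable connection as it arrives.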
Received on Monday, 15 October 2012 13:25:54 UTC