
RE: MediaStream Recording : record() method

From: Mandyam, Giridhar <mandyam@quicinc.com>
Date: Wed, 28 Aug 2013 21:01:10 +0000
To: Martin Thomson <martin.thomson@gmail.com>
CC: Harald Alvestrand <harald@alvestrand.no>, "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <CAC8DBE4E9704C41BCB290C2F3CC921A1653E995@nasanexd01h.na.qualcomm.com>

Hi Martin,
Thanks for the response, but I still don't understand the point being made.

Maybe I should restate the original point I was hoping to make: in the absence of a TCP-based PC, the most suitable option for meeting the RT speech recognition use case was timesliced data being returned from record().  This has been discussed before on this list - see http://lists.w3.org/Archives/Public/public-media-capture/2012Nov/0076.html.
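For reference, the timesliced pattern under discussion might look roughly like the following sketch.  This assumes the shape the API eventually shipped with (a timeslice argument on start() rather than record()), a MediaStream obtained elsewhere via getUserMedia, and a hypothetical /asr recognizer endpoint - none of which this thread specifies:

```javascript
// Skip empty chunks, which some implementations emit at stop() time.
function shouldUpload(chunk) {
  return Boolean(chunk) && chunk.size > 0;
}

// Guarded so the sketch is inert outside a browser.
if (typeof MediaRecorder !== 'undefined') {
  // `stream` is assumed to be a MediaStream from getUserMedia.
  const recorder = new MediaRecorder(stream, { mimeType: 'audio/webm' });

  recorder.ondataavailable = (e) => {
    if (shouldUpload(e.data)) {
      // Post each slice to a (hypothetical) recognizer as it arrives,
      // rather than buffering the whole recording.
      fetch('/asr', { method: 'POST', body: e.data });
    }
  };

  // A 250 ms timeslice fires dataavailable roughly four times per
  // second, each event carrying the audio captured since the last.
  recorder.start(250);
}
```

The point of the timeslice is exactly what the use case needs: the recognizer sees data with bounded latency instead of waiting for stop().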

> Harald's point was that a real-time application will observe the delays resulting from packet loss and adjust what it sends accordingly.

If the argument is that the existing UDP-based PC is sufficient to achieve loss resilience for RT speech via app-layer codec rate adaptation, I am not completely sold.  RT speech, where data rates can already be relatively low (e.g. ~4 kbps for EVRC-B encoding, if supported by the platform), may not have much room to go lower.  But I will defer to you on this topic, as your experience with Skype may have proven otherwise.

>  A straight media recorder might just let the buffers back up a little more.

> Again, the media recorder is going to buffer until it runs out of space.

Do you mean if the MediaRecorder implementation ignores the timeslice?  If the implementation is honoring the timeslice argument, why would it buffer continuously?

-Giri

-----Original Message-----
From: Martin Thomson [mailto:martin.thomson@gmail.com] 
Sent: Wednesday, August 28, 2013 1:38 PM
To: Mandyam, Giridhar
Cc: Harald Alvestrand; public-media-capture@w3.org
Subject: Re: MediaStream Recording : record() method

On 28 August 2013 13:28, Mandyam, Giridhar <mandyam@quicinc.com> wrote:
> I think it depends on what you consider “lossless”, or at least 
> sufficiently high quality for RT speech recognition (which was the 
> main use case for time-slicing on the record method).

Harald's point was that a real-time application will observe the delays resulting from packet loss and adjust what it sends accordingly.  A straight media recorder might just let the buffers back up a little more.  One optimizes for latency, the other doesn't.

It gets more interesting when you get a routing flap or some other less-time-constrained problem.  That's when you might see a real-time implementation crank the send rate to zero.  Again, the media recorder is going to buffer until it runs out of space.

Both scenarios are probably completely workable for an application that doesn't regard latency as paramount, but they will have different characteristics that - in some cases - will surface.  Not all abstractions are equal, unfortunately.
Received on Wednesday, 28 August 2013 21:01:43 UTC
