RE: revised recording proposal from Mandyam, Giridhar on 2012-12-01 (public-media-capture@w3.org from December 2012)

From: Mandyam, Giridhar <mandyam@quicinc.com>
Date: Sat, 1 Dec 2012 13:40:41 +0000
To: "Mandyam, Giridhar" <mandyam@quicinc.com>, Harald Alvestrand <harald@alvestrand.no>, "public-media-capture@w3.org" <public-media-capture@w3.org>
Message-ID: <CAC8DBE4E9704C41BCB290C2F3CC921A163584EA@nasanexd01h.na.qualcomm.com>
Sorry - I missed a critical step.  In step 5 below, the app has to parse the media container and decode the encoded data before it can combine the pre-pause and post-pause recordings.  That does constitute "media processing".  It seems rather cumbersome to have to deal with media containers and encode/decode operations in JavaScript.  Maybe we should consider adding pause() and resume() to the Recording API, so that when pause() is invoked, whatever media has been recorded up to that point can be returned to the application.

-----Original Message-----
From: Mandyam, Giridhar [mailto:mandyam@quicinc.com] 
Sent: Friday, November 30, 2012 12:16 PM
To: Harald Alvestrand; public-media-capture@w3.org
Subject: RE: revised recording proposal

I don't think this requirement results in any media processing or manipulation of the data returned from the recording function, and I don't think that fulfilling it is helped by having the Recording API return timesliced data.  Let me refer back to what I assume was the scenario that inspired this requirement (http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html#conference-call-product-debate-multiple-conversations-and-capture-review):

"... Amanda misses several of the participant's discussion points in the minutes. She calls for a point of order, and requests that the participants wait while she catches up. Amanda pauses the recording, rewinds it by thirty seconds, and then re-plays it in order to catch the parts of the debate that she missed in the minutes. When done, she resumes the recording and the meeting continues. Toward the end of the meeting, one field agent leaves early and his call is terminated."

This would involve the following steps with Recording API AFAICT:

1. App invokes stopRecording (no pauseRecording in the current API).  Blob is returned to the ondataavailable callback.  ArrayBuffer is created from Blob using FileReader.  Typed array A is created from ArrayBuffer.
2. Amanda needs to play back the video.  App sets video.src = Blob.url.  For UAs that don't support Blob.url, the app may have to write out a file (FileSaver, I guess).
3. Amanda has caught up.  She "resumes" recording, which involves a fresh call to MediaRecorder.record.
4. Call is over and stopRecording is invoked.  Blob is returned.  In order to create a single recording, the post-pause recording must be appended to the pre-pause recording.  The current Blob interface does not have an append method, so typed array A from step 1 must be concatenated with a second typed array created from the new Blob.
5. A new Blob is created from the combined typed array, and that Blob can be written to a file or used for whatever else Amanda can do with the web app.

If my steps are correct (big if, of course), then the closest thing I can see to media processing having taken place is the appending of pre-pause and post-pause recordings.   But my interpretation could be incorrect as well.
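
For illustration, the append in steps 4 and 5 might look roughly like the TypeScript sketch below.  The helper name appendRecordings is hypothetical (it is not part of the Recording API), and the sample bytes are stand-ins for the typed arrays that would be read out of the two Blobs via FileReader.  The sketch only concatenates bytes; it does no container parsing or decoding, so the result is not guaranteed to be a playable single recording.

```typescript
// Hypothetical helper, not part of the Recording API: byte-level append of
// the pre-pause and post-pause recordings into one new typed array.
function appendRecordings(prePause: Uint8Array, postPause: Uint8Array): Uint8Array {
  const combined = new Uint8Array(prePause.length + postPause.length);
  combined.set(prePause, 0);                // typed array A from step 1
  combined.set(postPause, prePause.length); // typed array from the second Blob
  return combined;
}

// Stand-in bytes; in the app these would come from the two recorded Blobs.
const prePauseBytes = new Uint8Array([1, 2, 3]);
const postPauseBytes = new Uint8Array([4, 5]);
const combinedBytes = appendRecordings(prePauseBytes, postPauseBytes);
console.log(combinedBytes.length); // 5
```

The combined typed array would then be handed to the Blob constructor to produce the final Blob in step 5.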

-----Original Message-----
From: Harald Alvestrand [mailto:harald@alvestrand.no] 
Sent: Friday, November 30, 2012 11:17 AM
To: public-media-capture@w3.org
Subject: Re: revised recording proposal

On 11/29/2012 10:05 PM, Mandyam, Giridhar wrote:
> Please point out the requirements in http://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html that state that media processing be built into the recording function.
RM13.
>
> -----Original Message-----
> From: Timothy B. Terriberry [mailto:tterriberry@mozilla.com]
> Sent: Thursday, November 29, 2012 1:03 PM
> To: public-media-capture@w3.org
> Subject: Re: revised recording proposal
>
> Mandyam, Giridhar wrote:
>> I am sorry - I don't believe a recording API should be used to enable
>> real-time processing.  I certainly do not think it should be used for any
>
> Well, this is the use case that Jim, Milan, and probably others are actually interested in (myself included), so I believe you may be in the minority in your belief. The current proposal suggests that both this use case and the file-at-once use case have a lot in common, and we'd be foolish not to take advantage of that.
>
>> audio stream processing for ASR.  This is what WebAudio is for, and we should work with the Audio WG if their current specification is unsuitable for what you believe is required for speech recognition.  But we have a call next week - maybe we can discuss this further during that time.
>
> Encoding/decoding of audio belongs at the end-points of any processing graph, i.e., in MediaStreams, which are the domain of _this_ Task Force.
> To say nothing of the fact that a solution that only works for audio is pretty poor. But you can go venue shopping if you want. Let us know how that works out for you.
>
>
Received on Saturday, 1 December 2012 13:41:14 UTC