- From: Chia-Hung Tai <ctai@mozilla.com>
- Date: Thu, 30 Jul 2015 13:12:19 +0800
- To: Mathieu Hofman <Mathieu.Hofman@citrix.com>
- Cc: "robert@ocallahan.org" <robert@ocallahan.org>, "public-webrtc@w3.org" <public-webrtc@w3.org>, "public-media-capture@w3.org" <public-media-capture@w3.org>
- Message-ID: <CACBucrHmR2WicOBv5MjA8_2vXv6Jih1mvG9yDJwqf26J9S31pQ@mail.gmail.com>
Hi, Mathieu,
A quick reply: the use case does seem under-specified currently. That will
be supported by another piece, OfflineMediaContext[1], which we are going
to address in the next step. Basically, this specification is one piece of
the puzzle in a project called FoxEye[1]. You might want to check it out
first.

[1]: https://wiki.mozilla.org/Project_FoxEye#OfflineMediaContext:

BR,
CTai

2015-07-30 12:27 GMT+08:00 Mathieu Hofman <Mathieu.Hofman@citrix.com>:
> I might have missed the "video processor" use case of the application
> wanting to receive every single frame from the source without skipping any.
> But this use case seems under-specified currently. In your proposal, the
> decision to skip frames seems to be left to the implementation:
>
> "Ideally the MediaStreamTrack should dispatch each video frame through
> VideoProcessorEvent
> <http://chiahungtai.github.io/mediacapture-worker/#idl-def-VideoProcessorEvent>.
> But sometimes the worker thread could not process the frame in time. So the
> implementation could skip the frame to avoid high memory footprint. In such
> case, we might not be able to process every frame in a real time
> MediaStream."
>
> If you want to support apps that are guaranteed delivery of every single
> frame, you need to make sure the processor implementation queues a frame
> event for every new frame generated. But with a pure push mechanism, that
> creates issues for apps that would like to skip frames. The app's JS worker
> would need some way to drain the queue of frame events and process only the
> last one. Without the worker knowing whether there are any frames left in
> the queue, the skipping can become pretty convoluted (one solution would
> be: in each frame event handler invocation, save the "latest" frame and
> call setTimeout(callback, 0); start processing the "latest" frame in the
> setTimeout callback).
> This complexity is another reason why a built-in back-pressure channel is
> beneficial.
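The "save latest + setTimeout" skipping pattern described above can be sketched as follows. `FakeProcessor` and the `videoframe` event name are hypothetical stand-ins for the proposed processor, included only so the snippet is self-contained and runnable:

```javascript
// Illustrative sketch of skip-to-latest under a pure push model.
// FakeProcessor is a stand-in (not a real API): it just lets us fire
// synthetic frame events synchronously.
class FakeProcessor {
  constructor() { this.handlers = []; }
  addEventListener(type, fn) { this.handlers.push(fn); }
  emitFrame(frame) { this.handlers.forEach(fn => fn({ frame })); }
}

const processed = [];   // frames we actually processed
let latest = null;      // most recent frame seen
let scheduled = false;  // is a processing pass already queued?

const processor = new FakeProcessor();
processor.addEventListener('videoframe', event => {
  latest = event.frame;        // remember only the newest frame
  if (!scheduled) {
    scheduled = true;
    // Defer until the event queue drains, so a burst of frame events
    // collapses into a single processing pass on the latest frame.
    setTimeout(() => {
      scheduled = false;
      processed.push(latest);  // "process" the latest frame
    }, 0);
  }
});

// Three frames arrive faster than we can process them:
processor.emitFrame('frame-1');
processor.emitFrame('frame-2');
processor.emitFrame('frame-3');
// After the event loop turns, only the latest frame has been processed.
```

As the message notes, the worker never knows how many frame events are still queued, which is what makes this workaround necessary in the first place.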
> A solution here would be to add the ability to pause/resume the
> processor's generation of frame events. An app that wants to skip frames
> would pause the processor when starting to process a frame, and resume it
> when done.
>
> The complexity would be reversed with an async pull mechanism. The frame
> skipping is obvious in this case:
>
> function processNext() {
>   return processor.requestFrame().then(processFrame).then(processNext);
> }
> processNext();
>
> In this case, the worker only gets a new frame when it's done with the
> previous one. If the requestFrame() function of the processor is
> specified to only deliver frames that haven't been delivered by this
> processor object before, you would always get a "new" frame. If the source
> is real-time, we would most likely want the processor to internally cache
> the "latest" frame and keep a flag recording whether it has been delivered
> yet or not.
>
> Receiving all frames of a real-time source would be more convoluted with a
> pure async pull mechanism:
>
> var queue = [];
> var processing = false;
> function queueFrame(frameEvent) {
>   if (frameEvent) queue.push(frameEvent);
>   processor.requestFrame().then(queueFrame);
>   if (!processing) processNext();
> }
> function processNext() {
>   if (processing || queue.length == 0) return;
>   processing = true;
>   processFrame(queue.shift()).then(function() {
>     processing = false;
>     processNext();
>   });
> }
> queueFrame();
>
> As I said, convoluted, but if the processFrame function is asynchronous
> and yields back to the event loop frequently enough, this code should get
> and queue every frame generated by the real-time source.
> To solve this complexity, maybe the processor could be constructed with a
> "caching strategy", telling it to either cache all frames of real-time
> sources, or skip unneeded frames.
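The pause()/resume() idea above is only described in prose; a minimal sketch follows. The pause/resume methods are hypothetical (they are not in the current proposal), and `FakeProcessor` stands in for the real processor by simply dropping frames while paused:

```javascript
// Sketch of push delivery with a hypothetical pause()/resume()
// back-pressure API. FakeProcessor is a stand-in, not a real API.
class FakeProcessor {
  constructor() { this.paused = false; this.onvideoframe = null; }
  pause()  { this.paused = true; }
  resume() { this.paused = false; }
  emitFrame(frame) {
    if (!this.paused && this.onvideoframe) this.onvideoframe({ frame });
  }
}

const processed = [];
const processor = new FakeProcessor();

function processFrame(frame) {
  // Simulate asynchronous per-frame work.
  return Promise.resolve().then(() => { processed.push(frame); });
}

processor.onvideoframe = event => {
  processor.pause();   // stop frame delivery while we are busy
  processFrame(event.frame).then(() => processor.resume());
};

processor.emitFrame('a'); // accepted; the handler pauses the processor
processor.emitFrame('b'); // dropped: still processing 'a'
// Once 'a' finishes, delivery resumes and later frames are accepted again.
```

This gives a skipping app the same effect as the async pull loop, without it having to reason about how many events are queued.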
>
> Now from what I understand, an API based on a ReadableStream [1] might
> not solve all these use cases either:
> Screen sharing as a pull source, or a push source with back-pressure
> support, would be trivial to support.
> A push source with no back-pressure support (real-time webcam) could
> simply enqueue frames in the readable stream. If the app wants to be able
> to consume every single frame, there is no problem. If the app wants to
> skip frames, draining the stream until the "latest" frame might be an
> issue, since the app has no way to know whether a read from the stream
> will pull from the queue or wait for the source.
> An alternative would be to require the app to implement a writable stream
> sink and create a writable stream passed to the processor?
>
> To sum up, I agree that none of the originally suggested solutions seem
> to solve all the use cases.
> I think there are 3 approaches from here:
> - Open issues against the Streams spec to improve support for "skippable
> streams" and use an API based on Streams
> - Use an async pull mechanism and add a "skipFrame" boolean option to the
> processor's constructor
> - Use a push mechanism and add pause()/resume() operations to the
> processor for back-pressure support
>
> Did I miss anything?
>
> Mathieu
>
> [1] https://streams.spec.whatwg.org/#rs
>
> ------------------------------
> *From:* Chia-Hung Tai [ctai@mozilla.com]
> *Sent:* Wednesday, July 29, 2015 6:45 PM
> *To:* Mathieu Hofman
> *Cc:* robert@ocallahan.org; public-webrtc@w3.org;
> public-media-capture@w3.org
> *Subject:* Re: Add "MediaStream with worker" for video processing into
> the new working items of WebRTC WG
>
> Hi, Mathieu,
> I would like to support all use cases if I could. But the problem is I
> can't imagine how to design an async pull API, with elegant sample code,
> that guarantees processing every frame. It would be great if you could
> show me some concrete sample code. I have already tried to figure it out
> for a while.
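The ReadableStream approach mentioned earlier in the thread, for the easy case of an app that wants every frame, can be sketched with the standard WHATWG Streams API (global `ReadableStream` in modern browsers and Node 18+). The source here enqueues a fixed burst of placeholder frames purely to keep the sketch self-contained:

```javascript
// Sketch: a push source with no back-pressure enqueues every frame, and
// a consumer that wants every frame just reads them in order.
const frames = ['f1', 'f2', 'f3'];

const stream = new ReadableStream({
  start(controller) {
    // A real-time source would enqueue as frames arrive; here we enqueue
    // a fixed burst and close.
    for (const f of frames) controller.enqueue(f);
    controller.close();
  }
});

async function consumeAll(readable) {
  const received = [];
  const reader = readable.getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    received.push(value); // every frame is delivered; none are skipped
  }
  return received;
}

const result = consumeAll(stream); // a Promise for all enqueued frames
```

The frame-skipping case is exactly where this breaks down, as the message notes: a reader cannot tell whether the next `read()` will resolve from the internal queue or block waiting on the source.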
> Guaranteeing every frame is the most important reason why we chose a push
> mechanism. The key use case is video editing.
> I think we need to take care of at least the use cases below:
> 1. Real-time camera processing for WebRTC or camera recording => video
> processor case
> 2. Real-time video analysis. In this case, we only analyze frames and
> don't modify the stream => video monitor case
> 3. Video editing. We need to guarantee frame-by-frame processing =>
> video processor case
> 4. Screen sharing. I think that is what you want, but I am not sure
> exactly what it would look like.
>
> I am not sure how to provide a solution for all those cases by async
> pull. Would be happy to learn from you.
>
> BR,
> CTai
>
>
Received on Thursday, 30 July 2015 05:12:49 UTC