- From: Rob Manson <roBman@mob-labs.com>
- Date: Thu, 05 Sep 2013 12:50:38 +1000
- To: "public-media-capture@w3.org" <public-media-capture@w3.org>
Hi all,

I have a question about the relationship between the MediaStream Image Capture API[1] and the MediaStream Recording API[2][3]...so I guess this is to Travis, Jim and Giri, but all feedback is welcome 8)

More specifically, I'm interested in the two different types of Image Stream Processing pipelines these two APIs create. This is related to "3.3 Find the ball assignment" in the MediaStream Capture Scenarios doc[4] (and more scenarios related to this are coming from the Augmented Web CG soon too).

From tracking the evolution of the Image Capture and Recording APIs, I think the two Image Stream Processing pipelines can fairly be paraphrased like this:

A - The MediaStream Image Capture API provides a single-shot getFrame() method that is designed to be used within a setTimeout() or requestAnimationFrame() style event loop, and it returns an ImageData object. NOTE: See the feedback from the WG members on the list about this event loop decision[5]. (A rough sketch of this pipeline follows the references below.)

B - The MediaStream Recording API provides a way to set up an event-based callback handler that is called whenever the "ondataavailable" event fires, and it returns a Blob object. The size of the Blob can be controlled so that only one or a few frames are delivered per timeslice. NOTE: Also notice the secondary Blob->Typed Array pipeline that's needed in this case, as described in point 1 here[6]: "Blob is returned to the ondataavailable callback. ArrayBuffer is created from Blob using FileReader. Typed array A is created from ArrayBuffer." (A sketch of this pipeline, including the secondary step, also follows the references below.)

So my question is: for web apps that want to do compute-intensive Image Stream Processing (e.g. feature detection, object recognition, gesture recognition, general computer vision, etc.), which pipeline is the recommended approach - A or B?

If it's A, then a few other questions come up. e.g. Is this "event loop" model really the most efficient approach? And how do we deal with timestamps across audio and image streams within video, etc. so we can deliver real synchronisation[7]? Plus there's a range of related questions I've already raised - see the NOTE in the Technical Issues section here[8].

If it's B, then the question is: are getFrame() and its related plumbing really required within the Image Capture API? Also, since I don't believe Blobs are Transferable[9], choosing B has performance implications for apps that want to shift work off into a Web Worker too (see the comment above about B's secondary pipeline).

I'm currently working on a js lib to make setup and manipulation of these types of pipelines as simple and standardised/optimised as possible. But I'd really like to make sure I'm implementing the best option. And I'd also like to be sure that we've really discussed the longer-term impacts of choosing one method over another in terms of performance, synchronisation, etc.

Thoughts?

roBman

[1] http://www.w3.org/TR/image-capture/
[2] http://www.w3.org/TR/recording/ (soon)
[3] https://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/MediaRecorder.html
[4] https://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html#find-the-ball-assignment-media-processing-and-recording
[5] http://lists.w3.org/Archives/Public/public-media-capture/2013May/0144.html
[6] http://lists.w3.org/Archives/Public/public-media-capture/2012Nov/0102.html
[7] https://groups.google.com/d/msg/discuss-webrtc/VhuPHRCrFAM/GfrocO6tDtsJ
[8] http://lists.w3.org/Archives/Public/public-media-capture/2013Jul/0101.html
[9] https://www.w3.org/Bugs/Public/show_bug.cgi?id=18611
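
P.S. For reference, here's roughly how I picture pipeline A. This is only a sketch based on my reading of the current Image Capture draft - the constructor shape, the onframe handler and the imageData attribute on the frame event are my assumptions and may not match the editors' draft exactly:

// Pipeline A (sketch): single-shot frame grabs inside a rAF "event loop"
navigator.getUserMedia({ video: true }, function (stream) {
  var track = stream.getVideoTracks()[0];
  var capture = new ImageCapture(track);      // assumed constructor shape

  capture.onframe = function (evt) {
    // assumes the frame grab event carries an ImageData object
    processImageData(evt.imageData);
  };

  function tick() {
    capture.getFrame();                       // result arrives via onframe
    requestAnimationFrame(tick);
  }
  requestAnimationFrame(tick);
}, function (err) { console.error(err); });

function processImageData(imageData) {
  // imageData.data is a Uint8ClampedArray of RGBA pixels - run the
  // feature detection / computer vision step here (or hand it to a Worker)
}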
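
And here's a sketch of pipeline B including the secondary Blob->Typed Array step from [6]. Again, this is just an illustration (the timeslice value and helper names are made up), not a recommendation:

// Pipeline B (sketch): event-based Blob delivery via the Recording API
// (assumes `stream` came from getUserMedia as in the sketch above)
var recorder = new MediaRecorder(stream);

recorder.ondataavailable = function (evt) {
  var blob = evt.data;                        // one Blob per timeslice

  // Secondary pipeline: Blob -> ArrayBuffer (via FileReader) -> typed array
  var reader = new FileReader();
  reader.onload = function () {
    var bytes = new Uint8Array(reader.result);
    processBytes(bytes);                      // compute-intensive processing here
  };
  reader.readAsArrayBuffer(blob);
};

recorder.start(100);                          // timeslice in ms keeps each Blob small

function processBytes(bytes) {
  // feature detection / object recognition / etc. over the delivered bytes
}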
Received on Thursday, 5 September 2013 02:51:09 UTC