- From: Rob Manson <roBman@mob-labs.com>
- Date: Thu, 05 Sep 2013 12:50:38 +1000
- To: "public-media-capture@w3.org" <public-media-capture@w3.org>
Hi all,
I have a question about the relationship between the MediaStream Image
Capture API[1] and the MediaStream Recording API[2][3]...so I guess this
is to Travis, Jim and Giri but all feedback welcome 8)
More specifically, I'm interested in the two different Image Stream
Processing pipelines these two APIs create - this is related to
"3.3 Find the ball assignment" in the MediaStream Capture Scenarios
doc[4]...(but more scenarios related to this are coming from the
Augmented Web CG soon too).
From tracking the evolution of the Image Capture and Recording APIs, I
think it's fair to summarise the two approaches to Image Stream
Processing pipelines as follows:
A - The MediaStream Image Capture API provides a single-shot getFrame()
method that is designed to be called from within a setTimeout() or
requestAnimationFrame() style event loop and returns an ImageData
object (rough sketch below).
NOTE: See the feedback from the WG members on the list about this event
loop decision[5].
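To make option A concrete, here's a rough sketch of the kind of loop I
mean. It just follows the behaviour described above (getFrame() handing
back an ImageData each tick) - the exact delivery mechanism in the
current draft may differ, and processFrame() is just a placeholder for
whatever vision code gets plugged in:

  // Pipeline A sketch - poll getFrame() from an animation-frame loop.
  navigator.getUserMedia({ video: true }, function (stream) {
    var track = stream.getVideoTracks()[0];
    var capture = new ImageCapture(track);

    function tick() {
      var imageData = capture.getFrame(); // assumed: yields an ImageData
      if (imageData) {
        processFrame(imageData);          // imageData.data is RGBA pixels
      }
      requestAnimationFrame(tick);        // or setTimeout() for a fixed rate
    }
    requestAnimationFrame(tick);
  }, function (err) { console.error(err); });

  function processFrame(imageData) {
    // placeholder: feature detection / object recognition / etc.
  }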
B - The MediaStream Recording API provides a way to set up an
event-based callback handler that is called whenever the
"ondataavailable" event fires and receives a Blob object. The size of
each Blob can be controlled via the timeslice, so only one or a few
frames are delivered at a time (rough sketch below).
NOTE: Also notice the secondary Blob->Typed Array pipeline that's needed
in this case as described in point 1 here[6].
"Blob is returned to the ondataavailable callback. ArrayBuffer is
created from Blob using FileReader. Typed array A is created from
ArrayBuffer."
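And here's a rough sketch of the B pipeline, just chaining together the
steps quoted above (Blob -> FileReader -> ArrayBuffer -> typed array);
processChunk() is a placeholder:

  // Pipeline B sketch - event-driven delivery via MediaRecorder.
  var recorder = new MediaRecorder(stream);      // stream from getUserMedia()

  recorder.ondataavailable = function (event) {
    var reader = new FileReader();
    reader.onload = function () {
      var bytes = new Uint8Array(reader.result); // "typed array A" from [6]
      processChunk(bytes);
    };
    reader.readAsArrayBuffer(event.data);        // event.data is a Blob
  };

  recorder.start(100);  // timeslice in ms, so ondataavailable fires per slice

  function processChunk(bytes) {
    // placeholder: whatever decoding/processing the app needs
  }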
So my question is: for web apps that want to do compute-intensive Image
Stream Processing (e.g. feature detection, object recognition, gesture
recognition, general computer vision, etc.), what is the recommended
pipeline approach - A or B?
If it's A then there are a few other questions that come up. For
example, is this "event loop" model really the most efficient approach?
And how do we deal with timestamps across audio and image streams
within video, etc. so we can deliver real synchronisation[7]? Plus
there's a range of related questions I've already raised - see the
NOTE: in the Technical Issues section here[8].
If it's B then the question is "Are getFrame() and its related plumbing
really required within the Image Capture API?" Also, since I don't
believe Blobs are Transferable[9], choosing B has performance
implications for apps that want to shift work off into a Web Worker too
(see the comment above about B's secondary pipeline, and the worker
sketch below).
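For illustration, the worker hand-off I'm worried about looks roughly
like this - the ArrayBuffer can go across as a Transferable, but the
Blob first has to be read out via FileReader (the 'vision-worker.js'
name is just made up here):

  // Worker hand-off sketch - a Blob must become an ArrayBuffer before it
  // can be transferred (rather than copied) to a Worker.
  var worker = new Worker('vision-worker.js');   // hypothetical worker script

  function shipToWorker(blob) {
    var reader = new FileReader();
    reader.onload = function () {
      var buffer = reader.result;                        // ArrayBuffer
      worker.postMessage({ frame: buffer }, [buffer]);   // transferred, zero-copy
    };
    reader.readAsArrayBuffer(blob);
  }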
I'm currently working on a js lib to make setup and manipulation of
these types of pipelines as simple and standardised/optimised as
possible. But I'd really like to make sure I'm implementing the best
option.
And I'd also like to be sure that we've really discussed the longer
term impacts of choosing one method over another in terms of
performance, synchronisation, etc.
Thoughts?
roBman
[1] http://www.w3.org/TR/image-capture/
[2] http://www.w3.org/TR/recording/ (soon)
[3] https://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/MediaRecorder.html
[4] https://dvcs.w3.org/hg/dap/raw-file/tip/media-stream-capture/scenarios.html#find-the-ball-assignment-media-processing-and-recording
[5] http://lists.w3.org/Archives/Public/public-media-capture/2013May/0144.html
[6] http://lists.w3.org/Archives/Public/public-media-capture/2012Nov/0102.html
[7] https://groups.google.com/d/msg/discuss-webrtc/VhuPHRCrFAM/GfrocO6tDtsJ
[8] http://lists.w3.org/Archives/Public/public-media-capture/2013Jul/0101.html
[9] https://www.w3.org/Bugs/Public/show_bug.cgi?id=18611