- From: Timothy B. Terriberry <tterriberry@mozilla.com>
- Date: Wed, 11 Jul 2012 10:29:45 -0700
- To: "public-media-capture@w3.org" <public-media-capture@w3.org>
Young, Milan wrote:
> I believe this newly proposed requirement **is** tied to existing
> material in the spec. Section 5.10 reads:
>
> "Local media stream captures are common in a variety of sharing
> scenarios such as:
>
> capture a video and upload to a video sharing site
>
> capture a picture for my user profile picture in a given web app
>
> capture audio for a translation site
>
> capture a video chat/conference"
>
> I'd argue that perhaps the first two and definitely the third scenario
> require the application layer to have access to the media.

1) What you really want is not ex post facto access to the encoded form of data from a camera, but a general method of encoding a stream (a hypothetical shape is sketched below). As soon as you want to do any processing on the client side (even something as simple as cropping or scaling), you're going to want to re-encode before uploading. At that point, I have no idea what this requirement has to do with capture; it applies equally to a MediaStream from any source.

In practice in WebRTC, the encoding actually happens right before the data goes to the network, and the process is intimately tied to the real-time nature of RTP and the constraints of the network. An "encoded representation of the media" doesn't exist before that point. You could satisfy this use case in some (non-ideal) form today by doing what Randell suggests (using WebRTC and capturing the RTP stream, a la SIPREC). That at least wouldn't require any additional spec work.

2) For the image capture case, you almost certainly don't want an encoded video stream; you want to encode a single image. There's already a way to do this via the Canvas APIs (see the Canvas sketch below).

3) For translation (which implies speech recognition): a) if you're doing this on the client side, you want access to the _uncompressed_ media, not the compressed form, since every re-compression step only makes your job harder (see the Web Audio sketch below); and b) if you're doing this on the server side, then latency becomes very important, and the RTP recording suggested in 1) is actually what you want, not some offline storage format.

4) Again, if you want to record this on the server, you want access to the RTP (preferably at the conference mixer, assuming there is one). No need for a browser API for that case. If you want to record it on the client, you want the general encoding API outlined in 1), but again this has nothing to do with media capture (as in camera/microphone access).

From the scenarios outlined above, I'm still looking for where the MediaSource API (which "extends HTMLMediaElement to allow JavaScript to generate media streams for playback") becomes at all relevant. Please clue me in.
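
For concreteness, one hypothetical shape for the "general method of encoding a stream" from 1) might look like the sketch below. The `StreamEncoder` name, its options, the `ondataavailable` event, and the `upload` helper are all invented here for illustration; nothing by these names existed in any spec:

```js
// Hypothetical API, invented for illustration only.
// 'stream' is a MediaStream from any source, not just a camera.
var encoder = new StreamEncoder(stream, { mimeType: 'video/webm' });
encoder.ondataavailable = function (event) {
  // event.data would be a Blob of encoded media, ready to upload.
  upload(event.data); // 'upload' is a placeholder for app logic
};
encoder.start();
// ... later ...
encoder.stop();
```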
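
A minimal sketch of the Canvas route mentioned in 2), assuming a `<video>` element that is already rendering a local capture stream; the element id and the crop/scale values are made up for the example:

```js
// Assumes 'preview' is a <video> element already playing a local
// capture stream obtained via getUserMedia().
var video = document.getElementById('preview');
var canvas = document.createElement('canvas');
canvas.width = 320;
canvas.height = 240;
// Crop and scale the current frame in one step:
// drawImage(src, sx, sy, sWidth, sHeight, dx, dy, dWidth, dHeight).
canvas.getContext('2d').drawImage(video, 80, 0, 480, 360, 0, 0, 320, 240);
// Encode the single frame as an image; no video codec is involved.
var profilePicture = canvas.toDataURL('image/png');
```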
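
And a sketch of getting at uncompressed samples for the client-side case in 3a), assuming the Web Audio API's MediaStream integration (`createMediaStreamSource`), which was itself still in flux at the time:

```js
// 'stream' is an audio MediaStream from getUserMedia().
var AudioCtx = window.AudioContext || window.webkitAudioContext;
var context = new AudioCtx();
var source = context.createMediaStreamSource(stream);
// A script processor hands raw PCM to JavaScript in fixed-size chunks.
// (Older Web Audio drafts called this createJavaScriptNode.)
var processor = context.createScriptProcessor(4096, 1, 1);
processor.onaudioprocess = function (e) {
  var samples = e.inputBuffer.getChannelData(0); // Float32Array of PCM
  // Feed 'samples' to a recognizer; no decode step is needed.
};
source.connect(processor);
processor.connect(context.destination); // keeps the node graph running
```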
Received on Wednesday, 11 July 2012 17:30:18 UTC