Re: Media Source draft proposal from Robert O'Callahan on 2012-04-20 (public-html@w3.org from April 2012)

From: Robert O'Callahan <robert@ocallahan.org>
Date: Fri, 20 Apr 2012 17:06:44 +1200
To: Maciej Stachowiak <mjs@apple.com>
Cc: Aaron Colwell <acolwell@google.com>, public-html@w3.org, Adrian Bateman <adrianba@microsoft.com>, Mark Watson <watsonm@netflix.com>
Message-ID: <CAOp6jLaRYXOCFwSwGZzES5bsbWLab7xnsM1u=uPkkGNKkMQNNA@mail.gmail.com>

On Fri, Apr 20, 2012 at 3:19 PM, Maciej Stachowiak <mjs@apple.com> wrote:

> From what I can tell, it's not the case that WebRTC manipulates only
> decoded data. Two examples: PeerConnection lets you send a MediaStream to a
> remote peer and receive a MediaStream from a remote peer. Surely it is not
> the case that media data sent over PeerConnection is always decoded? It
> seems obvious that such data would have to be encoded at least in transit.
> Likewise, the getRecordedData method in WebRTC generates "a file that
> containing data in a *format supported by the user agent* for use
> in audio and video elements" (emphasis added), which is surely encoded
> data, not decoded data. In fact, I cannot find any case where the WebRTC
> spec offers any kind of access to decoded media data.
>

Fair. I've proposed and prototyped access to the decoded data, here:
https://dvcs.w3.org/hg/audio/raw-file/tip/streams/StreamProcessing.html
http://people.mozilla.org/~roc/stream-demos/

The WebRTC API is fairly high-level. Apart from the recorder API, which
necessarily has to return a blob of encoded data, it abstracts over
encoding, formats, containers and the like. The Web author doesn't have to
(and can't) access or manage buffers of encoded data. Everything that is
exposed, such as metadata about tracks, is post-parsing/decoding. That's
what I meant, but I expressed it badly.

> It might be that there is a good reason why receiving a media stream over a
> peer-to-peer connection and then playing it via a video element should use
> a completely different API than receiving a stream from a server and then
> playing it via a video element.
>

I think one reason is that nontrivial manipulation of encoded data buffers
(beyond "decode/encode an entire resource from/to a binary blob") is a pain
and you don't want to do it unless you have to --- especially in JS running
on the HTML event loop --- so having as many APIs as possible (HTML media
elements, WebRTC, Web Audio and MediaStreams Processing) provide
abstractions to avoid that is a very good thing. But when you need to
manipulate encoded data buffers, there's Media Source --- for the input
side at least.

I think these specs compose together fairly well. With Media Source and my
proposed extensions to extract a MediaStream from an HTML media element,
you could dynamically compose a media resource from encoded data buffers
and play it into a MediaStream.
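
As a rough sketch of that composition (not taken from any of the drafts):
the interfaces below are hypothetical stand-ins, declared locally so the
example is self-contained. It assumes the Media Source object exposes
something like addSourceBuffer(), appendBuffer(), an "updateend"
notification and endOfStream(), and that the proposed extension appears as
a captureStream() method on the media element. It also assumes the source
is already attached to the element and open.

```typescript
// Hypothetical minimal shapes, loosely modelled on Media Source and the
// proposed stream-capture extension. These are assumptions, not spec text.
interface BufferSink {
  appendBuffer(data: Uint8Array): void;
  addEventListener(type: "updateend", listener: () => void): void;
  removeEventListener(type: "updateend", listener: () => void): void;
}

interface SourceLike {
  addSourceBuffer(mimeType: string): BufferSink;
  endOfStream(): void;
}

interface CapturableElement<S> {
  captureStream(): S;
}

// Append encoded segments one at a time, waiting for each append to
// complete ("updateend") before handing over the next one.
function appendAll(sink: BufferSink, segments: Uint8Array[]): Promise<void> {
  return new Promise<void>((resolve) => {
    let next = 0;
    const step = (): void => {
      const segment = segments[next];
      if (segment === undefined) {
        // Nothing left to append: stop listening and finish.
        sink.removeEventListener("updateend", step);
        resolve();
        return;
      }
      next += 1;
      sink.appendBuffer(segment);
    };
    sink.addEventListener("updateend", step);
    step();
  });
}

// Feed encoded buffers in on the input side and return the stream captured
// from the element; the caller never touches encoded data on the output side.
async function composeIntoStream<S>(
  source: SourceLike,
  element: CapturableElement<S>,
  mimeType: string,
  segments: Uint8Array[],
): Promise<S> {
  const stream = element.captureStream();
  const sink = source.addSourceBuffer(mimeType);
  await appendAll(sink, segments);
  source.endOfStream();
  return stream;
}
```

The only place the author handles encoded bytes is appendAll(); everything
downstream of the element works on the captured stream.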

Having said that, closer coordination of these specs is probably still a
good idea just to ensure the parts can be composed to handle all the
use-cases.

Rob
-- 
“You have heard that it was said, ‘Love your neighbor and hate your enemy.’
But I tell you, love your enemies and pray for those who persecute you,
that you may be children of your Father in heaven. ... If you love those
who love you, what reward will you get? Are not even the tax collectors
doing that? And if you greet only your own people, what are you doing more
than others?” [Matthew 5:43-47]

Received on Friday, 20 April 2012 05:07:15 UTC