Re: Resolution to republish MSP as a note

On Wed, Aug 8, 2012 at 9:36 AM, Srikumar Karaikudi Subramanian <
srikumarks@gmail.com> wrote:

> 1) For one (a gargantuan one), it offers a solution for video/picture as
> well. This would make it less of a problem to drop main thread support for
> audio processing as well, as for example in the biggest blocker case
> (emulators) the graphics could be manipulated from the same thread as the
> audio. [1]
>
>
> Trying to solve both video and audio using the same graph is the holy
> grail of graph based designs, but this can get *really* crazy for an API
> user. One of the frameworks that did try to accommodate both media types
> and solve the associated stream synchronization, graph rewinding/seeking,
> pause/play, etc. was DirectShow. It was a *beast* to deal with as an API
> and I shudder to think of something like that for use on the web. (*)
>

That is a very good point. But we already have intertwined audio/video on
the web, both with media elements and WebRTC, so in my opinion we should
design the APIs that extend them accordingly. The biggest problem I have
with the Web Audio API is that rather than building on the existing
building blocks of the web platform it's a very detached design, and the
recent attempts to patch that have extended the API with representations
of the media elements and media streams that fit the API, rather than the
other way around.
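
To make that concrete, the bolted-on bridges look roughly like this from
the API user's side (just a sketch; the element and stream are assumed to
exist, and the context constructor is still vendor-prefixed in current
builds):

    // Existing platform objects get pulled *into* the Web Audio graph via
    // dedicated source node types, rather than the graph operating on
    // media elements and streams natively.
    const ctx = new AudioContext(); // webkitAudioContext in today's builds

    // <audio>/<video> element -> MediaElementAudioSourceNode
    const mediaEl = document.querySelector("audio") as HTMLMediaElement;
    ctx.createMediaElementSource(mediaEl).connect(ctx.destination);

    // A MediaStream obtained from WebRTC/getUserMedia elsewhere, assumed
    // to exist for this sketch -> MediaStreamAudioSourceNode
    declare const remoteStream: MediaStream;
    ctx.createMediaStreamSource(remoteStream).connect(ctx.destination);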

I don't think it will be that difficult for the API user either; the MSP
API as it currently stands isn't any more complicated to use than the Web
Audio API, at least from the audio perspective. The approach to video
seems reasonable to me as well, but obviously I haven't tried it.
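
For instance, a simple gain written against both APIs comes out looking
about the same. The Web Audio half below uses the real script processing
node; the MSP worker half is from memory of the current draft, so the
exact names (onprocessmedia, audioSamples, writeAudio) may be slightly
off:

    // Web Audio API: JS processing node (createScriptProcessor, a.k.a.
    // createJavaScriptNode in current implementations).
    const ctx = new AudioContext();
    const node = ctx.createScriptProcessor(4096, 1, 1);
    node.onaudioprocess = (e) => {
      const input = e.inputBuffer.getChannelData(0);
      const output = e.outputBuffer.getChannelData(0);
      for (let i = 0; i < input.length; i++) {
        output[i] = input[i] * 0.5; // gain
      }
    };

    // MSP API: the same loop in a worker attached to a ProcessedMediaStream,
    // e.g. new ProcessedMediaStream(new Worker("gain.js")) on the main
    // thread. Worker side (names as I remember the draft):
    (self as any).onprocessmedia = (event: any) => {
      const input: Float32Array = event.inputs[0].audioSamples;
      const output = new Float32Array(input.length);
      for (let i = 0; i < input.length; i++) {
        output[i] = input[i] * 0.5; // same gain
      }
      // Hand the result back via the event; the draft calls this
      // writeAudio, though I may be misremembering the exact signature.
      event.writeAudio(output);
    };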


> My gut feel at the moment is that a usable web audio api spec will never
> see the light of day if the group goes the route of solving both video and
> audio within the same architecture. My only hope is that I'm wrong. The way
> effect composition works when using WebGL is *very* different from the way
> effect composition works for audio. "Audio effect = audio node" doesn't
> have a "video effect = video node" parallel in an architecture where "node"
> means the same thing. They are both graphs (scene graph and signal flow
> graph) but they have very different relationships to time.  (**)
>

I share your concerns. But on the other hand I think we might even get
where we're going faster if we just let go of the need for native nodes and
the rest of the complexity that comes with them.
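
To illustrate: the kind of thing you would assemble out of a handful of
native nodes today, say a gain feeding a simple lowpass, collapses into a
single JS callback, which is exactly the sort of tight loop a JIT can
optimize as a whole. A rough sketch with made-up parameter values:

    // One JS processing callback doing the work of a small node sub-graph:
    // a gain followed by a one-pole lowpass, fused into a single loop.
    const gain = 0.5;   // made-up parameters, just for the sketch
    const coeff = 0.15; // one-pole lowpass coefficient
    let state = 0;      // filter memory carried across buffers

    function processBuffer(input: Float32Array, output: Float32Array): void {
      for (let i = 0; i < input.length; i++) {
        const amplified = input[i] * gain;           // the "gain node"
        state = state + coeff * (amplified - state); // the "lowpass node"
        output[i] = state;
      }
    }

Whether that loop is plugged in through a script processing node or an MSP
worker doesn't really change its shape, and the loop is the part the
engine has to make fast.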

> What I mean is that when you multiplex the nodes in a graph into a single
> node, you can optimize it directly as Marcus said, at which point the
> performance might be better, or worse to a degree that's insignificant and
> becomes less significant over time.
>
>
> This might happen sooner than we expect I think, since V8 already
> optimizes some numerics to use SSE2/3 if I'm not mistaken.
>
> At that point, what's the virtue of Web Audio API vs MSP API & DSP API?
>
>
> The web architecture may not permit the use of JS code in high priority
> system audio callbacks for some time. That means the latency we can get
> from native nodes is going to be better than JS for some time to come.
>

I'm probably badly misinformed, but the value of high-priority threads
seems a bit vague to me, since I'm not sure what the level of OS support
for them actually is; on Linux, for example, I believe you still have to
compile your own kernel to get real high-priority thread support. And
using high-priority threads might not always even be desirable: on
low-end devices it would be horrible if the UI became completely unusable
because an audio thread was monopolizing the CPU.

My question really is whether the added value is high enough to justify
the cost in design, implementation and maintainability of the API, when a
lot of that value is likely to become obsolete in the future.

Cheers,
Jussi
