W3C home > Mailing lists > Public > public-audio@w3.org > July to September 2012

Re: Resolution to republish MSP as a note

From: Srikumar Karaikudi Subramanian <srikumarks@gmail.com>
Date: Wed, 8 Aug 2012 14:36:47 +0800
Cc: Chris Wilson <cwilso@google.com>, public-audio@w3.org
Message-Id: <0B5A4487-8FB8-4C9B-A305-AA776BF15F11@gmail.com>
To: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
> 1) For one (a gargantuan one), it offers a solution for video/picture as well. This would make it less of a problem to drop main thread support for audio processing as well, as for example in the biggest blocker case (emulators) the graphics could be manipulated from the same thread as the audio. [1]

Trying to solve both video and audio using the same graph is the holy grail of graph-based designs, but this can get *really* crazy for an API user. One of the frameworks that did try to accommodate both media types and solve the associated stream synchronization, graph rewinding/seeking, pause/play, etc., was DirectShow. It was a *beast* to deal with as an API, and I shudder to think of something like that for use on the web. (*)

My gut feel at the moment is that a usable Web Audio API spec will never see the light of day if the group goes the route of solving both video and audio within the same architecture. My only hope is that I'm wrong. The way effect composition works when using WebGL is *very* different from the way effect composition works for audio. "Audio effect = audio node" has no "video effect = video node" parallel in an architecture where "node" means the same thing. They are both graphs (a scene graph and a signal-flow graph), but they have very different relationships to time. (**)
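To make that time-relationship contrast concrete, here is a toy sketch in plain JS (nothing from any proposed API; the node shapes are made up for illustration). An audio node is *pulled* for a fixed-size block of samples at a steady rate; a scene node is *pushed* a redraw at whatever instant a frame happens to be composed:

```javascript
// Toy sketch, not any real API: a signal-flow node must produce exactly
// block.length samples per call, on the audio clock.
function gainNode(gain) {
  return function process(block) {
    const out = new Float32Array(block.length);
    for (let i = 0; i < block.length; i++) out[i] = block[i] * gain;
    return out;
  };
}

// A scene-graph node is instead evaluated at an arbitrary wall-clock
// time t, yielding one frame's worth of state, not a stream of samples.
function spinningQuad(degPerSec) {
  return function render(t) {
    return { rotationDeg: (t * degPerSec) % 360 };
  };
}

const g = gainNode(0.5);
const out = g(Float32Array.from([1, -1, 0.5, 0]));
// out -> Float32Array [0.5, -0.5, 0.25, 0]

const quad = spinningQuad(90);
// quad(2) -> { rotationDeg: 180 }
```

The "process a block" signature and the "render at time t" signature don't unify cleanly under one notion of "node", which is the crux of the problem above.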

> What I mean is that when you multiplex the nodes in a graph into a single node, you can optimize it directly as Marcus said, at which point the performance might be better, or worse to a degree that's insignificant and becomes less significant over time. 

I think this might happen sooner than we expect, since V8 already optimizes some numeric code to use SSE2/3, if I'm not mistaken.
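The multiplexing idea Jussi describes can be illustrated with a toy example (made up for this message, not MSP or DSP API code): two per-node passes collapsed into a single fused loop, which removes the intermediate buffer and lets a JIT keep the sample in a register across both operations:

```javascript
// Two "nodes" run separately, as a graph engine would: a gain stage,
// then a DC-offset stage, with an intermediate buffer between them.
function runAsGraph(input, gain, offset) {
  const tmp = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) tmp[i] = input[i] * gain;
  const out = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) out[i] = tmp[i] + offset;
  return out;
}

// The same two nodes "multiplexed" into one: a single pass, one output
// buffer, no intermediate allocation.
function runFused(input, gain, offset) {
  const out = new Float32Array(input.length);
  for (let i = 0; i < input.length; i++) out[i] = input[i] * gain + offset;
  return out;
}

const x = Float32Array.from([0, 0.5, -0.5, 1]);
// Both produce identical samples; the fused version does half the loop
// iterations over memory and allocates one buffer instead of two.
```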

> At that point, what's the virtue of Web Audio API vs MSP API & DSP API?

The web architecture may not permit JS code to run in high-priority system audio callbacks for some time. That means the latency we can get from native nodes is going to be better than what JS processing can offer for a while yet.
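Some back-of-the-envelope arithmetic makes the gap concrete. Assuming the draft spec's 128-frame native render quantum and the 256-frame minimum buffer for JS processing nodes (the 2048-frame figure below is my own assumption about what is often needed in practice to avoid glitches):

```javascript
// Added latency per buffer of N frames at sample rate fs is N / fs seconds.
const ms = (frames, sampleRate) => (frames / sampleRate) * 1000;

const native = ms(128, 44100);     // ~2.9 ms per native render quantum
const jsMin = ms(256, 44100);      // ~5.8 ms, before any main-thread delay
const jsTypical = ms(2048, 44100); // ~46 ms with a glitch-safe JS buffer
```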

Best,
-Kumar

(*) DirectShow did have a few things that we don't have with the Web Audio API yet - graph traversal, node inspection, and creation of custom nodes that looked and quacked like the bundled nodes.
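For what it's worth, a hypothetical sketch of what graph traversal and node inspection could look like (this API does not exist in the Web Audio draft; all names here are invented): nodes that record their outgoing connections, so a tool can walk the graph the way DirectShow's filter-graph enumeration allowed:

```javascript
// Hypothetical, invented API: each node remembers what it connects to.
function makeNode(name) {
  return {
    name,
    outputs: [],
    connect(dest) { this.outputs.push(dest); return dest; },
  };
}

// Depth-first walk over the recorded connections; the seen-set guards
// against cycles (feedback loops).
function traverse(node, visit, seen = new Set()) {
  if (seen.has(node)) return;
  seen.add(node);
  visit(node);
  for (const next of node.outputs) traverse(next, visit, seen);
}

const src = makeNode('source');
const fx = makeNode('reverb');
const dst = makeNode('destination');
src.connect(fx).connect(dst);

const names = [];
traverse(src, n => names.push(n.name));
// names -> ['source', 'reverb', 'destination']
```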

(**) My claim to comment on this aspect comes from my work on the rendering architecture for "muvee Reveal" - an automatic video editing product.

On 8 Aug, 2012, at 1:56 AM, Jussi Kalliokoski <jussi.kalliokoski@gmail.com> wrote:

> On Mon, Aug 6, 2012 at 9:59 PM, Chris Wilson <cwilso@google.com> wrote:
> On Mon, Aug 6, 2012 at 11:31 AM, Jussi Kalliokoski <jussi.kalliokoski@gmail.com> wrote:
> I believe that if we strip all the native nodes from the Web Audio API, the MSP API offers a solution to far more problems than the Web Audio API. Not to mention that combined with Marcus' DSP API, even the performance benefits can at least be equaled in many cases.
> 
> Obviously, I'm not in favor of this (no surprise, I'm sure).  I would like to ask, 1) how does the MSP API offer a solution to far more problems than the WA API?  Other than the presumably more scalable latency, I don't see how it's really any different; I think the concerns Peter raised were about latency in using multiple JS nodes in WA, while still trying to use the routing system in significant and interesting ways.  2) Can you elaborate on how the performance can be "at least" equaled?  I don't see how you expect to get significantly better performance than natively-implemented code doing DSP operations on a separate high-priority thread.
>  
> 1) For one (a gargantuan one), it offers a solution for video/picture as well. This would make it less of a problem to drop main thread support for audio processing as well, as for example in the biggest blocker case (emulators) the graphics could be manipulated from the same thread as the audio. [1]
> 2) Marcus expressed my meaning behind this thought. And no, I didn't mean that the MSP API would magically have better performing worker processors (as Chris R pointed out). What I mean is that when you multiplex the nodes in a graph into a single node, you can optimize it directly as Marcus said, at which point the performance might be better, or worse to a degree that's insignificant and becomes less significant over time. At that point, what's the virtue of Web Audio API vs MSP API & DSP API?
> 
> Cheers,
> Jussi
> 
> [1] https://www.w3.org/Bugs/Public/show_bug.cgi?id=17415
Received on Wednesday, 8 August 2012 06:37:17 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Wednesday, 8 August 2012 06:37:20 GMT