Re: Web Audio API is now available in Chrome

Hi Chris,

On Wed, Feb 2, 2011 at 10:05 AM, Chris Rogers <crogers@google.com> wrote:
> Hi Silvia, thanks for your comments.  I have a few responses below:
>
> On Tue, Feb 1, 2011 at 1:43 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com>
> wrote:
>>
>> On Wed, Feb 2, 2011 at 6:54 AM, Chris Rogers <crogers@google.com> wrote:
>> > Hi Tom,
>> > They are different API proposals.  The "Web Audio API" which I just
>> > wrote
>> > about is described here:
>> >
>> > http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html
>> > Mozilla's proposal is called the "Audio Data API":
>> > https://wiki.mozilla.org/Audio_Data_API
>> > There has been a fair amount of discussion about the two approaches on
>> > this
>> > list.  Here is my comparison:
>> > Web Audio API
>> > * implementations in WebKit - Google Chrome (Mac OS X only, but Windows
>> > and
>> > Linux soon), Apple Safari
>> > * high-level API - easy to do simple things like play sound now
>> > * API is modular and scalable
>> > * allows for the lowest possible latency - time between, for example,
>> > key
>> > and mouse events and a sound being heard
>> > * most of the implementation is in optimized assembly / C / C++ for
>> > efficiency, so more can be done without bogging down the system
>> > * more resistant to audio glitches / dropouts
>> > * superset of Audio Data API functionality
>>
>> That's an unfair comparison: the Web Audio API is in no way, shape,
>> or form a superset of the Audio Data API's functionality. For one, it
>> doesn't integrate with the Audio() API of the existing <audio> element
>> of HTML5.
>
> When I say "superset" I mean in functionality, not in the actual API itself.
>  Put another way, any application written using the Audio Data API should
> be possible to write with the Web Audio API.

This is what I meant by "unfair": I'm 100% sure that everything that
is possible with the Web Audio API is also possible with the Audio
Data API, and vice versa. Performance may differ, but the
functionality is achievable in both. We should therefore not use
functionality as an argument for or against one or the other.


> The Web Audio API *does* interact with the <audio> tag.  Please see:
> http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#MediaElementAudioSourceNode-section
> And the diagram and example code here:
> http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#DynamicLifetime-section
> To be fair, I don't have the MediaElementSourceNode implemented yet, but I
> do believe it's an important part of the specification.

None of this hooks into the <audio> element and the existing Audio()
constructor of HTML5 (see
http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#audio
). Instead, it creates its own AudioNode() and AudioSourceNode()
objects. This is where I would like to see explicit integration with
HTML5 rather than a replication of functionality.
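
For contrast, the Audio Data API hangs everything off the existing
element. A minimal sketch using Mozilla's draft names (untested, and
the tone generation is mine for illustration):

  // Generate a 440 Hz tone directly through an <audio> element;
  // no separate context or node objects are required.
  var audio = new Audio();
  audio.mozSetup(1, 44100);                // 1 channel at 44.1 kHz
  var samples = new Float32Array(4096);
  for (var i = 0; i < samples.length; i++) {
    samples[i] = Math.sin(2 * Math.PI * 440 * i / 44100);
  }
  audio.mozWriteAudio(samples);

The element itself is the audio destination - no parallel object
hierarchy is needed.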


>>
>> It also makes it really difficult to access individual audio
>> samples and manipulate them.
>
> I don't believe this is true.  Please see the section about
> JavaScriptAudioNode:
> http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#JavaScriptAudioNode-section
> And try out the simple example (view source to see how simple the code is):
> http://chromium.googlecode.com/svn/trunk/samples/audio/javascript-processing.html
> In particular the process() method is similar to the code which would be
> used in the Audio Data API, and the approach taken is similar to that which
> has been used for Flash audio.

Complexity is a relative measure: what looks simple to you may not
look simple to me. So let's drop the complexity argument - it will
lead us nowhere.
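
For readers who want to judge for themselves, my understanding of the
JavaScriptAudioNode approach is roughly the following (an untested
sketch, with names taken from the draft and Chrome's prefixed
implementation):

  var context = new webkitAudioContext();
  // 4096-sample buffer, 1 input channel, 1 output channel
  var node = context.createJavaScriptNode(4096, 1, 1);
  node.onaudioprocess = function (event) {
    var output = event.outputBuffer.getChannelData(0);
    for (var i = 0; i < output.length; i++) {
      output[i] = Math.random() * 2 - 1;   // white noise
    }
  };
  node.connect(context.destination);

The per-sample loop is much the same as what one writes against
mozWriteAudio(), so simplicity really is in the eye of the beholder.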


>>
>> And finally, the Web Audio API only
>> implements a certain set of audio manipulation functions in C/C++ - if
>> a developer needs more flexibility, they have to use the JavaScript
>> way here, too.
>
> This is true, but I think the set of functions will be useful in a large set
> of applications.  They can use custom JavaScript processing in special
> cases.


There is no doubt about that. I agree that these functions are useful,
and it will be very important to have them in C/C++ and to be able to
build a filter graph with them. What I'm trying to achieve is fairness
in the discussion between the two APIs and the recognition that both
approaches are important.


>> > * more limited audio capabilities
>> I'd argue the other way around: since you have access to the audio
>> samples directly, the Audio Data API IMO has more flexible audio
>> capabilities. Anything can be done once you have access to the samples
>> directly.
>
> First of all the Web Audio API also offers easy and direct access to the
> samples similar to Flash and offering the same kind of low-level
> manipulation as the Mozilla API.  Second, I think it's somewhat of an
> overstatement to say that "anything" can be done because it glosses over
> some very practical and real limitations like latency, audio breakup, and
> scalability.  Developers are going to be faced with these issues in a very
> practical sense.

Yes, but it is possible to write badly performing code against either
API. Good performance can be achieved with both approaches.


>>
>> My description of the comparison is that the Audio Data API is a
>> low-level API that allows direct access to samples and lets you
>> manipulate them in JavaScript with your own features. It does require
>> either a good JavaScript library or a good audio coder to achieve
>> higher-level functionality such as frequency transforms or filters.
>> But it provides the sophisticated audio programmer with all the
>> flexibility - albeit with the drawback of having to do their own
>> optimisation of code to achieve low latency.
>
> I agree with most of this except the part about latency and JavaScript
> optimization.  There are other factors at play having to do with threading,
> garbage collection, etc. which make latency a nagging issue no matter how
> much the JavaScript code is optimized.

Possibly. But I don't think that is per se an argument against the
interface. Many examples built with the Audio Data API have shown that
latency need not be an issue. Just as Canvas and SVG each have
advantages and disadvantages in specific situations, so it is here. To
me it is clearly not a matter of either/or, but a matter of getting
both.


>
>>
>> In comparison, the Web Audio API is built like traditional audio
>> frameworks: as a set of audio filters that can be composed together
>> in a graph and then kicked off, letting the browser take advantage of
>> its optimised implementations of typical audio filters and achieve
>> the required low latency. By providing an inherently limited set of
>> audio filters, the audio programmer is restricted to combining these
>> filters in a way that achieves their goals. If a required filter is
>> not available, they can implement it in JavaScript and hook it into
>> the filter graph.
>
> I think that's pretty accurate, but in many (probably most)
> applications it will never be necessary to write custom DSP code in
> JavaScript, since the provided filters have been proven to be very
> useful through decades of use in real-world audio applications.
>

That would be an advantage, just as SVG's built-in functionality
satisfies most graphics use cases. But not all of them, which is the
point I am trying to make here, too.
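
To make the comparison concrete, a filter graph in that style looks
roughly like this (an untested sketch; createGainNode(), noteOn() and
the webkitAudioContext prefix are taken from the draft and Chrome's
implementation, and myDecodedBuffer stands in for a loaded
AudioBuffer):

  var context = new webkitAudioContext();
  var source = context.createBufferSource();
  source.buffer = myDecodedBuffer;         // previously decoded audio
  var gain = context.createGainNode();     // built-in, optimised node
  gain.gain.value = 0.5;                   // attenuate by half
  source.connect(gain);
  gain.connect(context.destination);
  source.noteOn(0);                        // the draft's "start now"

Anything the built-in nodes cannot express has to be spliced into the
same graph as a JavaScriptAudioNode - the escape hatch described
above.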

>>
>> In my opinion, the difference between the Web Audio API and the Audio
>> Data API is very similar to the difference between SVG and Canvas. The
>> Web Audio API is similar to SVG in that it provides "objects" that can
>> be composed together to create a presentation. The Audio Data API is
>> similar to Canvas in that it provides pixels to manipulate. Both have
>> their use cases and community. So, similarly, I would hope that we can
>> get both audio APIs into HTML5.
>
> I've tried to incorporate the features of the Audio Data API into the Web
> Audio API with the introduction of JavaScriptAudioNode
> and MediaElementAudioSourceNode.  So, in a sense I believe we already have
> the required features which you desire.

Working with the Web Audio API, I have found it clunky and not yet
well integrated with the existing HTML5 specification, whereas the
Audio Data API has simply extended the Audio() element with a few
extra fields and an event to make it all happen. I believe a better
way would be to take a similar approach, where we don't actually need
to create an AudioContext() because the Audio() element itself already
provides one. That would make the API a lot more elegant and would
remove some duplication.
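
Purely as a hypothetical sketch - none of these names exist in either
spec, they are invented to illustrate the shape of the integration I
have in mind:

  var audio = new Audio("track.ogg");
  // invented: the element owns its graph, no explicit AudioContext()
  var gain = audio.context.createGainNode();  // invented accessor
  audio.source.connect(gain);                 // invented source node
  gain.connect(audio.context.destination);

The element would then be both the media source and the hook into the
processing graph.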

Regards,
Silvia.

Received on Tuesday, 1 February 2011 23:39:11 UTC