Re: Draft WebAudio API review text

I read both the Web Audio API specification and Alex's review, and I
have some comments on both.

1. On review

First of all, I strongly support Alex on two points:

(a) The relationship between the Web Audio API and the <audio> tag. In 
its current state, both Web Audio and <audio> are effectively just 
bindings to some internal low-level audio API. In my opinion that is a 
bad idea; the Web Audio API should be designed so that <audio> can be 
built on top of it as a pure high-level JavaScript component.
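
To show what I mean, here is a minimal sketch of an <audio>-like player 
built only on top of existing Web Audio primitives (the SimpleAudioPlayer 
name is of course mine, everything else is the API as currently drafted):

  // A minimal <audio>-like player on top of Web Audio primitives only.
  function SimpleAudioPlayer(context, url) {
    this.context = context;
    this.url = url;
    this.buffer = null;
    this.source = null;
  }

  SimpleAudioPlayer.prototype.load = function (onReady) {
    var self = this;
    var request = new XMLHttpRequest();
    request.open('GET', this.url, true);
    request.responseType = 'arraybuffer';
    request.onload = function () {
      self.context.decodeAudioData(request.response, function (buffer) {
        self.buffer = buffer;
        if (onReady) onReady();
      });
    };
    request.send();
  };

  SimpleAudioPlayer.prototype.play = function () {
    // AudioBufferSourceNode is one-shot, so a fresh node per play().
    this.source = this.context.createBufferSource();
    this.source.buffer = this.buffer;
    this.source.connect(this.context.destination);
    this.source.start(0);
  };

  SimpleAudioPlayer.prototype.stop = function () {
    if (this.source) this.source.stop(0);
  };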

(b) Audio nodes should be constructible. Let me make an even more 
extreme point here: the create* methods of the AudioContext interface 
are simply redundant and should be removed, since there is no use case 
for subclassing AudioContext and overriding its factory methods. The 
create* methods clutter the AudioContext interface: only one of its 18 
methods is meaningful, while the others are just helpers. That makes it 
very hard to understand the real responsibility of the AudioContext 
object.
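
To illustrate, compare the current factory style with a hypothetical 
constructor style (the GainNode and DelayNode constructors below are my 
assumption, they are not in the current draft):

  // Current draft: every node type adds yet another create* helper
  // to the AudioContext interface.
  var gain = context.createGain();
  var delay = context.createDelay();

  // Hypothetical constructible nodes: plain objects that take their
  // context as an argument, leaving AudioContext itself small.
  var gain2 = new GainNode(context);
  var delay2 = new DelayNode(context);
  gain2.connect(context.destination);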

As far as I understand, the essential meaning of the AudioContext is 
the notion of time for its nodes (real time for the base AudioContext, 
virtual time for OfflineAudioContext). But I could not extract from the 
specification how exactly audio nodes interact with their context. 
There is no 'tick' event and no global timer, and it seems that all 
synchronisation and latency problems are somehow hidden inside the 
implementation of AudioContext and its nodes. I am strongly convinced 
that this is the wrong approach. In my opinion a low-level API should 
clearly reflect the internal principles of its organisation.
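
To make the point concrete: today all scheduling is expressed against 
context.currentTime, and the processing clock itself is unobservable 
(the ontick hook in the comment below is purely hypothetical, it does 
not exist in the draft):

  // All a script can do is schedule against context.currentTime;
  // how and when the graph is actually pulled for audio stays hidden.
  var osc = context.createOscillator();
  osc.connect(context.destination);
  osc.start(context.currentTime + 0.5);  // begin in half a second
  osc.stop(context.currentTime + 1.5);   // end one second later

  // There is no way to observe the processing clock itself, e.g.
  // nothing like this hypothetical hook exists:
  // context.ontick = function (renderTime) { ... };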

2. On specification

Regretfully, I am no specialist in 3D gaming or GarageBand-like 
applications, but I see some obvious problems with using the Web Audio 
API:

(a) There is no simple way to convert an AudioContext into an 
OfflineAudioContext, so there is no simple way to upload a prepared 
composition. If you need to edit a composition in the browser and then 
save (upload) it, you have to create an OfflineAudioContext, clone 
every single audio node from the real-time context (which is, again, 
complicated, as nodes have no "clone" method) and only then call 
startRendering(). My suggestion is (a) to move the startRendering() 
method to the base AudioContext, and (b) to remove OfflineAudioContext 
entirely (which is a bad name in any case, since it is really a 
NotRealTimeAudioContext, not an Offline one).
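
A sketch of what saving a composition requires today, versus what it 
could look like (lengthInSeconds, rebuildGraphIn and upload are 
hypothetical helpers the developer has to write; the last line is my 
suggestion, not the current draft):

  // Today: rebuild the whole node graph by hand in a separate
  // offline context, because nodes cannot be cloned or reattached.
  var offline = new OfflineAudioContext(2, 44100 * lengthInSeconds, 44100);
  rebuildGraphIn(offline);                 // manual, error-prone duplication
  offline.oncomplete = function (event) {
    upload(event.renderedBuffer);          // send the rendered PCM somewhere
  };
  offline.startRendering();

  // Suggested: render straight from the context being edited.
  // context.startRendering();             // not in the current draft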

(b) There is a very odd statement in the description of the AudioBuffer 
interface:

" This interface represents a memory-resident audio asset (for one-shot 
sounds and other short audio clips). Its format is non-interleaved IEEE 
32-bit linear PCM with a nominal range of -1 -> +1. It can contain one 
or more channels. Typically, it would be expected that the length of the 
PCM data would be fairly short (usually somewhat less than a minute). 
For longer sounds, such as music soundtracks, streaming should be used 
with the |audio| element and |MediaElementAudioSourceNode|."

First, what does "it would be expected that the length of the PCM data 
would be fairly short" actually mean? What will happen if the data is 
larger? There is no OutOfMemory exception and no explanation for the 
limit. How do we expect developers to deal with such an unpredictable 
constraint? Since we are dealing with raw binary data without any 
overhead, why have any limit at all?

Second, the one-minute limit is clearly insufficient for the declared 
purpose of building 3D games and audio editors in the browser.
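
A back-of-envelope calculation, assuming the stereo 44.1 kHz 32-bit 
float format the specification itself describes:

  // Memory cost of an AudioBuffer: frames * channels * 4 bytes (float32).
  var oneMinute = 44100 * 60 * 2 * 4;   // 21 168 000 bytes, about 20 MB
  var tenMinutes = oneMinute * 10;      // about 200 MB of raw PCM
  // Whether that fits in memory is exactly the kind of question the
  // spec gives developers no way to ask or handle.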

Third, falling back to the <audio> element is not really an option, as 
we agreed that the audio element should be implemented in terms of the 
Web Audio API, not vice versa.

So, in my opinion, the Web Audio API in its current state does not 
provide an appropriate interface for games and audio editors.

Received on Monday, 22 July 2013 09:30:53 UTC