Re: Draft WebAudio API review text

25.07.2013, 04:17, "Alex Russell" <slightlyoff@google.com>:
> Hi Sergey,
> On Mon, Jul 22, 2013 at 2:30 AM, Sergey Konstantinov <twirl@yandex-team.ru> wrote:
>> I read both Web Audio API Specification and Alex's review, and I have some comments on both.
>>
>> 1. On review
>>
>> First of all, I strongly support Alex in two cases:
>>
>> (a) The relationship between the Web Audio API and the <audio> tag. In their current state, both Web Audio and <audio> would be just bindings to some internal low-level audio API. In my opinion that's a bad idea; the Web Audio API should be designed so that <audio> can be built on top of it as a pure high-level JavaScript component.
>>
>> (b) Audio nodes should be constructible. Let me take an even more extreme position: the create* methods of the AudioContext interface are simply redundant and should be removed, since there is no use case for inheriting from AudioContext and redefining its factory methods. The create* methods clutter the AudioContext interface: only one of its 18 methods is meaningful, while the others are just helpers. That makes it very hard to understand the real responsibility of the AudioContext object.
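>>
>> To illustrate the difference (the constructor form below is hypothetical, just a sketch of the proposal):
>>
>>     // Today: the context acts as a factory for every node type.
>>     var context = new AudioContext();
>>     var gain = context.createGain();
>>     var delay = context.createDelay();
>>     gain.connect(delay);
>>
>>     // Proposed (hypothetical): nodes are ordinary constructible objects,
>>     // and the context is just the time/routing domain they belong to.
>>     var gain2 = new GainNode(context);
>>     var delay2 = new DelayNode(context);
>>     gain2.connect(delay2);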
>>
>> As far as I understand, the essential meaning of the AudioContext is the notion of time for its nodes (which could be real time for the base AudioContext and virtual time for OfflineAudioContext).
>
> I chatted with Chris Wilson about this last week and he suggests that AudioContext is designed to model some bit of underlying hardware. All of the operations it defines do the impedance matching from one format to something that the proximate hardware can consume directly. But the API here is deeply confused. How do you enumerate the hardware? What hardware is the default? What if there is no audio hardware?

That's exactly the point. There is an AudioContext interface, and no one really knows what its responsibility is or how it works. If we are unable to understand it, then how will developers?
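
For example, as the API stands today (vendor prefixes aside), this is all an author gets:

    var context = new AudioContext();
    context.destination; // whichever output device the UA happened to pick
    // There is no way to enumerate devices, select a different one,
    // or even learn what happens when no audio hardware is present.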

>
>> But I failed to extract from the specification how exactly audio nodes interact with their context. There is no 'tick' event and no global timer, and it seems that all synchronisation and latency problems are somehow hidden inside the AudioContext and audio node implementations. I'm strongly convinced that this is the wrong approach. In my opinion a low-level API should clearly reflect the internal principles of its organization.
>
> I'm not so sold on that. Making it possible for you to send audio to the OS is really the only job that such an API COULD do from the perspective of browser implementers. Doing more than that won't be portable and therefore isn't something we could do in a web API.
>
>> 2. On specification
>>
>> Regretfully, I'm no specialist in 3D gaming or garage-band-like applications, but I see some obvious problems with using the Web Audio API:
>>
>> (a) There is no simple way to convert an AudioContext to an OfflineAudioContext,
>
> Interesting. I think the intuition is that you can use an OfflineAudioContext for bulk processing and use an AudioContext for real-time processing and playback.
>
>> so there is no simple way to upload the prepared composition. If you need to edit a composition in the browser and then save (upload) it, you have to create an OfflineAudioContext and clone every single audio node (which is, again, complicated, as nodes have no "clone" method)
>
> Right, good catch on the lack of serialization. Adding an issue for that in the document.
>
>> from the real-time context and then call startRendering(). My suggestion is (a) to move the startRendering method to the base AudioContext and (b) to remove OfflineAudioContext entirely (which is a bad name in any case, since it is really a NotRealTimeAudioContext, not an Offline one).
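>>
>> For reference, this is roughly what that workflow looks like today (the node graph here is only an example):
>>
>>     // Rebuilding the graph against a second, non-real-time context by hand:
>>     var offline = new OfflineAudioContext(2, 44100 * 180, 44100);
>>     var source = offline.createBufferSource();   // every node re-created...
>>     source.buffer = someBuffer;
>>     source.connect(offline.destination);         // ...and re-connected manually
>>     source.start(0);
>>     offline.oncomplete = function (e) {
>>         var rendered = e.renderedBuffer;          // the result to upload
>>     };
>>     offline.startRendering();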
>
> Yeah, calling it "BulkProcessingContext" might be better. The name really is bad.
>
>> (b) There is a very odd statement in the AudioBuffer interface:
>>
>> " This interface represents a memory-resident audio asset (for one-shot sounds and other short audio clips). Its format is non-interleaved IEEE 32-bit linear PCM with a nominal range of -1 -> +1. It can contain one or more channels. Typically, it would be expected that the length of the PCM data would be fairly short (usually somewhat less than a minute). For longer sounds, such as music soundtracks, streaming should be used with the audio element and MediaElementAudioSourceNode."
>>
>> First, what does "it would be expected that the length of the PCM data would be fairly short" mean? What happens if the data is larger? There is no OutOfMemory exception and no explanation for the limit. How do we expect developers to deal with such an unpredictable constraint? Since we are dealing with binary data without any overhead, why have any limit at all?
>
> I think this is about buffer sizes for real-time processing. The incentive to keep samples short is down to latency for processing them.

The problem is that the Web Audio API spec says nothing about working with larger data, except for "use the <audio> element, Luke".
In my view, working with small data is the minor case and working with large data is the major one, but the whole API seems designed around one-minute samples.
In any case, there MUST be an EXACT limit, with corresponding events and errors, not just "it would be expected that..."
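
For example, nothing in the spec says what this does (the sizes are just an illustration):

    var context = new AudioContext();
    // Roughly an hour of stereo PCM at 44.1 kHz -- far beyond the
    // "somewhat less than a minute" the prose suggests.
    // Does it throw? Fail silently? Exhaust memory? The spec is silent.
    var big = context.createBuffer(2, 44100 * 60 * 60, 44100);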

>
>> Second, the one-minute limit is clearly insufficient for the declared purpose of building 3D games and audio editors in the browser.
>
> Hrm. I think we should take this to their mailing list. I don't understand how people want to use this well enough to know if it's a real hazard.
>
>> Third, falling back to the <audio> element isn't really an option, as we agree that the audio element should be implemented in terms of the Web Audio API, not vice versa.
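>>
>> For reference, that fallback route looks roughly like this:
>>
>>     var el = document.querySelector('audio');            // long track streamed via <audio>
>>     var source = context.createMediaElementSource(el);
>>     source.connect(context.destination);
>>
>> i.e. long material gets routed through the element rather than through an AudioBuffer.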
>>
>> So, in my opinion, the Web Audio API in its current state doesn't provide an appropriate interface for games and audio editors.

-- 
Konstantinov Sergey
Yandex Maps API Development Team Lead
http://api.yandex.com/maps/

Received on Thursday, 25 July 2013 09:35:08 UTC