- From: Alex Russell <slightlyoff@google.com>
- Date: Thu, 25 Jul 2013 11:52:27 -0700
- To: Konstantinov Sergey <twirl@yandex-team.ru>
- Cc: "www-tag@w3.org List" <www-tag@w3.org>
- Message-ID: <CANr5HFU-DytDPLHLM7j-tq=4LdB1Jkmdi8VePCaYS3BhiXJH9w@mail.gmail.com>
On Thu, Jul 25, 2013 at 2:34 AM, Konstantinov Sergey <twirl@yandex-team.ru> wrote:

> 25.07.2013, 04:17, "Alex Russell" <slightlyoff@google.com>:
> > Hi Sergey,
> >
> > On Mon, Jul 22, 2013 at 2:30 AM, Sergey Konstantinov <twirl@yandex-team.ru> wrote:
> >
> >> I read both the Web Audio API specification and Alex's review, and I have some comments on both.
> >>
> >> 1. On the review
> >>
> >> First of all, I strongly support Alex on two points:
> >>
> >> (a) The relationship between the Web Audio API and the <audio> tag. In their current state, both Web Audio and <audio> would just be bindings to some internal low-level audio API. In my opinion that's a bad idea; the Web Audio API design should make it possible to build <audio> on top of it as a pure high-level JavaScript component.
> >>
> >> (b) Audio nodes should be constructible. Let me take an even more extreme position here: the create* methods of the AudioContext interface are simply redundant and should be removed, since there is no use case for inheriting from AudioContext and redefining its factory methods. The create* methods make a mess of the AudioContext interface, as only one of its 18 methods is meaningful while the others are just helpers. That makes it very hard to understand the real responsibility of the AudioContext object.
> >>
> >> As far as I understand, the very meaning of the AudioContext is the notion of time for its nodes (which could be real time for the base AudioContext and virtual time for OfflineAudioContext).
> >
> > I chatted with Chris Wilson about this last week and he suggests that AudioContext is designed to model some bit of underlying hardware. All of the operations it defines do the impedance matching for things from one format to something that the proximate hardware can consume directly. But the API here is deeply confused. How do you enumerate the hardware? What hardware is default? What if there is no audio hardware?
>
> That's the point. There is an AudioContext interface, and no one really knows its responsibility or how it works. If we are unable to understand it, then how would developers do that?

Right. I'm adding a callout for this in the document. Glad we agree = )
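To make the constructibility point concrete, here's a rough sketch of the difference. The factory half is what the current draft specifies; the constructor half is hypothetical (the node constructor names and option dictionaries are invented for illustration, not something the spec defines today):

```js
// What the current draft requires: every node is minted by one of the
// context's create* factory methods.
var ctx = new AudioContext();
var osc = ctx.createOscillator();
var gain = ctx.createGain();
osc.connect(gain);
gain.connect(ctx.destination);
osc.start(0);

// What constructible nodes could look like instead (hypothetical API):
// the context becomes a constructor argument, and AudioContext shrinks
// back toward its real job of owning the timeline and the destination.
var osc2 = new OscillatorNode(ctx, { frequency: 440 });
var gain2 = new GainNode(ctx, { gain: 0.5 });
osc2.connect(gain2);
gain2.connect(ctx.destination);
osc2.start(0);
```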
> >> But I failed to extract from the specification how exactly audio nodes interact with their context. There is no 'tick' event and no global timer, and it seems that all synchronisation and latency problems are somehow hidden inside the AudioContext and audio node implementations. I'm strongly convinced that this is the wrong approach. In my opinion, a low-level API should clearly reflect the internal principles of its organization.
> >
> > I'm not so sold on that. Making it possible for you to send audio to the OS is really the only job that such an API COULD do from the perspective of browser implementers. Doing more than that won't be portable and therefore isn't something we could do in a web API.
> >
> >> 2. On the specification
> >>
> >> Regretfully, I'm no specialist in 3D gaming or garage-band-like applications, but I see some obvious problems in using the Web Audio API.
> >>
> >> (a) There is no simple way to convert an AudioContext to an OfflineAudioContext,
> >
> > Interesting. I think the intuition is that you can use an OfflineAudioContext for bulk processing and use an AudioContext for real-time processing and playback.
> >
> >> so there is no simple way to upload the prepared composition. If you need to edit the composition in the browser and then save (upload) it, you have to create an OfflineAudioContext, clone every single audio node (which is, again, complicated as they have no "clone" methods) from the real-time context, and then call startRendering(). My suggestion is (a) to move the startRendering method to the base AudioContext, and (b) to remove OfflineAudioContext entirely (which is a bad name in any case, since it is really a NotRealTimeAudioContext, not an offline one).
> >
> > Right, good catch on the lack of serialization. Adding an issue for that in the document.
> >
> > Yeah, calling it "BulkProcessingContext" might be better. The name really is bad.
> >
> >> (b) There is a very odd statement in the AudioBuffer interface:
> >>
> >> "This interface represents a memory-resident audio asset (for one-shot sounds and other short audio clips). Its format is non-interleaved IEEE 32-bit linear PCM with a nominal range of -1 -> +1. It can contain one or more channels. Typically, it would be expected that the length of the PCM data would be fairly short (usually somewhat less than a minute). For longer sounds, such as music soundtracks, streaming should be used with the audio element and MediaElementAudioSourceNode."
> >>
> >> First, what does "it would be expected that the length of the PCM data would be fairly short" mean? What will happen if the data is larger? There is no OutOfMemory exception and no explanation for the limit. How do we expect developers to deal with that unpredictable constraint? Since we are dealing with binary data without any overhead, why have any limit at all?
> >
> > I think this is about buffer sizes for real-time processing. The incentive to keep samples short is down to latency for processing them.
>
> The problem is that the Web Audio API spec says nothing about working with larger data, except for "use the <audio> element, Luke". In my view, working with small data is the minor case, and working with large data is the major one. But it seems like the whole API is designed to work with one-minute samples. In any case, there MUST be an EXACT limit constraint and corresponding events and errors, not just "it would be expected that...".

This sounds like a scoping issue. Let's bring it up on their list separately from this document.

> >> Second, the one-minute limit is clearly insufficient for the declared purpose of making 3D games and audio editors in the browser.
> >
> > Hrm. I think we should take this to their mailing list. I don't understand how people want to use this well enough to know whether it's a real hazard.
> >
> >> Third, falling back to the <audio> element isn't really an option, as we agreed that the audio element should be implemented in terms of the Web Audio API, not vice versa.
> >>
> >> So, in my opinion, the Web Audio API in its current state doesn't provide an appropriate interface for games and audio editors.
>
> --
> Konstantinov Sergey
> Yandex Maps API Development Team Lead
> http://api.yandex.com/maps/
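Two sketches to make the later points easier to discuss. On the OfflineAudioContext issue: as I read the current draft, rendering a graph for upload means rebuilding every node against a second, offline context by hand, roughly like this (the decoded buffer and the gain value are made up for illustration):

```js
// Render 30 seconds of stereo output at 44.1kHz without real-time playback.
var offline = new OfflineAudioContext(2, 44100 * 30, 44100);

// Nothing from the real-time AudioContext's graph can be reused here;
// each node has to be re-created against the offline context.
var src = offline.createBufferSource();
src.buffer = recordedBuffer;          // an AudioBuffer decoded earlier (assumed)
var gain = offline.createGain();
gain.gain.value = 0.8;
src.connect(gain);
gain.connect(offline.destination);
src.start(0);

// The draft delivers the result via a completion event on the context.
offline.oncomplete = function (e) {
  var mix = e.renderedBuffer;         // AudioBuffer holding the rendered mix
  // ...encode `mix` (e.g. to WAV) and upload it from here...
};
offline.startRendering();
```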
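And on the AudioBuffer size question, the split the spec seems to intend is "decode short clips fully into memory, stream anything long through a media element", with no stated limit or failure mode in between. Roughly (the asset URLs are hypothetical):

```js
var ctx = new AudioContext();

// Short one-shot clip: fetch and decode the whole thing into an
// in-memory AudioBuffer. The spec only "expects" this to be short;
// it never says what happens when it isn't.
var xhr = new XMLHttpRequest();
xhr.open('GET', '/sfx/click.wav', true);
xhr.responseType = 'arraybuffer';
xhr.onload = function () {
  ctx.decodeAudioData(xhr.response, function (buffer) {
    var shot = ctx.createBufferSource();
    shot.buffer = buffer;
    shot.connect(ctx.destination);
    shot.start(0);
  });
};
xhr.send();

// Long material (a soundtrack): keep it in an <audio> element and feed
// it into the graph through a MediaElementAudioSourceNode instead.
var el = new Audio('/music/soundtrack.ogg');
var music = ctx.createMediaElementSource(el);
music.connect(ctx.destination);
el.play();
```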
Received on Thursday, 25 July 2013 18:53:25 UTC