Re: Web Audio API is now available in Chrome

Hi, having worked only with the Audio Data API so far, but having read the
specification for the Web Audio API, I'll jump in.

On Wed, Feb 2, 2011 at 2:30 AM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:

> On Wed, Feb 2, 2011 at 11:06 AM, Chris Rogers <crogers@google.com> wrote:
> >
> >
> > On Tue, Feb 1, 2011 at 3:38 PM, Silvia Pfeiffer <silviapfeiffer1@gmail.com> wrote:
> >>
> >> >> > * superset of Audio Data API functionality
> >> >>
> >> >> That's an unfair comparison: the Web Audio API is in no way shape or
> >> >> form a superset of the Audio Data API functionality. For one: it
> >> >> doesn't integrate with the Audio() API of the existing <audio>
> >> >> element of HTML5.
> >> >
> >> > When I say "superset" I mean in functionality, not in the actual API
> >> > itself.
> >> >  Put in other words, any application written using the Audio Data API
> >> > should
> >> > be possible to write with the Web Audio API.
> >>
> >> This is what I meant by being unfair: I'm 100% sure that everything
> >> that is possible in the Web Audio API is possible in the Audio Data
> >> API and vice versa. Performance may differ, but the functionality is
> >> possible. Therefore, we should not be using this as an argument for or
> >> against one or the other.
> >
> > I think that performance is a valid argument for using one versus the
> > other.  Using your same logic one could argue that directly manipulating
> > pixels using ImageData is sufficient to get any kind of graphics
> > rendering that is possible in WebGL.
>
> For most image manipulation work you will not need WebGL. We are not
> forcing everyone to use WebGL. Why should we do that with audio?


I don't think the WebGL comparison works that well here. It's actually
simpler to get started with JS audio processing in the Web Audio API than in
the Audio Data API, even though the Web Audio API is the higher-level API;
with graphics it's the other way around: the canvas's native methods are
easier than WebGL, even though WebGL is the higher-level API.
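
To illustrate what I mean by simpler, here's roughly what JS processing
looks like in the Web Audio API (a sketch based on my reading of the draft;
the AudioContext constructor and the createJavaScriptNode() signature are as
I understand them from the spec, and Chrome's build may still prefix them):

    var context = new AudioContext();

    // The engine pulls buffers from us via an event; scheduling and
    // buffering are handled by the implementation.
    var node = context.createJavaScriptNode(1024, 1, 1);
    node.onaudioprocess = function (event) {
      var output = event.outputBuffer.getChannelData(0);
      for (var i = 0; i < output.length; i++) {
        output[i] = (Math.random() * 2 - 1) * 0.1; // quiet white noise
      }
    };
    node.connect(context.destination);

In the Audio Data API you instead set up the element with mozSetup() and
feed samples with mozWriteAudio(), which means you also take care of your
own timing and buffer management.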


>
> >>
> >> > The Web Audio API *does* interact with the <audio> tag.  Please see:
> >> > http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#MediaElementAudioSourceNode-section
> >> > And the diagram and example code here:
> >> > http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#DynamicLifetime-section
> >> > To be fair, I don't have the MediaElementSourceNode implemented yet,
> >> > but I do believe it's an important part of the specification.
> >>
> >> None of this hooks into the <audio> element and the existing Audio()
> >> function of HTML5: see
> >> http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#audio
> >> It creates its own AudioNode() and AudioSourceNode(). This is where
> >> I would like to see an explicit integration with HTML5 and not a
> >> replication of functionality.
> >
> > I'm not sure what your point is.  MediaElementSourceNode has a very
> > direct relationship: it uses an <audio> element.
>
> They are all subclasses of AudioNode(), not of Audio(). You just have
> to look at your implementation examples. There is nowhere an <audio>
> element or a call to the Audio() function (at least not that I could
> find). It's all completely separate from existing audio functionality.


MediaElementSourceNode takes an Audio or Video element as a constructor
argument, if I've understood correctly.
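
Based on the draft's example code, the hookup would look something like this
(a sketch; the spec samples create the node through the context rather than
with a literal constructor, and as Chris noted this part isn't implemented
yet):

    var context = new AudioContext();
    var audioElement = new Audio('song.ogg'); // a plain HTML5 Audio element

    // Wrap the element as a source node, so its decoded output flows
    // through the processing graph instead of straight to the hardware.
    var source = context.createMediaElementSource(audioElement);

    var gainNode = context.createGainNode();
    gainNode.gain.value = 0.5; // attenuate by half
    source.connect(gainNode);
    gainNode.connect(context.destination);

    audioElement.play();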


>
> >> >>
> >> >> And finally, the Web Audio API only
> >> >> implements a certain set of audio manipulation functions in C/C++ -
> if
> >> >> a developer needs more flexibility, they have to use the JavaScript
> >> >> way here, too.
> >> >
> >> > This is true, but I think the set of functions will be useful in a
> >> > large set of applications.  They can use custom JavaScript processing
> >> > in special cases.
> >>
> >>
> >> There is no doubt. I agree that these functions are useful and it will
> >> be very important to have them in C/C++ and be able to build a filter
> >> graph. What I'm trying to achieve is fairness in the discussion
> >> between the two APIs and the realization that both approaches are
> >> important to achieve.
> >
> > I agree that both approaches are important, but I think you're unfairly
> > glossing over the concrete work I've done to integrate the "processing
> > directly in JavaScript" paradigm into the Web Audio API.  Both approaches
> > are important and useful and I have working demos illustrating both
> > types of processing.
>
> No I'm not glossing over it. It's that piece of work that has actually
> made the functionality equal in both APIs. But I do think there is
> room for improvement of the API and integration with HTML5.


Agreed, there is always room for improvement.


>
> >> >>
> >> >> In my opinion, the difference between the Web Audio API and the Audio
> >> >> Data API is very similar to the difference between SVG and Canvas.
> >> >> The Web Audio API is similar to SVG in that it provides "objects"
> >> >> that can be composed together to create a presentation. The Audio
> >> >> Data API is similar to Canvas in that it provides pixels to
> >> >> manipulate. Both have their use cases and community. So, similarly,
> >> >> I would hope that we can get both audio APIs into HTML5.
> >> >
> >> > I've tried to incorporate the features of the Audio Data API into
> >> > the Web Audio API with the introduction of JavaScriptAudioNode and
> >> > MediaElementAudioSourceNode.  So, in a sense I believe we already
> >> > have the required features which you desire.
> >>
> >> Working with the API I have felt it clunky and not quite integrated
> >> with the existing HTML5 specification yet, when in contrast the Audio
> >> Data API has extended the Audio() element with a few extra fields and
> >> an event to make it all happen. I believe there would be a better way
> >> to take a similar approach where we don't actually need an explicit
> >> AudioContext() because the Audio() element already creates one. That
> >> would make the API a lot more elegant and would remove some
> >> replication.
> >
> > Well, I can't argue with your personal opinion about how the API felt
> > to you :)
> > But, I don't think that everything audio-related needs to be jammed
> > into the <audio> tag.  Its API was not designed from the ground up to
> > handle these more advanced use cases.  There is a whole pantheon of
> > graphics-related DOM elements and APIs serving different purposes.
> > They don't all have to be intimately involved with an <img> tag.
>
> Agreed. It should be carefully looked at what the best API design is
> and whether it makes sense to hook into existing features or replicate
> them. For graphics, it made sense to replicate pixel handling in a
> different element.
>
> > Similarly, I don't believe everything audio-related needs to be pushed
> > into the <audio> tag which was, after all, designed explicitly for
> > audio streaming.
>
> No I don't think that's the case. Audio() has been created for
> displaying audio content on Web pages, no matter where it comes from.
> The Audio Data API has in fact proven that it can be easily extended
> to also deal with sound input and output on a sample level.
>

That is true, but we shouldn't do something just because we can. Just as the
Video element doesn't contain separate Audio elements for its audio tracks,
I believe the AudioContext is the right place for this API: it interacts
with both Video and Audio and doesn't belong inside either, just as Canvas
doesn't belong inside Video or Image. I don't think we want to clutter up
the specifications and slow down the standardization process by forcing such
a merge.


>
> > Believe me, I've looked carefully at the <audio> API and believe I've
> > achieved a reasonable level of integration with it through the
> > MediaElementSourceNode.  It's practical and makes sense to me.  I think
> > this is just one area where we might disagree.
>
> Maybe. But good design comes from trying to discuss the advantages and
> disadvantages of different approaches and I must admit I have not seen
> much discussion here about possible alternative design approaches. I'd
> like to encourage the group to keep an open mind and experiment with
> possible other viewpoints and design approaches.
>

Spot on. I would also encourage anyone planning to try out the Web Audio API
to try out the Audio Data API as well; I'm personally a huge fan of both
David Humphrey's and Chris' work.


>
> Regards,
> Silvia.
>
>
Now that that's all off my chest, I'd like to open a discussion about
something I deem important for the future of audio web apps. Currently,
neither of the existing APIs has a feature that allows one to easily and
cleverly save the audio generated. And by cleverly, I mean not storing the
samples in an array and outputting a WAV data URL; the encoders/decoders are
already there in the browser, and I'd like them to be usable as well.
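
For the record, the array-and-WAV-data-URL workaround looks roughly like
this; the sketch is mine (mono, 16-bit PCM), and it's exactly the
boilerplate I'd like a native recording facility to make unnecessary:

    // Hand-build a RIFF/WAVE file around the captured samples and
    // base64 it into a data URL. samples is an array of floats in [-1, 1].
    function samplesToWavDataUrl(samples, sampleRate) {
      var numChannels = 1, bytesPerSample = 2;
      var dataSize = samples.length * bytesPerSample;
      var bytes = [];

      function writeString(s) {
        for (var i = 0; i < s.length; i++) bytes.push(s.charCodeAt(i));
      }
      function writeUint32(v) {
        bytes.push(v & 0xFF, (v >> 8) & 0xFF, (v >> 16) & 0xFF, (v >> 24) & 0xFF);
      }
      function writeUint16(v) {
        bytes.push(v & 0xFF, (v >> 8) & 0xFF);
      }

      writeString('RIFF'); writeUint32(36 + dataSize); writeString('WAVE');
      writeString('fmt '); writeUint32(16); // PCM header size
      writeUint16(1);                       // format: linear PCM
      writeUint16(numChannels);
      writeUint32(sampleRate);
      writeUint32(sampleRate * numChannels * bytesPerSample); // byte rate
      writeUint16(numChannels * bytesPerSample);              // block align
      writeUint16(8 * bytesPerSample);                        // bits per sample
      writeString('data'); writeUint32(dataSize);

      for (var i = 0; i < samples.length; i++) {
        var s = Math.max(-1, Math.min(1, samples[i])) * 0x7FFF; // clamp & scale
        writeUint16(s < 0 ? s + 0x10000 : s); // 16-bit two's complement
      }

      var binary = '';
      for (var j = 0; j < bytes.length; j++) binary += String.fromCharCode(bytes[j]);
      return 'data:audio/wav;base64,' + btoa(binary);
    }

And that only gets you uncompressed PCM; for Ogg or MP3 output you'd be
re-implementing in JS an encoder the browser already ships.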

Something like canvas's toDataURL() is needed, and it should take arguments
for output format preferences. I haven't looked at Chris' source code very
closely, but I don't see introducing something like a recording interface as
being too complicated. An idea I've been playing with is that the Audio
element would have the mentioned toDataURL() and the AudioContext would have
something like createNewRecordingInterface() to create an interface with
methods such as record(), stop() and save(), of which the latter would
create an Audio element that could then be handled in whatever way wanted
(played back as a normal Audio element, or extracted as a data URL).
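
In code, the idea would look something like this (every method here is
hypothetical, of course, and someSourceNode stands for any node in the
graph):

    var context = new AudioContext();

    // Hypothetical: a recording interface that captures whatever
    // is connected to it.
    var recorder = context.createNewRecordingInterface();
    someSourceNode.connect(recorder);
    someSourceNode.connect(context.destination); // still audible while recording

    recorder.record(); // start capturing
    // ... let the graph run for a while ...
    recorder.stop();   // stop capturing

    // save() would hand back an ordinary Audio element, encoded natively
    // according to the given format preferences.
    var recording = recorder.save({ type: 'audio/ogg' });
    recording.play();                           // play back like any Audio element
    var url = recording.toDataURL('audio/ogg'); // or extract as a data URL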

Best Regards,
Jussi Kalliokoski
Web Developer,
Aldebaran
