Re: Web Audio API is now available in Chrome

On 02/02/2011 10:06 AM, Jussi Kalliokoski wrote:
> On Wed, Feb 2, 2011 at 9:27 AM, Silvia Pfeiffer
> <silviapfeiffer1@gmail.com <mailto:silviapfeiffer1@gmail.com>> wrote:
>
>     On Wed, Feb 2, 2011 at 4:55 PM, Jussi Kalliokoski
>     <jussi.kalliokoski@gmail.com <mailto:jussi.kalliokoski@gmail.com>>
>     wrote:
>      > Hi, having worked only with the Audio Data API so far, but having
>      > read the specification for the Web Audio API, I'll jump in.
>      >
>      > On Wed, Feb 2, 2011 at 2:30 AM, Silvia Pfeiffer
>     <silviapfeiffer1@gmail.com <mailto:silviapfeiffer1@gmail.com>>
>      > wrote:
>      >>
>      >> On Wed, Feb 2, 2011 at 11:06 AM, Chris Rogers
>     <crogers@google.com <mailto:crogers@google.com>> wrote:
>      >> >
>      >> >
>      >> >> > The Web Audio API *does* interact with the <audio> tag.
>       Please see:
>      >> >> >
>      >> >> >
>      >> >> >
>     http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#MediaElementAudioSourceNode-section
>      >> >> > And the diagram and example code here:
>      >> >> >
>      >> >> >
>      >> >> >
>     http://chromium.googlecode.com/svn/trunk/samples/audio/specification/specification.html#DynamicLifetime-section
>      >> >> > To be fair, I don't have the MediaElementSourceNode implemented
>      >> >> > yet, but I do believe it's an important part of the specification.
>      >> >>
>      >> >> None of this hooks into the <audio> element and the existing
>     Audio()
>      >> >> function of HTML5: see
>      >> >>
>      >> >>
>      >> >>
>     http://www.whatwg.org/specs/web-apps/current-work/multipage/video.html#audio
>      >> >> . It creates its own AudioNode() and  AudioSourceNode(). This
>     is where
>      >> >> I would like to see an explicit integration with HTML5 and not a
>      >> >> replication of functionality.
>      >> >
>      >> > I'm not sure what your point is.  MediaElementSourceNode has a very
>      >> > direct relationship with <audio>: it uses an <audio> element.
>      >>
>      >> They are all subclasses of AudioNode(), not of Audio(). You just have
>      >> to look at your implementation examples. Nowhere is there an <audio>
>      >> element or a call to the Audio() function (at least not that I could
>      >> find). It's all completely separate from existing audio functionality.
>      >
>      > MediaElementSourceNode takes Audio or Video elements as a constructor
>      > argument, if I've understood correctly.
>
>
>     I wonder how this should work. I haven't seen an example, even though I
>     have created example programs with both APIs. Maybe Chris can provide
>     some example code so it becomes clear.
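
My reading of the draft is that, once MediaElementSourceNode is implemented,
hooking it up would look roughly like this (untested sketch; the exact factory
method name is my guess from the draft, and the AudioContext constructor may be
vendor-prefixed in Chrome):

    var audioElement = new Audio("music.ogg");   // an ordinary HTML5 Audio()
    var context = new AudioContext();            // possibly webkit-prefixed
    // wrap the media element as a source node and route it through the graph
    var source = context.createMediaElementSource(audioElement);
    var gain = context.createGainNode();
    source.connect(gain);
    gain.connect(context.destination);
    audioElement.play();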
>
>
>      >> > Similarly, I don't believe everything audio-related needs to be
>      >> > pushed into the <audio> tag which was, after all, designed
>      >> > explicitly for audio streaming.
>      >>
>      >> No, I don't think that's the case. Audio() has been created for
>      >> playing audio content on Web pages, no matter where it comes from.
>      >> The Audio Data API has in fact proven that it can easily be extended
>      >> to also deal with sound input and output at the sample level.
>      >
>      > That is true, but we shouldn't do something just because we could.
>      > Just as the Video element doesn't have separate Audio elements inside
>      > it for audio, I humbly believe that the AudioContext is the right
>      > place for this API, since it interacts with both Video and Audio and
>      > does not belong as a part of either, just as Canvas doesn't belong in
>      > Video or Image. I don't think we want to clutter up the specifications
>      > and slow down the standardization process by forcing such integration.
>
>     Are you aware that you can use the Audio Data API both for <audio> and
>     <video> elements? Also, I don't think that would slow down the
>     standardization process - in fact, the big question will be why one
>     interface has managed to hook into existing elements, while
>     another needs a completely separate and JavaScript-only API. You could
>     almost say that the Web Audio API doesn't use any HTML at all and
>     therefore doesn't actually need to go into the HTML spec.
>
>
> Yes, and I partly agree: being able to bind processing events to existing
> Audio and Video elements is very handy, and the Audio Data API's approach
> to this is very straightforward, sensible and usable. It is true that more
> integration of this kind is in order.
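
For comparison, hooking per-buffer processing onto an existing element with
the Audio Data API looks roughly like this (from memory, untested;
processSamples is a placeholder for whatever DSP you want to run):

    var audio = document.querySelector("audio");
    audio.addEventListener("MozAudioAvailable", function (e) {
      // e.frameBuffer holds the decoded samples for this buffer,
      // e.time is the timestamp of its first sample in seconds
      processSamples(e.frameBuffer, e.time);
    }, false);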
>
>
>
>      >> > Believe me, I've looked carefully at the <audio> API and believe
>      >> > I've achieved a reasonable level of integration with it through the
>      >> > MediaElementSourceNode.  It's practical and makes sense to me.  I
>      >> > think this is just one area where we might disagree.
>      >>
>      >> Maybe. But good design comes from trying to discuss the
>     advantages and
>      >> disadvantages of different approaches and I must admit I have
>     not seen
>      >> much discussion here about possible alternative design
>     approaches. I'd
>      >> like to encourage the group to keep an open mind and experiment with
>      >> possible other viewpoints and design approaches.
>      >
>      > Spot on. I would also encourage anyone planning to try out the Web
>      > Audio API to also try out the Audio Data API; I am personally a huge
>      > fan of both David Humphrey's and Chris' work.
>
>     Couldn't agree more. I would also like to see proof of the claims that
>     latency is a problem in one interface and not the other on all major
>     OS platforms, so I am looking forward to seeing the Windows and Linux
>     releases of Google Chrome using the Web Audio API.
>
>
> Having tried the Audio Data API on multiple platforms, I have to admit
> that the latency point is quite valid. More complex things, such as my
> experiment with modular synthesis using the Audio Data API, run very
> poorly on most older laptops (older being more than 2 years old) and mini
> laptops. However, this is partly due to DOM and drawing operations
> taking priority over the audio processing, and as these speed up, the
> results will get better.

This is a reason for https://bugzilla.mozilla.org/show_bug.cgi?id=615946
(in those cases the audio processing could happen in a background thread).
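
Until something like that is available, one way to keep at least the heavy
DSP off the main thread is to forward the buffers from a MozAudioAvailable
handler, like the one sketched above, to a Worker (untested sketch; heavyDSP
is a placeholder name):

    // main page: forward each sample buffer to a background Worker
    var worker = new Worker("dsp-worker.js");
    audio.addEventListener("MozAudioAvailable", function (e) {
      // convert the Float32Array to a plain Array in case typed arrays
      // can't be structured-cloned yet
      worker.postMessage({ samples: Array.prototype.slice.call(e.frameBuffer),
                           time: e.time });
    }, false);

    // dsp-worker.js: do the expensive work off the main thread
    onmessage = function (e) {
      var result = heavyDSP(e.data.samples);   // e.g. filters, FFT
      postMessage(result);
    };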



> The audio handling by itself isn't really that slow; in fact, the most
> common audio processing methods, such as filters and FFTs, have proven to
> differ only slightly in performance from C++ code doing the same thing. Of
> course, it's impossible for JS to beat optimized assembly in performance;
> no one can argue with that. However, on mobile devices, such as my HTC
> Desire HD, which is relatively fast, the JS audio processing performance
> on Fennec is absolutely horrid; we can't even use the word performance.
> Even something as simple as a square test tone produces awful clicks and
> pops constantly, if it plays at all.
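
For the record, the kind of square test tone meant here takes only a few
lines with the Audio Data API (sketch from memory, untested; a robust version
would check mozWriteAudio's return value and refill the buffer as needed):

    var output = new Audio();
    output.mozSetup(1, 44100);                // mono, 44.1 kHz
    var samples = new Float32Array(44100);    // one second of audio
    for (var i = 0; i < samples.length; i++) {
      // flip sign every 50 samples -> a 441 Hz square wave
      samples[i] = (Math.floor(i / 50) % 2) ? 0.5 : -0.5;
    }
    output.mozWriteAudio(samples);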
>
> However, we should be hopeful, for our prayers will soon be answered as
> the processing power of computers is likely to multiply due to graphene
> innovations. But we should still make things as performant as possible;
> having 300 times more power doesn't mean the programs we make should use
> 300 times more power (a silly thing to say, we all know this: "640K ought
> to be enough for anybody").
>
>
>      > Now that that's all off my chest, I'd like to open a discussion about
>      > something I deem important for the future of audio web apps.
>      > Currently, I haven't seen either of the existing APIs offer a feature
>      > that allows one to easily and cleverly save the audio generated. And
>      > by clever, I mean not storing samples in an array and outputting a wav
>      > data url; the encoders/decoders are already there, and I'd like for
>      > them to be usable as well. Something like canvas's toDataURL() is
>      > needed, and it should take arguments for output format preferences. I
>      > haven't looked at Chris' source code very closely, but I don't see
>      > introducing something like a recording interface as too complicated.
>      > An idea I've been playing with is that the Audio element would have
>      > the mentioned toDataUrl() and AudioContext would have something like
>      > createNewRecordingInterface() to create an interface with methods such
>      > as record(), stop() and save(), of which the latter would create an
>      > Audio element which could then be handled in whatever way is wanted
>      > (played back as a normal Audio element, or extracted as a data url).
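
Just to make that proposal concrete, usage might look something like this
(purely hypothetical; none of these methods exist in either API today):

    var recorder = context.createNewRecordingInterface(someSourceNode);
    recorder.record();
    // ... generate or play audio for a while ...
    recorder.stop();
    var recording = recorder.save();                // an Audio element
    var url = recording.toDataUrl("audio/ogg");     // format preference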
>
>     That's a very good point. However, I always thought that the device
>     element is meant for such things, see
>     http://www.whatwg.org/specs/web-apps/current-work/complete/commands.html#devices
>     . There is a Stream API which allows you to provide a URL and a
>     record() and stop() function. Maybe that can be hooked up to saving
>     audio and video data? A "toDataUrl()" function on audio and video
>     content would be nice, too, though. That way you could even implement
>     screencasting easily.
>
>
> Yes, I've been eavesdropping on that, and it seems to be the planned
> direction, but I think we need both; we also want a way that doesn't
> require the use of the device element. The simpler we make simple
> things, the better.
>
>
>      > Best Regards,
>      > Jussi Kalliokoski
>      > Web Developer,
>      > Aldebaran
>
>     Silvia.
>
>

Received on Wednesday, 2 February 2011 09:17:19 UTC