Re: Adding Web Audio API Spec to W3C Repository from Paul Bakaus on 2011-06-14 (public-audio@w3.org from April to June 2011)

From: Paul Bakaus <pbakaus@zynga.com>
Date: Tue, 14 Jun 2011 02:39:23 -0700
To: Doug Schepers <schepers@w3.org>, "robert@ocallahan.org" <robert@ocallahan.org>, "public-audio@w3.org" <public-audio@w3.org>
CC: Francois Daoust <fd@w3.org>, Philippe Le Hegaret <plh@w3.org>, "Michael(tm) Smith" <mike@w3.org>, Dan Burnett <dburnett@voxeo.com>, Dominique Hazael-Massieux <dom@w3.org>, Tobie Langel <tobie@fb.com>, Christoph Martens <cmartens@zynga.com>
Message-ID: <CA1CF7A8.9AFF%pbakaus@zynga.com>
Hi Doug,

Looping in Christoph Martens, who is working on our audio implementations.

Our focus is very much on real-world problems right now: we want to be
able to play, pause, stop and cache sounds. That's all for now. We'd be
super happy if we had this today, but we don't. I'm letting Chris chime
in on future sound specs, but to me, it's very clear we need tighter
control over device-dependent implementations, i.e., an explicit section
on media caching ("This spec is only 100% implemented when audio media is
properly cached in cache manifests").

Thanks,
Paul

On 10.06.11 18:27, "Doug Schepers" <schepers@w3.org> wrote:

>Hi, ROC-
>
>(apologies in advance for the long, rambling email)
>
>Robert O'Callahan wrote (on 6/9/11 5:46 PM):
>> We (Mozilla) definitely plan to put forward a new spec that builds on
>> the HTML Streams proposal. I would like to make more progress on the
>> implementation before we do that, but if you think otherwise, we can go
>> forward.
>
>I'm torn.  On the one hand, I want to "get it right", which means
>implementation experience.  On the other, if we are to have a reasoned
>counterpoint to Chris Rogers' fairly mature spec and implementation, we
>need to have a starting point for technical comparisons and conversations.
>
>I think I would prefer to see some spec text, even knowing that it might
>change dramatically during implementation; that said, I acknowledge that
>that is extra work, though it may be a useful step to you as
>implementers to solidify your scope.
>
>
>> I believe the concerns I raised in the earlier thread about
>> synchronization and the relationship with the Streams proposal are
>> still valid, but that thread was a tennis match between me and Chris,
>> and I'd like to hear from W3C people and other parties (especially
>> HTML and Streams people) how they feel about those concerns.
>
>I read the conversation [1] with interest, but didn't feel qualified to
>respond on more than a superficial level.  I agree with you that having
>compatible integration of these various use cases is a strong goal, but
>I wasn't convinced that they needed to be merged per se; I was more
>convinced by the argument that there should be hooks between them, but
>that they should be developed as stand-alone APIs, both to better fit
>their own requirements and audiences, and to allow them to be developed
>and extended at their own paces (not least for speedy progress towards a
>first Recommendation that is widely implemented, so we can move on to v2
>with more author experience).
>
>For broader review and discussion, I've CCed Francois Daoust (staff
>contact for the Real-Time Communications WG), Dom Hazael-Massieux (staff
>contact for the Device APIs WG), Mike Smith (staff contact for the HTML
>WG), and Philippe Le Hégaret (Interaction Domain Lead); I've also CCed
>Dan Burnett, co-chair of the Voice Browser WG (which does VoiceXML) and
>chair of the HTML Speech Incubator Group.  I've also CCed Paul Bakaus of
>Zynga and Tobie Langel of Facebook, who have an interest in audio for
>HTML5 games and user interfaces.
>
>I'd like to solicit their opinions, and to suggest that they cast
>about for people in their own groups or circles who could chime in on a
>more technical level about the audio and media streams, and the
>relationship between the RTC and audio manipulation use cases.
>
>
>> As a veteran of decade-long efforts to resolve conflicts between specs
>> that never should have happened in the first place (SVG, CSS and HTML,
>> I'm looking at you), I think it's worth taking time to make sure we
>> don't have another Conway's law failure.
>
>I am with you there.  As you know, I have always advocated for closer
>integration of these technologies, from the failed effort in the
>Compound Documents WG to the more successful FX Task Force.  I don't
>think that this is the problem here... there are plenty of people
>talking to one another, and genuine open minds, but there is also a
>reasonable technical case for a degree of separation.
>
>
>> The immediate demand for audio
>> API will have to be (and is being) satisfied by libraries that abstract
>> over browser differences, and that will remain true for quite some time
>> no matter what the WG does.
>
>I can read this two ways (I don't know which way you meant it... maybe
>some third way?):
>
>1) "Script libraries will help build audience-appropriate abstraction
>layers that make whatever the Audio WG does better fit the needs of that
>particular audience"; or
>
>2) "We are going to do our own audio API our way, and ignore what is
>being done by the Audio WG or the other implementers."
>
>I agree with the motivations behind #1, but am concerned about the
>sentiments behind #2.  Having a single API that enjoys consensus by
>different implementers across platforms and devices is strong motivation
>for other implementers to get on board, and makes it easier for them to
>justify their investment of energy, because the benefits to developers
>and users are profound.  Having divergent and competing APIs might seem
>like an evolutionarily sound "survival of the fittest" approach, but I'm
>not convinced that it will produce a timely best-of-breed hybrid that
>could be achieved by simply bringing in the right stakeholders and
>learning from their experience...  it also seems like a Conway's Law
>failure at an inter-organizational level.
>
>
>Regarding the script library emphasis, I know that Corban Brook's
>audionode.js and Grant Galitz's XAudioJS are good (if incomplete)
>emulation layers, but do we have benchmarks that show how well they
>scale to multiple audio instances (such as used in games)?  I can't
>shake the feeling that native code will continue to outperform emulation
>via script, and I'd be more comfortable if we had some hard data.
>
>
>[1] http://lists.w3.org/Archives/Public/public-audio/2011AprJun/0004.html
>
>Regards-
>-Doug Schepers
>W3C Staff Contact, SVG, WebApps, Web Events, and Audio WGs
Received on Tuesday, 14 June 2011 09:40:08 UTC