Re: Adding Web Audio API Spec to W3C Repository

Hi, ROC-

(apologies in advance for the long, rambling email)

Robert O'Callahan wrote (on 6/9/11 5:46 PM):
> We (Mozilla) definitely plan to put forward a new spec that builds on
> the HTML Streams proposal. I would like to make more progress on the
> implementation before we do that, but if you think otherwise, we can go
> forward.

I'm torn.  On the one hand, I want to "get it right", which means 
implementation experience.  On the other, if we are to have a reasoned 
counterpoint to Chris Rogers' fairly mature spec and implementation, we 
need to have a starting point for technical comparisons and conversations.

I think I would prefer to see some spec text, even knowing that it might 
change dramatically during implementation; that said, I acknowledge that 
this is extra work, though it may be a useful step for you as 
implementers in solidifying your scope.


> I believe the concerns I raised about synchronization and the
> relationship with the Streams proposal that I raised in the earlier
> thread are still valid, but that thread was a tennis match between me
> and Chris and I'd like to hear from W3C people and other parties
> (especially HTML and Streams people) how they feel about those concerns.

I read the conversation [1] with interest, but didn't feel qualified to 
respond on more than a superficial level.  I agree with you that having 
compatible integration of these various use cases is a strong goal, but 
I wasn't convinced that they needed to be merged per se; I was more 
convinced by the argument that there should be hooks between them, but 
that they should be developed as stand-alone APIs, both to better fit 
their own requirements and audiences, and to allow them to be developed 
and extended at their own paces (not least for speedy progress towards a 
first Recommendation that is widely implemented, so we can move on to v2 
with more author experience).

For broader review and discussion, I've CCed Francois Daoust (staff 
contact for the Real-Time Communications WG), Dom Hazael-Massieux (staff 
contact for the Device APIs WG), Mike Smith (staff contact for the HTML 
WG), and Philippe Le Hégaret (Interaction Domain Lead); I've also CCed 
Dan Burnett, co-chair of the Voice Browser WG (which does VoiceXML) and 
chair of the HTML Speech Incubator Group.  I've also CCed Paul Bakaus of 
Zynga and Tobie Langel of Facebook, who have an interest in audio for 
HTML5 games and user interfaces.

I'd like to solicit their opinions, and to suggest that they cast about 
for people in their own groups or circles who could chime in on a more 
technical level about audio and media streams, and about the 
relationship between the RTC and audio-manipulation use cases.


> As a veteran of decade-long efforts to resolve conflicts between specs
> that never should have happened in the first place (SVG, CSS and HTML,
> I'm looking at you), I think it's worth taking time to make sure we
> don't have another Conway's law failure.

I am with you there.  As you know, I have always advocated for closer 
integration of these technologies, from the failed effort in the 
Compound Documents WG to the more successful FX Task Force.  I don't 
think that this is the problem here... there are plenty of people 
talking to one another, and genuine open minds, but there is also a 
reasonable technical case for a degree of separation.


> The immediate demand for audio
> API will have to be (and is being) satisfied by libraries that abstract
> over browser differences, and that will remain true for quite some time
> no matter what the WG does.

I can read this two ways (I don't know which way you meant it... maybe 
some third way?):

1) "Script libraries will help build audience-appropriate abstraction 
layers that make whatever the Audio WG does better fit the needs of that 
particular audience"; or

2) "We are going to do our own audio API our way, and ignore what is 
being done by the Audio WG or the other implementers."

I agree with the motivations behind #1, but am concerned about the 
sentiments behind #2.  Having a single API that enjoys consensus among 
different implementers across platforms and devices is a strong 
motivation for other implementers to get on board, and makes it easier 
for them to justify their investment of energy, because the benefits to 
developers and users are profound.  Having divergent and competing APIs 
might seem like an evolutionarily sound "survival of the fittest" 
approach, but I'm not convinced that it will produce a best-of-breed 
hybrid as quickly as simply bringing in the right stakeholders and 
learning from their experience would...  it also seems like a Conway's 
Law failure at an inter-organizational level.


Regarding the script-library emphasis, I know that Corban Brook's 
audionode.js and Grant Galitz's XAudioJS are good (if incomplete) 
emulation layers, but do we have benchmarks that show how well they 
scale to multiple simultaneous audio instances (as used in games)?  I 
can't shake the feeling that native code will continue to outperform 
emulation via script, and I'd be more comfortable if we had some hard 
data.
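
To be concrete about the kind of data I mean, here is a rough sketch of 
a micro-benchmark (purely illustrative JavaScript, not based on either 
library; the block size, sample rate, and function names are just 
placeholders): it mixes N synthetic sources into one output block 
entirely in script, and reports how much of the real-time budget for 
that block is spent doing so.

  var SAMPLE_RATE = 44100;
  var BLOCK_SIZE  = 4096;                      // samples per callback
  var BUDGET_MS   = (BLOCK_SIZE / SAMPLE_RATE) * 1000;

  function makeSource(freq) {
    // Pre-render one block of a sine tone to stand in for a decoded asset.
    var buf = new Float32Array(BLOCK_SIZE);
    for (var i = 0; i < BLOCK_SIZE; i++) {
      buf[i] = Math.sin(2 * Math.PI * freq * i / SAMPLE_RATE);
    }
    return buf;
  }

  function benchmark(numSources) {
    var sources = [];
    for (var s = 0; s < numSources; s++) {
      sources.push(makeSource(220 + s * 10));
    }
    var out = new Float32Array(BLOCK_SIZE);
    var blocks = 100;
    var start = Date.now();
    for (var b = 0; b < blocks; b++) {
      for (var i = 0; i < BLOCK_SIZE; i++) { out[i] = 0; }   // clear the mix bus
      for (var j = 0; j < sources.length; j++) {
        var src = sources[j];
        for (var k = 0; k < BLOCK_SIZE; k++) { out[k] += src[k]; }
      }
    }
    var msPerBlock = (Date.now() - start) / blocks;
    return msPerBlock / BUDGET_MS;   // fraction of the real-time budget used
  }

Running something like benchmark(8), benchmark(32), and benchmark(64) on 
a few devices would tell us roughly where script-only mixing stops 
keeping up with the audio callback, which is the sort of hard data I'd 
like to see.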


[1] http://lists.w3.org/Archives/Public/public-audio/2011AprJun/0004.html

Regards-
-Doug Schepers
W3C Staff Contact, SVG, WebApps, Web Events, and Audio WGs