
Re: Proposal for fixing race conditions

From: K. Gadd <kg@luminance.org>
Date: Thu, 18 Jul 2013 05:00:07 -0700
Message-ID: <CAPJwq3VcsPZJUS_fwmc8PbtSu1b4rTiamYrZ2HieoonzMzrspA@mail.gmail.com>
To: Chris Rogers <crogers@google.com>
Cc: Srikumar Karaikudi Subramanian <srikumarks@gmail.com>, Jer Noble <jer.noble@apple.com>, Olivier Thereaux <Olivier.Thereaux@bbc.co.uk>, WG <public-audio@w3.org>
I feel I have to take issue with this quote in particular, even though I
don't necessarily disagree with the conclusion you're drawing here:

"As the API is designed and has been used for over 2 years, these calling
patterns are not used and so simply are not an issue.  We do have
substantial developer experience to support this view, and these developers
come from a wide range of backgrounds and experience levels from complete
novices playing with audio for the first time, all the way to seasoned
professional audio developers."

If two years of use and experience were sufficient, one could have assumed
that HTML would never contain images (IIRC <img> was added 2-3 years after
the original development/specification of HTML). The web platform has
evolved in (often unpredictable) steps over time and will continue to
evolve. I do not think it is reasonable to argue that 2 years of the Web
Audio API's availability - and we should be clear that in this context
'availability' simply means 'we shipped a prefixed version of an API
resembling this one, in one browser' - is sufficient to identify and
predict any potential issues with the specification, in any area. In an
area like thread safety (not to mention developer friendliness,
predictability of results, etc.) I think additional caution is always
merited.

If you simply wish to argue that the effects of these races can be proven
to always be limited to corrupt audio output, that diminishes the risk
significantly, so we're probably fine. However, is corrupt audio output
actually okay? Out of all the users who've been using the API for 2 years,
how many of them would consider it acceptable if end-users heard garbled
noise on some machines/configurations or in certain use cases? As I pointed
out, the presence of OfflineAudioContext effectively guarantees that at
some point a developer will use Web Audio to generate samples for offline
use, and rely on the quality of the results. At that point, I think it is
impossible to argue that corrupt audio is acceptable, especially if it is
intermittent depending on thread timing. The <audio> implementation
problems Chrome has had in the past serve as a useful guidepost here, if we
observe the way developers and end-users reacted to the corrupt audio
produced in those situations.
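
To make the failure mode concrete, here is a sketch using plain Float32Arrays standing in for the two possible AudioBuffer semantics (no real AudioContext is involved; `sharedBuffer` and `copiedBuffer` are illustrative names, not API): if the implementation keeps a reference to the caller's array, a write from the main thread after assignment is visible to the rendering side; if it snapshots at assignment time, it is not.

```javascript
// Illustrative sketch: reference semantics vs. copy semantics for sample data.
// All values chosen to be exactly representable in float32.
const samples = new Float32Array([0.5, 0.25, 0.125]);

// Reference semantics (the racy design): the "node" sees the caller's array.
const sharedBuffer = samples;

// Copy semantics (the race-free design): snapshot at assignment time,
// the same effect as Jer's buffer.slice(0) workaround.
const copiedBuffer = samples.slice(0);

// The main thread mutates the array after handing it over...
samples[0] = 0.75;

// ...and only the shared version observes the mutation.
console.log(sharedBuffer[0]); // 0.75 -- the write leaks into "playback"
console.log(copiedBuffer[0]); // 0.5  -- the snapshot is unaffected
```

Whether real output is garbled then depends on whether the rendering thread happens to read before or after the write, which is exactly the intermittent, timing-dependent behavior described above.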

The question isn't 'will these races cause any harm'; at this point I think
we know what kind of harm they might cause. And likewise there is no
question *whether* memcpy has overhead; we know it does. The questions are
more like:

What sacrifices do we have to make to eliminate these races? Are those
sacrifices worth it? Can they be alleviated over time?

What sacrifices are we implicitly making if we leave these races in the
API? Can we predict all those sacrifices, by perfectly anticipating every
future use case of this API, 5-10 years in the future?

What sacrifices will we have to make if our goal is for Web Audio to be as
fast as possible? Will those sacrifices prevent us from making it safe to
use, easy to use, or reliable?

Keeping in mind the future, we also need to evaluate whether our answers
make sense for platforms we might not be paying attention to right now -
like Mobile Firefox and Chrome for Android, or perhaps some other device
types that have even more significant resource limits - Google Glass? Smart
watches? Home automation devices?

I should also point out that if you give developers an API full of race
conditions, it can be much harder for them to ensure that they will not hit
the races. They don't have access to the internals and they can't put locks
in the right places - they end up, as I have, tiptoeing around the parts of
the internals they understand, trying to do things in particular orders in
the hope of being safe from races. That is not a good place to be,
especially given that the implementation details of a given browser can
change out from under them with a single update.
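
The discipline developers are forced into looks something like the following sketch (`makeHandoff` is an invented helper, not part of any API): finish every write to the sample data before handing it to the API, then drop all other references so nothing can mutate it afterwards - and hope the implementation's timing cooperates.

```javascript
// Invented helper illustrating the defensive ordering pattern: all writes
// happen before the buffer is handed off, and the function returns the only
// reference, so no later code can mutate the data by accident.
function makeHandoff(length, fill) {
  const data = new Float32Array(length);
  for (let i = 0; i < data.length; i++) {
    data[i] = fill(i);
  }
  return data; // the caller assigns this and keeps no other reference
}

const handedOff = makeHandoff(4, i => i * 0.25);
// In real code this is where sourceNode.buffer would receive the array;
// nothing after this point touches handedOff again.
console.log(Array.from(handedOff)); // [ 0, 0.25, 0.5, 0.75 ]
```

Note that nothing in this pattern is enforced by the platform - it is a convention the developer must maintain by hand, which is the fragility being described.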

In comparison, if we give developers an API that has some measurable
performance overhead due to copies, they can identify those copies and
reliably mitigate them the same way developers have been mitigating
performance issues for decades: by optimizing. This is something developers
already know how to do. If we give them new variants of the APIs that
*don't* rely on copies and *don't* contain races, then optimizing
becomes as simple as reading the spec and making the appropriate changes.
Developers who are not bottlenecked by that overhead will literally never
have to think about it; users will not file bug reports about a music
player consuming 2MB of additional memory. They *will* file bug reports if
playback glitches.
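
As a sketch of the kind of optimization meant here (the `snapshot` helper and copy counter are invented for illustration, standing in for whatever copy a race-free API performs): a developer who sees the per-trigger copy in a profile can hoist it out of the hot path, paying for one copy instead of one per trigger.

```javascript
// Invented instrumentation: count how many buffer copies are made.
let copies = 0;
function snapshot(src) {
  copies++;
  return src.slice(0); // the memcpy under discussion
}

const source = new Float32Array(44100); // ~1s of mono audio at 44.1 kHz

// Naive pattern: a fresh copy every time a sound is triggered.
for (let i = 0; i < 100; i++) {
  const perTrigger = snapshot(source); // would be assigned to a node
}
const naiveCopies = copies; // 100 copies

// Optimized pattern: copy once up front, reuse the snapshot per trigger.
copies = 0;
const cached = snapshot(source);
for (let i = 0; i < 100; i++) {
  const reused = cached; // no copy on the hot path
}
const optimizedCopies = copies; // 1 copy

console.log(naiveCopies, optimizedCopies); // 100 1
```

This is ordinary, visible, profile-driven optimization - precisely the kind of work developers already know how to do, as opposed to reasoning about invisible thread timing.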

-kg

On Wed, Jul 17, 2013 at 11:41 AM, Chris Rogers <crogers@google.com> wrote:

>
>
>
> On Tue, Jul 16, 2013 at 7:24 PM, Srikumar Karaikudi Subramanian <
> srikumarks@gmail.com> wrote:
>
>> Your hypothetical test case merely demonstrates the difference; my point
>> is that it is silly to optimize for imaginary edge cases at the cost of
>> real-world use cases where developers will get unexpected results due to
>> leaving race conditions in this API. I should also note that it has come up
>> in past discussions that we could always introduce new no-copy APIs that
>> don't contain races, if the cost of memcpy is so severe.
>>
>>
>> It is not inconceivable to make an audio editor which plays an audio file
>> from a specific sample onwards by assigning the buffer to an
>> AudioBufferSourceNode and using start(t,offset,duration) ... possibly
>> followed by effects. Large files (even 5mins?) would be unusable with such
>> an editor if a copy were involved and clients/devs will be forced to do
>> crazy optimizations just to get it to work. Now shift that situation to an
>> iPad with limited memory and it can get worse. DAWs are a use case for the
>> API.
>>
>> With Jer's example code, it would be possible to simulate such a
>> (reasonable) case.
>>
>> What might, I think, be acceptable is a one-time copy provided the copy
>> can be reused without additional cost. As far as I can see, immutable data
>> structures are the best candidates to solve the race conditions.
>>
>> That said, I do find the argument (I think Rogers') that the worst thing
>> that can happen with these race conditions is unexpected audio output, and
>> hence they are not very important, an interesting stand.
>>
>
> You're simplifying my position a bit.  What I'm saying is there are no
> sensible or normal calling patterns where this type of race condition is
> even a possibility.  As the API is designed and has been used for over 2
> years, these calling patterns are not used and so simply are not an issue.
>  We do have substantial developer experience to support this view, and
> these developers come from a wide range of backgrounds and experience
> levels from complete novices playing with audio for the first time, all the
> way to seasoned professional audio developers.
>
> Chris
>
>
>
>>
>> -Kumar
>>
>> On 17 Jul, 2013, at 7:13 AM, "K. Gadd" <kg@luminance.org> wrote:
>>
>> Of course you can claim hypothetical performance benefits from any
>> particular optimization, my point is that in this case we're considering
>> whether or not to leave *race conditions* in a new Web API because we think
>> it might make it faster. We *think* it *might*. Making that sort of
>> sacrifice in favor of 'performance' without doing any reproducible,
>> remotely scientific testing to see whether it's actually faster, let alone
>> fast enough to justify the consequences, seems rash to me.
>>
>> It should be quite easy to test the performance benefits of the racy
>> version of the API, as based on my understanding the Firefox implementation
>> currently makes copies. You need only run your test cases in Firefox with
>> SPS and see how much time is spent making calls to memcpy to get a rough
>> picture of the actual overhead. And once you know that, you can look at how
>> your test cases actually perform and see if the cost of that memcpy makes
>> it impossible to ship an implementation that makes those copies.
>>
>> I am literally unable to imagine a use case where the cost of the copies
>> would add up to the point where it would remotely be considered a
>> bottleneck. It is the case that the copies probably have to be synchronous,
>> so I could see this hurting the ability to trigger tons and tons of sounds
>> in a single 'frame' from JS, or set tons and tons of curves, etc. But
>> still, memcpy isn't that slow, especially for small numbers of bytes.
>>
>> Your hypothetical test case merely demonstrates the difference; my point
>> is that it is silly to optimize for imaginary edge cases at the cost of
>> real-world use cases where developers will get unexpected results due to
>> leaving race conditions in this API. I should also note that it has come up
>> in past discussions that we could always introduce new no-copy APIs that
>> don't contain races, if the cost of memcpy is so severe.
>>
>>
>> On Tue, Jul 16, 2013 at 6:27 PM, Jer Noble <jer.noble@apple.com> wrote:
>>
>>>
>>> On Jul 16, 2013, at 1:18 PM, K. Gadd <kg@luminance.org> wrote:
>>>
>>> This claim has been made dozens of times now on the list and I've seen
>>> multiple requests for even a single test case that demonstrates the
>>> performance impact. Is there one? I haven't seen one, nor a comment to the
>>> effect that one exists, or an explanation of why there isn't one.
>>>
>>>
>>> Isn't this self-evident?  Any solution which involves additional memcpy
>>> calls during the normal use of the API will have an inherent and known
>>> performance cost at the point of the memcpy.  Additionally, there is the
>>> ongoing performance cost of having duplicate, in-memory copies of audio
>>> data, as well as the additional GC cost of those extra copies.
>>>
>>> That said, it would be very easy to demonstrate: in the hypothetical
>>> test case, create a new ArrayBuffer from source data before passing it into
>>> the API.  I.e.,
>>>
>>> sourceNode.buffer = buffer
>>>
>>>
>>> becomes:
>>>
>>> sourceNode.buffer = buffer.slice(0)
>>>
>>>
>>> -Jer
>>>
>>
>>
>>
>
Received on Thursday, 18 July 2013 12:01:15 UTC
