Re: Proposal for fixing race conditions from Srikumar Karaikudi Subramanian on 2013-07-18 (public-audio@w3.org from July to September 2013)

From: Srikumar Karaikudi Subramanian <srikumarks@gmail.com>
Date: Thu, 18 Jul 2013 16:53:43 +0530
To: Chris Rogers <crogers@google.com>
Cc: "K. Gadd" <kg@luminance.org>, Jer Noble <jer.noble@apple.com>, Olivier Thereaux <Olivier.Thereaux@bbc.co.uk>, WG <public-audio@w3.org>
Message-Id: <C5615330-D2AE-4F3E-8A06-287291C21AE2@gmail.com>
> You're simplifying my position a bit.  

Perhaps ...

> What I'm saying is there are no sensible or normal calling patterns where this type of race condition is even a possibility.  As the API is designed and has been used for over 2 years, these calling patterns are not used and so simply are not an issue.  We do have substantial developer experience to support this view, and these developers come from a wide range of backgrounds and experience levels from complete novices playing with audio for the first time, all the way to seasoned professional audio developers.

.. but the ignorability of nonsensical and abnormal calling patterns depends on the impact that any race condition possibility in the API can have. If, for example, they can be used to compromise the security of the browser sandbox, say, because these buffers are hooking into high-priority scheduling at the OS level, then wouldn't it be worth making even compatibility-breaking changes or taking some constant malloc+memcpy hits to solve that? So it does seem to me that the key premise behind this argument is that such calling patterns cannot cause harm beyond garbled audio output.
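
To put a rough number on the malloc+memcpy hit, here is a back-of-the-envelope sketch in JavaScript. The 5-minute length comes from the audio editor example quoted further down; the 44.1 kHz sample rate, stereo layout, 32-bit float samples, and the helper names `bufferBytes` and `timeCopy` are my assumptions for illustration, not anything the API specifies.

```javascript
// Size in bytes of an uncompressed float32 audio buffer.
function bufferBytes(seconds, sampleRate, channels) {
  return seconds * sampleRate * channels * Float32Array.BYTES_PER_ELEMENT;
}

// Wall-clock milliseconds for one full copy of `sampleCount` float32 samples.
// Date.now() is coarse, so treat the result as a rough indication only.
function timeCopy(sampleCount) {
  const src = new Float32Array(sampleCount);
  const t0 = Date.now();
  const copy = src.slice(0); // allocates a new buffer and copies into it
  const elapsedMs = Date.now() - t0;
  return { elapsedMs, copiedSamples: copy.length };
}

// 5 minutes, 44.1 kHz, stereo (assumed figures): 105840000 bytes, about 101 MiB.
const fiveMinuteBytes = bufferBytes(5 * 60, 44100, 2);

// One second of stereo audio; the timing depends entirely on the machine.
const oneSecond = timeCopy(44100 * 2);
```

The size is fixed by the arithmetic; the copy time is not, which is why it would need to be measured on real devices rather than argued about.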

The benign-ness of the current race conditions seems (to me) likely, since the lengths of ArrayBuffers cannot be changed. If their lengths can be changed, the current API might permit buffer overrun exploits (still only guessing here). Even if they can't be changed, there could exist small windows of vulnerability in an implementation where the length of an array is kept around while an updated pointer to the array data is taken. 
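
The benign case can be sketched without any audio API at all. In the snippet below, `makeConsumer` is a hypothetical stand-in for whatever reads the samples. In a real implementation that reader would run on another thread; here everything is single-threaded, so this only shows the aliasing that makes a race possible, not an actual race.

```javascript
// Hypothetical consumer: keeps whatever array it is handed and reads from it later.
function makeConsumer(samples) {
  return {
    length: samples.length,
    read: function (i) { return samples[i]; }
  };
}

const source = new Float32Array([0.1, 0.2, 0.3, 0.4]);

const shared = makeConsumer(source);            // aliases the caller's memory
const snapshot = makeConsumer(source.slice(0)); // owns a private copy

// The caller keeps writing after the hand-off.
source[1] = 9.0;

const sharedValue = shared.read(1);     // sees the new value
const snapshotValue = snapshot.read(1); // still sees the original sample

// A typed array's length is fixed at creation, so the write above cannot
// change how many samples either consumer believes it has.
const lengthsUnchanged = shared.length === 4 && snapshot.length === 4;
```

The shared consumer observes the later write (garbled audio at worst), while the copy does not; neither can be made to read past the end, which is the property the argument above leans on.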

So, we ought to first take race conditions seriously and at least prove that they cannot cause serious harm before deciding they are not a problem in a design.

Another point - irrespective of experience, devs do copy/paste/borrow already available code patterns especially when they're experimenting with something new and want to understand stuff (myself included). So I'd venture a guess that the scale of the evidence is smaller than it appears.

Best,
-Kumar

On 18 Jul, 2013, at 12:11 AM, Chris Rogers <crogers@google.com> wrote:

> 
> 
> 
> On Tue, Jul 16, 2013 at 7:24 PM, Srikumar Karaikudi Subramanian <srikumarks@gmail.com> wrote:
>> Your hypothetical test case merely demonstrates the difference; my point is that it is silly to optimize for imaginary edge cases at the cost of real-world use cases where developers will get unexpected results due to leaving race conditions in this API. I should also note that it has come up in past discussions that we could always introduce new no-copy APIs that don't contain races, if the cost of memcpy is so severe.
> 
> It is not inconceivable to make an audio editor which plays an audio file from a specific sample onwards by assigning the buffer to an AudioBufferSourceNode and using start(t,offset,duration) ... possibly followed by effects. Large files (even 5 mins?) would be unusable with such an editor if a copy were involved, and clients/devs would be forced to do crazy optimizations just to get it to work. Now shift that situation to an iPad with limited memory and it can get worse. DAWs are a use case for the API.
> 
> With Jer's example code, it would be possible to simulate such a (reasonable) case.
> 
> What might, I think, be acceptable is a one-time copy provided the copy can be reused without additional cost. As far as I can see, immutable data structures are the best candidates to solve the race conditions.
> 
> That said, I do find the argument (Rogers', I think) that the worst thing that can happen with these race conditions is unexpected audio output, and hence that they are not very important, an interesting stand.
> 
> You're simplifying my position a bit.  What I'm saying is there are no sensible or normal calling patterns where this type of race condition is even a possibility.  As the API is designed and has been used for over 2 years, these calling patterns are not used and so simply are not an issue.  We do have substantial developer experience to support this view, and these developers come from a wide range of backgrounds and experience levels from complete novices playing with audio for the first time, all the way to seasoned professional audio developers.
> 
> Chris
> 
>  
> 
> -Kumar
> 
> On 17 Jul, 2013, at 7:13 AM, "K. Gadd" <kg@luminance.org> wrote:
> 
>> Of course you can claim hypothetical performance benefits from any particular optimization, my point is that in this case we're considering whether or not to leave *race conditions* in a new Web API because we think it might make it faster. We *think* it *might*. Making that sort of sacrifice in favor of 'performance' without doing any reproducible, remotely scientific testing to see whether it's actually faster, let alone fast enough to justify the consequences, seems rash to me.
>> 
>> It should be quite easy to test the performance benefits of the racy version of the API, as based on my understanding the Firefox implementation currently makes copies. You need only run your test cases in Firefox with SPS and see how much time is spent making calls to memcpy to get a rough picture of the actual overhead. And once you know that, you can look at how your test cases actually perform and see if the cost of that memcpy makes it impossible to ship an implementation that makes those copies.
>> 
>> I am literally unable to imagine a use case where the cost of the copies would add up to the point where it would remotely be considered a bottleneck. It is the case that the copies probably have to be synchronous, so I could see this hurting the ability to trigger tons and tons of sounds in a single 'frame' from JS, or set tons and tons of curves, etc. But still, memcpy isn't that slow, especially for small numbers of bytes.
>> 
>> Your hypothetical test case merely demonstrates the difference; my point is that it is silly to optimize for imaginary edge cases at the cost of real-world use cases where developers will get unexpected results due to leaving race conditions in this API. I should also note that it has come up in past discussions that we could always introduce new no-copy APIs that don't contain races, if the cost of memcpy is so severe.
>> 
>> 
>> On Tue, Jul 16, 2013 at 6:27 PM, Jer Noble <jer.noble@apple.com> wrote:
>> 
>> On Jul 16, 2013, at 1:18 PM, K. Gadd <kg@luminance.org> wrote:
>> 
>>> This claim has been made dozens of times now on the list and I've seen multiple requests for even a single test case that demonstrates the performance impact. Is there one? I haven't seen one, nor a comment to the effect that one exists, or an explanation of why there isn't one.
>> 
>> Isn't this self-evident?  Any solution which involves additional memcpy calls during the normal use of the API will have an inherent and known performance cost at the point of the memcpy.  Additionally, there is the ongoing performance cost of having duplicate, in-memory copies of audio data, as well as the additional GC cost of those extra copies.
>> 
>> That said, it would be very easy to demonstrate: in the hypothetical test case, create a new ArrayBuffer from source data before passing it into the API.  I.e.,
>> 
>>> sourceNode.buffer = buffer
>> 
>> becomes:
>> 
>>> sourceNode.buffer = buffer.slice(0)
>> 
>> -Jer
>> 
> 
>
Received on Thursday, 18 July 2013 11:24:35 UTC