Re: New proposal for fixing race conditions from Ehsan Akhgari on 2013-07-23 (public-audio@w3.org from July to September 2013)

From: Ehsan Akhgari <ehsan.akhgari@gmail.com>
Date: Tue, 23 Jul 2013 18:24:21 -0400
To: Chris Wilson <cwilso@google.com>
Cc: "Robert O'Callahan" <robert@ocallahan.org>, Marcus Geelnard <mage@opera.com>, Jer Noble <jer.noble@apple.com>, Russell McClellan <russell@motu.com>, WG <public-audio@w3.org>
Message-ID: <CANTur_6hn9sur7DkPK3gKQXqwO=RexyLJKBG5AD0TS3CpivcTg@mail.gmail.com>
On Tue, Jul 23, 2013 at 12:11 PM, Chris Wilson <cwilso@google.com> wrote:

> I've been hanging back from this discussion a bit, but I feel the need to
> express my own take (since I come at the API from a very different
> perspective than Chris).
>
> I understand (and support) Robert's initial introduction of this issue
> (first line of
> http://lists.w3.org/Archives/Public/public-audio/2013AprJun/0644.html) -
> we should avoid having internal implementation details affect observable
> output.  However, that's not the same thing as "we must prevent any
> possible race conditions" - in this case, the race condition is between the
> Web Audio "thread" and the main execution thread.  This is not so much
> about internal implementation details as it is about the fact that Web
> Audio developers need to have their expectations set around interactions
> with the audio "thread".
>

Let's first agree on the terminology we're using.  I'll define the
following terms, just so that everybody is on the same page in this
discussion:

* Data race issue: the problem where one thread is writing to the same
memory buffer while another thread is reading from it.  Whether the reader
thread sees the old or new data depends on the timing of the two threads
being scheduled on the processor(s).
* Non-deterministic order of execution: If you have multiple asynchronous
tasks in progress, the order in which those operations occur is
non-deterministic.

I'm only focused on the first problem here, not at all on the second one.
If you meant data race issues in "we must prevent any possible race
conditions", I entirely agree.  If you meant the problems caused by
non-deterministic order of execution, I don't think we should do that.
That is an inherent property of the Web platform already.
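
To make the distinction concrete, here is a minimal sketch in Python (not
Web Audio code) that enumerates what a reader could observe under each
model.  The names OLD, NEW, reads_with_shared_buffer and reads_with_snapshot
are made up for illustration, and the reader is simplified to observe the
whole buffer at one instant between two of the writer's steps.

```python
OLD = [0, 0, 0, 0]
NEW = [1, 1, 1, 1]


def reads_with_shared_buffer():
    """Writer stores into the shared buffer in place, one sample at a time.

    The reader may run after any number of those stores, so it can observe
    a buffer that is part old and part new (the data race issue).
    """
    observed = set()
    for stores_done in range(len(NEW) + 1):
        buf = list(OLD)
        for i in range(stores_done):
            buf[i] = NEW[i]           # writer's in-place stores so far
        observed.add(tuple(buf))      # reader runs now and sees this state
    return observed


def reads_with_snapshot():
    """Writer fills a private copy and publishes it in a single step.

    The reader still cannot predict whether it gets OLD or NEW
    (non-deterministic order), but it never sees a mixture of the two.
    """
    observed = set()
    for stores_done in range(len(NEW) + 1):
        published = tuple(OLD)
        private = list(OLD)
        for i in range(stores_done):
            private[i] = NEW[i]       # invisible to the reader
        if stores_done == len(NEW):
            published = tuple(private)  # the single publication step
        observed.add(published)
    return observed
```

In this model the shared-buffer reader can see five different states, three
of them mixtures of old and new samples, while the snapshot reader can only
ever see the complete old buffer or the complete new one.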


> AFAICT, all the proposals made so far - Jer's included - put quite a heavy
> weight on interacting with audio buffer data. For the purposes of
> synthesizing my own audio data, this will require a memcpy.  In mobile
> scenarios, and desktop scenarios with large buffers of data (i.e. a DAW),
> this will put a significantly destructive additional burden on the
> environment required to play (and likely record/process) audio.  This seems
> like an awfully big deal to me, so I have to question - what's the benefit?
>  It is not to my knowledge required to avoid crashes or other potential
> security issues; the only downside is if an author modifies a playing audio
> buffer, they could get differing playback results depending on precise
> timing.
>

Yes, and note that this is an instance of a "data race issue", not a result
of non-deterministic execution order.  With this problem present, a
developer may unintentionally write code which works fine even in multiple
browsers on the device he's testing on, but behaves very differently on
other devices/browsers.  A very good example of what can make a huge
difference here is whether the machine has a single or multiple execution
cores.  Back when multi-core machines were still a niche, this used to be
a classic case of bugs in multi-threaded applications that only showed up
on multi-core machines.  I would like to argue that we
absolutely do not want to expose this kind of problem to the Web platform,
and other parts of the platform have been carefully designed to work around
this.


> That doesn't seem any different, to me, than what happens with small
> timing differences in event delivery today, or setting audio times that are
> too close to "now"
>

Now you're talking about the second class of problems.  These two problems
are very different in nature.  I refer you to an analogy that Robert (I
think?) provided earlier in this thread: today it is impossible to read
partly updated data back from a video element onto a canvas.  While you
may not be sure which frame of the video you're getting
(non-deterministic execution order makes that impossible to predict), you
always know that you're only ever going to get a full frame out of the
video element, not part of the current frame and part of the next frame
(since we are providing guarantees against the data race issues.)  I hope
this analogy can help make the difference in the two problems evident.
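
The video analogy can be sketched with real threads.  This is only an
illustration in Python, not how any browser implements video: FrameSource,
publish, grab and run are hypothetical names, and each "frame" is a tuple
filled with its own frame number so that a torn frame would be easy to spot.

```python
import threading


class FrameSource:
    """Hypothetical stand-in for a video element holding the latest frame."""

    def __init__(self, width):
        self._lock = threading.Lock()
        self._frame = (0,) * width    # an immutable tuple is one whole frame

    def publish(self, frame):
        # Build the complete frame first, then swap it in under the lock.
        complete = tuple(frame)
        with self._lock:
            self._frame = complete

    def grab(self):
        with self._lock:
            return self._frame


def run(frames=200, width=64):
    source = FrameSource(width)
    seen = []

    def producer():
        for n in range(1, frames + 1):
            source.publish([n] * width)

    def consumer():
        for _ in range(frames):
            seen.append(source.grab())

    threads = [threading.Thread(target=producer),
               threading.Thread(target=consumer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

Which frame numbers end up in the returned list varies from run to run
(non-deterministic execution order), but every grabbed frame contains a
single frame number throughout, never part of one frame and part of the
next.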


> - if you want the full power of the audio system, you have to learn how to
> work closely with the system and adapt to environments.
>

Yes, and with an API which does memcpy's _some of the time_, it's crucial
to give web developers a set of best practices that they can use.  Note
that none of the current proposals for AudioBuffer are proposing an API
which requires a copy all the time.  The situation with AudioParam and
WaveShaper node curves is not clear yet.
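
As a rough illustration of how an API can copy only some of the time, here
is a copy-on-write sketch in Python.  It is not any of the AudioBuffer
proposals from this thread; SampleBuffer, snapshot_for_playback and write
are hypothetical names, and both methods are assumed to be called from the
same (main) thread.

```python
class SampleBuffer:
    """Hypothetical buffer that copies only after it was handed to playback."""

    def __init__(self, samples):
        self._samples = list(samples)
        self._shared = False   # True once playback holds the current list
        self.copies = 0        # counts how many copies were actually made

    def snapshot_for_playback(self):
        # Hand the current storage to the player without copying.
        self._shared = True
        return self._samples

    def write(self, index, value):
        if self._shared:
            # The player still holds the old list: write into a fresh copy
            # so that what is being played never changes underneath it.
            self._samples = list(self._samples)
            self._shared = False
            self.copies += 1
        self._samples[index] = value
```

In this sketch, writes before the first snapshot_for_playback() call cost
no copy at all, and a run of writes after a snapshot costs one copy in
total rather than one per write.  Pure synthesis into a buffer that is not
yet playing therefore pays nothing extra.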


> As Chris pointed out, there is some experience working with the API as it
> is today, and I haven't heard of (or personally experienced) any problems
> traced to this issue.
>

Yes, point taken.  But please note that the software industry has no
shortage of evidence about what types of problems the data race issue can
introduce.  There's quite a bit of literature about it, and in all
multi-threading libraries there are synchronization primitives and best
practices on how to avoid this problem.


> Also, correct me if I'm mistaken, but I don't believe this is equal to
> "browser x will operate differently than browser y" - timing is everything
> in this scenario anyway, and actually even Jer's proposal could enable
> different behavior across browsers/environments, it would just be replacing
> the entire buffer instead of a portion.
>

Yes, this difference in behavior is caused by the non-deterministic
execution order.  But replacing a portion of an AudioBuffer is another
problem of its own (a data race issue.)  The Web platform doesn't guarantee
execution order already, but does guarantee the absence of data races.


> I feel designing the API around preventing race conditions everywhere is 1)
> ultimately not going to be successful anyway, and 2) is like wrapping
> everything with bubble wrap.  It will prevent some minor bruises, but it
> will also make it quite a bit more costly (in memory and time) to get the
> tasks needed done.
>

I think here you're talking about designing an API to prevent both kinds of
problems.  I agree with your concerns in that case, but that is not what
we're currently debating.  If we're only talking about the data race issue,
we know that 1) it is going to be successful, as demonstrated by other Web
APIs, and 2) it does protect against serious problems arising from the
vastly different memory models and speeds of the devices that people use to
access the Web today and in the future.


> Olivier, to answer your question, I believe this would currently be an
> Objection.
>

I hope that the above helps convince you to reconsider this
Objection.  Please let me know if you believe that I have missed part of
your argument, or if you disagree with what I have written above.

Thanks!
--
Ehsan
<http://ehsanakhgari.org/>



> -Chris
>
>
> On Tue, Jul 23, 2013 at 8:07 AM, Ehsan Akhgari <ehsan.akhgari@gmail.com> wrote:
>
>> On Mon, Jul 22, 2013 at 6:44 PM, Robert O'Callahan <robert@ocallahan.org> wrote:
>>
>>> On Tue, Jul 23, 2013 at 5:44 AM, Marcus Geelnard <mage@opera.com> wrote:
>>>
>>>> My guess is that this is very similar to the current solution in gecko
>>>> (Ehsan?).
>>>>
>>>
>>> It's close to what we do. We neuter and recycle the output buffers, but
>>> currently we don't neuter and recycle the input buffers. I think that's a
>>> good idea though.
>>>
>>
>> We also lazily create the input buffers when JS first accesses the
>> inputBuffer property, to optimize the case where the ScriptProcessorNode is
>> only used for synthesis, not as a filter.
>>
>> Cheers,
>> Ehsan
>>
>
>
Received on Tuesday, 23 July 2013 22:25:29 UTC