
Re: Resolution to republish MSP as a note

From: Chris Rogers <crogers@google.com>
Date: Thu, 9 Aug 2012 11:50:37 -0700
Message-ID: <CA+EzO0mgAG73ZPu68Vr5DA05kJFF59C5768oP_Y-00P+dfwQhA@mail.gmail.com>
To: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Cc: public-audio@w3.org
On Thu, Aug 9, 2012 at 12:06 AM, Jussi Kalliokoski <
jussi.kalliokoski@gmail.com> wrote:

> Oops, sorry I lost the list on the way.
> On Thu, Aug 9, 2012 at 12:38 AM, Chris Rogers <crogers@google.com> wrote:
>> On Wed, Aug 8, 2012 at 1:52 PM, Jussi Kalliokoski <
>> jussi.kalliokoski@gmail.com> wrote:
>>> On Wed, Aug 8, 2012 at 10:00 PM, Chris Rogers <crogers@google.com> wrote:
>>>> On Wed, Aug 8, 2012 at 11:21 AM, Jussi Kalliokoski <
>>>> jussi.kalliokoski@gmail.com> wrote:
>>>>> On Wed, Aug 8, 2012 at 8:21 PM, Chris Rogers <crogers@google.com> wrote:
>>>>>> On Wed, Aug 8, 2012 at 7:35 AM, Jussi Kalliokoski <
>>>>>> jussi.kalliokoski@gmail.com> wrote:
>>>>>>> On Wed, Aug 8, 2012 at 4:25 PM, Stéphane Letz <letz@grame.fr> wrote:
>>>>>>>> > I'm probably badly misinformed, but the value of high priority
>>>>>>>> threads seems a bit vague to me, since I'm not sure about what's the OS
>>>>>>>> support level for high-priority threads, I think for example in Linux you
>>>>>>>> still have to compile your own kernel to get real high priority thread
>>>>>>>> support.
>>>>>>>> No. You would possibly need a special kernel for very *low
>>>>>>>> latency* thread scheduling, but not for RT scheduling and thread priority
>>>>>>>> management. A regular Linux kernel is now quite usable, assuming the audio
>>>>>>>> thread can take RT scheduling capability, which is given using Realtime Kit
>>>>>>>> in PulseAudio AFAICS or correctly setting a special "realtime" group with
>>>>>>>> appropriate values (see here for JACK:
>>>>>>>> http://jackaudio.org/realtime_vs_realtime_kernel and
>>>>>>>> http://jackaudio.org/linux_rt_config)
>>>>>>> I thought I'd be misinformed! Thanks for the clarification, and sorry
>>>>>>> for the mixup.
>>>>>>>> On OSX real-time threads are actually "time constraints" threads,
>>>>>>>> that are going to preempt any other non RT thread and are "interleaved"
>>>>>>>> with other RT threads. The CoreAudio callback will run in a
>>>>>>>> real-time constraint thread started and configured by the CoreAudio
>>>>>>>> framework for the audio application.
>>>>>>>> > And using high-priority threads might not always even be
>>>>>>>> desirable; for example, on low-end devices it'd be horrible if the UI
>>>>>>>> became completely unusable because an audio thread was occupying the
>>>>>>>> whole CPU.
>>>>>>>> But if not RT, then the audio will "glitch"... Do we want reliable
>>>>>>>> audio? or not?
>>>>>>> I think you mean to ask "do we want audio in RT threads", because
>>>>>>> even having that doesn't always guarantee reliable audio, nor does
>>>>>>> not having it exclude reliable audio. The answer to that question
>>>>>>> would be sometimes yes, sometimes no. Glitchless audio isn't worth
>>>>>>> much if the application becomes otherwise completely unusable. Are
>>>>>>> high-priority audio threads a feature that warrants the complexity
>>>>>>> that comes with the native nodes? Especially given that we still
>>>>>>> have the possibility of RT thread workers open.
>>>>>>> I'm pretty sure that for example my Android phone doesn't run its
>>>>>>> audio in a real-time thread, even networking connections can sometimes
>>>>>>> glitch the audio. But it's never bothered me, I'd actually rather have the
>>>>>>> UI in an RT thread like iOS does and have that always go before the audio
>>>>>>> and anything else for that matter. I'm pretty sure I'm not the only one.
>>>>>> But many people have asked for improvements to the Android audio
>>>>>> performance and do not appreciate high-latency and glitches.  I know that
>>>>>> iOS *does* use high-priority threads and it works great for them, so your
>>>>>> argument seems to be rather weak.  Believe it or not, I think there will
>>>>>> actually be many people who are interested to process live audio in
>>>>>> real-time in web applications, or to play synthesizers using the MIDI API.
>>>>>>  Just because we've had terrible performance on the web with Flash, etc.
>>>>>> doesn't mean we have to stay in the stone age, lagging so far behind the
>>>>>> abilities of desktop audio applications.
>>>>> It wasn't really an argument, it was just my personal opinion. And I'm
>>>>> not suggesting we have bad performance, I'm suggesting a different approach
>>>>> at tackling performance issues. I agree that RT threads offer benefits in
>>>>> some cases, but in some cases they don't, and it should be up to the developer
>>>>> to decide what takes priority in his/her application. Hence I'd rather we
>>>>> try to get RT thread support for workers so that one can just decide
>>>>> whether to use a real-time thread or not by choosing the type of worker to
>>>>> use. If we had that, what on earth would be lagging behind desktop audio
>>>>> applications' abilities?
>>>> But Jussi, I'm approaching the problem from the perspective of what is
>>>> possible to do today using well-known techniques, and not wishful thinking
>>>> of something which might be possible five years from now.
>>> Wishful thinking, five years from now? That's a bit belittling, don't you
>>> think? As Srikumar already pointed out, a lot of the meaningful performance
>>> things are happening already. And yes, getting this out there is one of my
>>> greatest concerns. Five years is likely to be closer to the time it would
>>> take to have all the bugs filed against the current spec resolved and,
>>> if we're lucky, even interoperable implementations. Getting RT thread
>>> support for workers is a significantly smaller undertaking and likely to
>>> extend its usefulness beyond audio, like the DSP API will.
>> I think that getting RT thread support for workers is a very hard nut to
>> crack.  And this will take even longer if we have to wait for all
>> javascript runtimes to have these super-stringent capabilities.  Kumar
>> seems to share this view:
>>  "The web architecture may not permit the use of JS code in high
>> priority system audio callbacks for some time. That means the latency we
>> can get from native nodes is going to be better than JS for some time to
>> come."
> Like Kumar seems to indicate, the biggest blocker is probably that letting
> a website block a high-priority thread might be problematic. But what's the
> difference when you can do the same using the native nodes anyway? If it's
> going to be a blocker for workers, it will be a blocker for web-audio as a
> whole. Regardless, the discussion has been opened now:
> http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2012-August/036850.html
>>>> We simply don't have the level of technology in our JavaScript
>>>> runtimes (garbage collection, blocking calls, the taking of locks,
>>>> threading issues, etc.) to deliver the kind of performance which people
>>>> expect and will compare to desktop/native applications.  In the meantime,
>>>> people are asking for advanced audio features now.
>>>> Because audio is deadline-driven, you always need to be concerned with
>>>> worst-case performance and not average case performance (for gc etc.)
>>>> Here's an interesting link which explores some of these issues:
>>>> http://www.rossbencina.com/code/real-time-audio-programming-101-time-waits-for-nothing
>>>> Chris
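Chris's point above, that deadline-driven audio cares about worst-case rather than average-case time, can be illustrated with a rough timing sketch in plain JavaScript. The synthetic workload, iteration counts, and deadline figure below are illustrative only, not from the thread:

```javascript
// Rough sketch: a callback that is cheap on average can still blow its
// deadline on the rare iteration where something (a GC pause, a lock,
// a page fault) stalls it. The workload below is synthetic.
function fakeCallback(i) {
  let acc = 0;
  // Every 100th call does 100x the work, simulating an occasional stall.
  const iterations = (i % 100 === 99) ? 1e6 : 1e4;
  for (let j = 0; j < iterations; j++) acc += Math.sin(j);
  return acc;
}

let worst = 0;
let total = 0;
const runs = 300;
for (let i = 0; i < runs; i++) {
  const t0 = Date.now();
  fakeCallback(i);
  const dt = Date.now() - t0;
  total += dt;
  if (dt > worst) worst = dt;
}

// With a 128-frame block at 44.1 kHz the deadline is roughly 2.9 ms:
// the average can sit comfortably under it while the worst case misses.
console.log("average ms:", total / runs, "worst ms:", worst);
```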
>>> Yes, I've read that article, it's a good one. Funny you should reference
>>> it though, as you mention "the taking of locks, threading issues". The
>>> article explicitly advises against taking locks, mutexes, etc. in the audio
>>> thread, not to mention running multiple threads of audio, so why would you
>>> need those for high-performance audio?
>> I'm not sure where you got the impression that I was advocating doing all
>> the "bad things" this article advises against.  Avoiding the "bad things"
>> is what we pay attention to in our day-to-day engineering work.  My point
>> is that it's a lot easier to work under these conditions when JavaScript is
>> not working inside these threads.
> I got the impression from you using the fact that we don't have the
> mentioned technologies in our JS runtimes as an argument.
>>> Garbage collection. You keep bringing this up, so I might as well
>>> address it. If you had a dedicated audio worker, that would be completely
>>> isolated from the garbage collection issues in the main thread, so it's
>>> hard to argue that it's not a controllable issue. To
>>> avoid garbage collection, you need to just not produce any garbage. There
>>> are multiple approaches to this in JS, one is for example the one taken by
>>> emscripten, where they keep a single buffer from which everything is
>>> allocated, sort of a virtual memory. With this approach you'll produce
>>> virtually no garbage. The issue has already been addressed by others as
>>> well in the discussion about worker nodes.
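The "produce no garbage" approach described above can be sketched in a few lines of plain JavaScript: allocate every buffer once, up front, and have the callback touch only those. The block size, gain value, and processBlock name are illustrative, not from any spec:

```javascript
// One-time allocations; nothing inside the callback allocates, so the
// GC has nothing to collect between callbacks.
const BLOCK_SIZE = 128;
const input = new Float32Array(BLOCK_SIZE);
const output = new Float32Array(BLOCK_SIZE);
let gain = 0.5;

function processBlock(inBuf, outBuf) {
  // Plain index loop over preallocated arrays: no closures, no new
  // objects, no garbage.
  for (let i = 0; i < BLOCK_SIZE; i++) {
    outBuf[i] = inBuf[i] * gain;
  }
}

// Simulated callback invocation, reusing the same buffers every time.
input.fill(1.0);
processBlock(input, output); // output now holds the input scaled by 0.5
```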
>> I'm aware of some of the painful extremes that people have gone to in
>> order to avoid garbage collection, including emscripten.  I find it hard to
>> believe that most people would want to involve themselves in this kind of
>> programming which is very difficult to integrate and almost impossible to
>> debug.  It also doesn't really solve the main-thread-to-worker-thread
>> latency problem, or other performance problems such as the lack of
>> multi-threading for real-time convolution.
> emscripten was a bit of an extreme example, I admit. But it isn't much work
> to keep GC at a reasonable level by reusing allocated data etc., things you
> would do to optimize in static languages too anyway. And I should probably
> start calling myself the tautologist or something soon, but that's what
> libraries are made for. The end-developer, of whom we portray a sad picture
> (he's lazy, will do audio processing in the main-thread, etc.) hopefully at
> least knows how to include a JavaScript library. If not, (s)he probably
> shouldn't start with audio programming.
> The main-thread-to-worker-thread latency is mainly a problem with big
> data, and even then the new postMessage() lets you transfer ownership,
> which makes the latency much smaller.
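The ownership transfer mentioned above, `worker.postMessage(buf, [buf])`, can be sketched without spinning up a worker by using structuredClone's transfer option, a modern stand-in that goes through the same structured-serialization machinery. The buffer name and size are illustrative:

```javascript
// Sketch of zero-copy transfer: instead of cloning the bytes, ownership
// of the underlying memory moves to the receiver and the sender's view
// is detached. worker.postMessage(pcm.buffer, [pcm.buffer]) behaves the
// same way; structuredClone is used here only so the effect is visible
// in a single thread.
const pcm = new Float32Array(1024);
pcm.fill(0.25);

const moved = structuredClone(pcm.buffer, { transfer: [pcm.buffer] });

console.log(moved.byteLength);      // 4096: the data moved, not copied
console.log(pcm.buffer.byteLength); // 0: the source buffer is detached
```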
> As for the multi-threading convolution, I believe this is the issue the
> DSP API is trying to tackle:
> http://people.opera.com/mage/dspapi/#filter-interface . I assume the idea
> is that the API implementation would be optimized to the bone, and JS need
> not really even care what's happening under the hood, whether that's
> just SIMD or multiple threads as well.
>>> As for worst case performance, if we dump the native nodes then browser
>>> vendors will have at least a lot fewer things to keep optimizing for the
>>> worst case. ;)
>> We already have good, well-optimized, and open-source code in WebKit as
>> an example.  Even if browsers would choose to not directly use the code,
>> all of the information is available in plain sight for how to write the
>> code.  Certainly a lot easier than having to write a JavaScript library
>> with the same functionality from scratch using emscripten (assuming that
>> would even be possible).
> That's reverse-engineering, and not really a good way to make standards,
> don't you think? It's just not how it works, there's a reason why W3C
> requires multiple independent and interoperable implementations, and that
> reason is to test that the spec can actually be implemented interoperably
> even if there is no reference implementation around.

I'm not saying it should be a substitute for improving and refining the
specification.  Of course we need to continue to work on that, and I'm
pleased to see the constructive feedback from this group so far.  We've
already incorporated some of those improvements recently and have just
published the 3rd public working draft.

What I meant was that the source-code in WebKit is strong proof that the
specification actually works in real implementations, addressing the use
cases for real-world audio applications.  Many developers have used the API
and have been able to create a wide range of games and applications
already.  Thus having the source-code available is a very good thing,
because it provides us with the confidence that we know how to create the
system that we're designing in the specification, and provides guidance for
improving the specification when put into the hands of real-world
developers.  No alternative API or approach so far has come even
close to going through this critical process.

> Using emscripten to port it would probably be a bad idea, as for now
> it doesn't take advantage of the DSP API. Using the DSP API to write an
> audio framework would mainly be just writing an abstraction wrapper, the
> important functionality is already there, the library can just provide a
> meaningful way to use it.
>> I'm just trying to provide a simple-to-use, high-level audio API where
>> audio developers don't have to jump through hoops and where the JS calls
>> can be combined with other common JavaScript APIs that are available in the
>> main thread.  Is that such a bad thing?
> No, it's not such a bad thing, absolutely not. I just think that it isn't
> in its right place as a web standard proposal, because it doesn't fit the
> big picture of the web as a whole: it's not built on existing features,
> nor does it really give any room for future web standards to build on it;
> it's just a separate entity that provides a few joins to communicate with
> the platform. That would be OK were we designing a user library; indeed
> I'd much rather see it as a library/framework built on a more reusable
> and modular lower-level API.

Saying something like "doesn't fit the big picture of the web as a whole"
is such a matter of opinion.  I would have to disagree and say that people
are using the Web Audio API today along with: HTMLMediaElement,
MediaStream, canvas2D, WebGL, WebSockets, CSS animations, XHR2, File API,
the DOM, Gamepad API, etc. etc.

> Cheers,
> Jussi
Received on Thursday, 9 August 2012 18:51:06 UTC
