Re: Issues with ROC's proposal from Marcus Geelnard on 2013-09-04 (public-audio@w3.org from July to September 2013)

From: Marcus Geelnard <mage@opera.com>
Date: Wed, 04 Sep 2013 10:11:32 +0200
To: robert@ocallahan.org
CC: WebAudio <public-audio@w3.org>
Message-ID: <5226EB34.6080505@opera.com>
Thanks for your response (Chris too :) )!

2013-09-03 23:54, Robert O'Callahan skrev:
> On Mon, Sep 2, 2013 at 9:56 PM, Marcus Geelnard <mage@opera.com 
> <mailto:mage@opera.com>> wrote:
>
>     _The main things that bother me about the proposal are_:
>
>
>     1) The naming/functionality of getChannelData().
>
>     If we disregard the case of the AudioProcessingEvent (see below),
>     the main purpose (at least a very important function) of
>     getChannelData() seems to be to *set* the channel data.
>
>     We now have the copyChannelDataTo() method for getting a
>     persistent version of the channel data, while the getChannelData()
>     is used for getting a volatile (sometimes persistent!) version of
>     the channel data that may later be transfered back to the AudioBuffer.
>
>     IMO this is confusing (i.e. "get" ~= set/modify, "copy" == get),
>     to say the least.
>
>
> I see your point, but the idea of "getting a buffer" which you can 
> then read or write seems reasonably logical to me. Maybe we could 
> change the name to eliminate the illogical interpretation, but then 
> you lose compatibility. Doesn't seem worth it to me.

So I guess we agree that the main advantage of keeping the name of 
getChannelData() is backwards compatibility, and since that's a key 
design choice for this suggestion I can't really argue with that ;) Just 
wanted to point out a few oddities with the interface (though they might 
be obvious).

>
>     2) The getChannelData() method can return a persistent copy.
>
>     If you call getChannelData() on an AudioBuffer that is in use, you
>     will get a persistent copy of the AudioBuffer data (i.e. modifying
>     the data does nothing to the AudioBuffer, and the array will never
>     be neutered), which kind of goes against the purpose of
>     getChannelData(). Again, I find this quite confusing.
>
>     I think that a better solution would be to throw in that case (or
>     possibly return an empty array).
>
>
> I just disagree with this point. If you just want to read the buffer, 
> then having getChannelData always work instead of sometimes work is a 
> clear win. If you want to write to the buffer, the idea that 
> modifications will affect all future uses of the buffer and not any 
> uses currently running makes sense to me.

Maybe I've misunderstood the semantics? My impression was that sometimes 
modifications to an array returned by getChannelData() will modify the 
AudioBuffer, and sometimes it won't (ever). Or are you saying that if 
you getChannelData() on an "in use" AudioBuffer you'll get a copy of it, 
and if the AudioBuffer later goes inactive and then yet later is 
activated again, it would pick up the modifications to the array (and 
the array would be neutered at that point)?

If the latter is true, then I think it's logical.

>
>     3) The AudioProcessingEvent still uses AudioBuffers.
>
>     I realize that not all agree that this is a problem, but this
>     complicates the semantics of the AudioBuffer (among other things).
>     For instance, in an AudioProcessingEvent getChannelData() is used
>     both for getting the input channel data and for setting the output
>     channel data.
>
>     IMO the better solution would be to pass pure Float32Arrays in the
>     event handler instead (using neutering and ownership transfer as
>     necessary).
>
>
> Could be done. I doubt it's worth breaking compatibility however.

I agree - keeping the AudioBuffers in the interface is the only option 
for staying compatible. As I suggested, we could remedy this at a later 
time when we're designing the interfaces for worker based processor nodes.

>     4) Data transfer from JS to an AudioBuffer is implicit.
>
>     The data transfer from JS to an AudioBuffer is implicit by design,
>     rather than explicit. This is confusing, and could lead to
>     hard-to-find bugs.
>
>
> I'm not sure what you mean here.

My point here is that since the data transfer is implemented by a 
neutering operation, which is carried out at a few different API 
calls/operations (e.g. AudioBufferSourceNode.start()), it may not be 
100% clear to the developer when and why the array disappears. For 
instance, consider the following:

convolutionNode.buffer = buffer;
var a = buffer.getChannelData(0);
[1: write/modify a]
[2: analyze a]
convolutionNode.connect(context.destination);
[3: do some other stuff]

Now, if you refactor the code so that you move stuff between 2 & 3 for 
instance, or if you comment out parts of the code (e.g. the connect() 
call) in order to debug or test your code, you may end up having very 
different results (this is in fact a regression compared to the original 
shared memory design). If we required an explicit setChannelData() call, 
I think the situation would be much clearer to a developer (but maybe 
that's just me).

>
>     In general it's also sub-optimal from a performance perspective,
>     since it's easier to design a performance critical application if
>     you can limit possible performance hits to explicit points in your
>     code (e.g. let them happen during pre-processing/loading stages
>     rather than during playback stages). Now, since the proposal
>     relies heavily on neutering this might not be much of an issue,
>     but I still think that it's a good idea to at least *consider* an
>     implementation that does "acquire the contents" using a copy
>     operation.
>
>
> You mean you'd like "acquire the contents" to be an explicit operation 
> that could be executed before AudioBufferSourceNode.start(), right?
>
> There are a few ways we can improve the situation here:
> 0) We could add an explicit API to pre-acquire-the-contents in 
> AudioBufferSourceNode. I don't like this option since authors would 
> probably not use it, since it would have practically no effect in most 
> implementations.

I agree (unless the explicit "acquire-the-contents" operation was 
mandatory and the only way to update the AudioBuffer - but again, that 
would break compatibility).

> 1) I had originally specified that assigning to 
> AudioBufferSourceNode.buffer would acquire the contents of the buffer. 
> We could go back to that, and it would help with the above issue since 
> it could be performed during preprocessing/loading. It may be slightly 
> less compatible but I suspect we'd be OK.

Yes, it would probably help. We could do the same thing with the 
ConvolutionNode.buffer assignment. I can't really tell how much app 
breakage we would get though...

> 2) The UA can make AudioContext.decodeBufferData put the new 
> AudioBuffer in the arrays-neutered state (in the terminology of 
> https://wiki.mozilla.org/User:Roc/AudioBufferProposal#Implementation_Sketch). 
> So if it can't share memory, the buffer data will be in the right 
> place as long as the app doesn't call getChannelData on that AudioBuffer.

Did you mean AudioContext.decodeAudioData? I think it would only be 
logical for it to start in the arrays-neutered state since the most 
likely use case for such an AudioBuffer is to never be modified. I think 
you could argue that if you really want to access the decoded audio data 
from JS, you're very likely doing off-line processing anyway.

> 3) We could introduce a new constructor for AudioBuffer that takes a 
> sequence of Blobs/ArrayBuffers for the channels, neutering the 
> ArrayBuffers to steal their data. This constructor could put the new 
> ArrayBuffer in the arrays-neutered state similar to point 2.
>
> We should probably do #3 anyway since it's more efficient than 
> existing APIs if you already have Blobs or ArrayBuffers containing 
> samples (or if you want to use Blobs to share read-only data between 
> workers or domains). If we do #3 and the UA does #2 then maybe #1 
> wouldn't matter.
>

I agree. I seem to remember that most of us were in favor of something 
3:ish, which is also part of Jer's proposal 
(https://gist.github.com/jernoble/6034137#audiobuffer). In general, I'm 
all for solutions that don't prioritize the usage of getChannelData().

/Marcus

> Rob
> -- 
> Jtehsauts  tshaei dS,o n" Wohfy  Mdaon  yhoaus eanuttehrotraiitny  
> eovni le atrhtohu gthot sf oirng iyvoeu rs ihnesa.r"t sS?o  Whhei csha 
> iids  teoa stiheer :p atroa lsyazye,d  'mYaonu,r  "sGients  uapr,e  
> tfaokreg iyvoeunr, 'm aotr  atnod  sgaoy ,h o'mGee.t"  uTph eann dt 
> hwea lmka'n?  gBoutt  uIp  waanndt  wyeonut  thoo mken.o w *
> *


-- 
Marcus Geelnard
Technical Lead, Mobile Infrastructure
Opera Software
Received on Wednesday, 4 September 2013 08:12:27 UTC