Re: Specificity in the Web Audio API spec from Jussi Kalliokoski on 2012-03-30 (public-audio@w3.org from January to March 2012)

From: Jussi Kalliokoski <jussi.kalliokoski@gmail.com>
Date: Fri, 30 Mar 2012 16:37:29 +0300
To: "Wei, James" <james.wei@intel.com>
Cc: "public-audio@w3.org" <public-audio@w3.org>
Message-ID: <CAJhzemXAAXqXrxHCR2BzU5dC4aAVy1hBhyny4=-kBu7UiDFFzQ@mail.gmail.com>
> I think you raise some interesting points.  What is the goal here?  Are
> you expecting that independent implementations will always produce
> *exactly* the same output for the same input?

Yes, that would be quite ideal. Otherwise, if you need that precision (a DAW
can hardly afford to sound different on different platforms, especially on
such a crucial element as a delay node), you're going to have to exclude
browsers or resort to a JavaScript implementation of a tool that's
supposed to be predefined. That kind of defeats the purpose of having
predefined nodes, I think. And having these algorithms well defined in the
spec is something that pushes browser vendors to fix their implementations
instead of marking bugs as WontFix because the behavior follows a spec that
isn't defined well enough.

> I don't think the spec is intended to give a bit-exact implementation
> across all vendors.  I could be wrong though; Chris will have the
> definitive answer.

Yes, I'd be interested to hear what he thinks. We've had prior discussions
about this, and it seemed to me that we were mostly in consensus that it's
best if all implementations produce the same results. Pipe dream? Yes, very
much, but I think we should do our best to help browser vendors make
consistent implementations to keep the end developers from having to worry
about the inconsistencies. :)

> For your resampling issue, I think that would be a quality of
> implementation issue.  A good implementation will do a good job and a bad
> implementation will do a not so good job. This allows different vendors to
> "compete".  (That's my view point, coming from the cellular industry where
> many things are vaguely specified and you have to work hard to figure out
> how to make it work.  Perhaps audio is different.)

Perhaps the implementations can do a different job at it, but if we are
going to allow that, then it might be a good idea to make the
implementation expose some information about what it does (is it using ZOH,
linear interpolation, or a sinc filter, and if so, with which parameters)
to help the developer react to the situation with different filter
settings, etc. That might be catering to a very small audience that cares,
though.
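
A minimal sketch of why the method matters, in Python. It assumes a delay
line read at a fractional sample position; the function names are
hypothetical and are not taken from the spec or from any browser. It
compares zero-order hold (ZOH), nearest neighbor, and linear interpolation
(a sinc filter is left out for brevity).

```python
import math


def sample_at(signal, index):
    """Return signal[index], treating positions outside the buffer as silence."""
    if 0 <= index < len(signal):
        return signal[index]
    return 0.0


def read_delayed(signal, n, delay_samples, method):
    """Read the value at fractional position n - delay_samples.

    method is one of "zoh", "nearest", or "linear" (hypothetical labels).
    """
    pos = n - delay_samples
    if method == "zoh":
        # Zero-order hold: keep the most recent sample at or before pos.
        return sample_at(signal, math.floor(pos))
    if method == "nearest":
        # Nearest neighbor: round half up to the closest sample index.
        return sample_at(signal, math.floor(pos + 0.5))
    if method == "linear":
        # Linear interpolation between the two surrounding samples.
        i0 = math.floor(pos)
        frac = pos - i0
        return (1.0 - frac) * sample_at(signal, i0) + frac * sample_at(signal, i0 + 1)
    raise ValueError("unknown method: " + method)


# A ramp makes the differences easy to see.
ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
for method in ("zoh", "nearest", "linear"):
    print(method, read_delayed(ramp, 3, 1.25, method))
```

With a delay of 1.25 samples, the three methods return three different
values for the same input, which is the kind of cross-implementation
difference discussed above.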

Cheers,
Jussi



On Fri, Mar 30, 2012 at 4:51 AM, Wei, James <james.wei@intel.com> wrote:

>  (IIRC a certain Intel processor had a bug in its FFT implementation,
> having severe implications. Sorry I'm stating things without providing any
> valid references, maybe James Wei has more specifics on the incident :])
>
> I don’t know of any FFT bug in an Intel processor. There was an FDIV bug
> in the Pentium long ago (http://en.wikipedia.org/wiki/Pentium_FDIV_bug ),
> but that was a bug, not a problem with the standard.
>

Oops, sorry, yes, s/FFT/fdiv, I was just reading the FFT discussion we had
earlier, heh! Yes, the standard wasn't at fault, but it's still proof that
something you'd think is an industry standard is easy to get wrong.

> Best Regards
>
> James
>
> From: Jussi Kalliokoski [mailto:jussi.kalliokoski@gmail.com]
> Sent: Thursday, March 29, 2012 6:33 PM
> To: public-audio@w3.org
> Subject: Specificity in the Web Audio API spec
>
>
> Hello group!
>
> Now that we have already published the second working draft, I think it
> might be worthwhile starting to clarify some bits in the spec. The
> ambiguity of a large API like this was already discussed some time ago, but
> it's more relevant now, imho, since we're starting to expect prototype
> implementations from people other than the author of the spec. Specifying
> audio tools is hard, especially if you want to avoid specifying the exact
> implementation rather than the algorithm, but this needs to be done for the
> implementations to be consistent. We can't trust that browser vendors get
> "industry standard" algorithms right, especially if the "standard" is
> vague. Even well defined algorithms such as FFT have gone wrong in the past
> (IIRC a certain Intel processor had a bug in its FFT implementation, having
> severe implications. Sorry I'm stating things without providing any valid
> references, maybe James Wei has more specifics on the incident :]). Another
> example is that Math.sqrt gives incorrect results on Chrome on Windows, and
> they refuse to fix it because "it's according to the spec" [1].
>
> To demonstrate, let's start with the DelayNode (yes, even something as
> simple as this has a lot of ambiguity):
>
> 1) The time is given in seconds. How is this rounded to samples when
> adjusting in terms of the delay buffer? Should it be nearest neighbor?
> Floored? Or is the buffer fixed size with an adjustable playback rate,
> resampling on the fly? If so, what resampling method should be used?
> 2) "When the delay time is changed, the implementation must make the
> transition smoothly, without introducing noticeable clicks or glitches to
> the audio stream." This part is very vague, what does it mean? Basically,
> you could have the old buffer fade out and new buffer introduced with a
> transition, and it would be according to the spec. Not very desirable. On
> the other hand, the browser could be using exactly the algorithm you had
> planned for it, but if the source data contains clicks, it could be
> interpreted as a bug. So, what should be done to make the transition
> smooth? Should the implementation use a fixed size buffer like I explained
> in the first point?
>
> Cheers,
> Jussi
>
> [1] http://code.google.com/p/chromium/issues/detail?id=117699
>
Received on Friday, 30 March 2012 13:38:01 UTC