Re: Aligning with ONNX (from minutes of 5 Sep 2019 call) from Jonathan Bingham on 2019-09-10 (public-webmachinelearning@w3.org from September 2019)

From: Jonathan Bingham <binghamj@google.com>
Date: Tue, 10 Sep 2019 16:32:45 -0700
To: Benjamin Poulain <bpoulain@apple.com>, Greg Whitworth <gwhit@microsoft.com>
Cc: public-webmachinelearning@w3.org, Dean Jackson <dino@apple.com>
Message-ID: <CAEK6eFyqdQJvPz9LrsN9p9wrttqSqqKThGCFArxgKuUaYfcV8Q@mail.gmail.com>

Hi Benjamin,

Definitely agree that these low level APIs are not going to provide access
to the full performance benefits available to CoreML or NN API. And also
agree that we *want* a way for web apps to access the full power available
to native apps.

A few WebGL style ops could be a significant boost compared to today, and
should be easier to ship in the web platform in the near to medium term.
ONNX has over a hundred ops. PyTorch has over 200. TensorFlow has many more
than that. That's a pretty big API surface to standardize, and the op
definitions keep changing too. Maybe getting a couple of ops sorted out and
shipped is a prerequisite to getting more ambitious, and actually shipping
something sooner would be a good start?

Question for you: Is a graph API your ideal level of abstraction? Or would
you prefer a model loader and inference API, as has also been proposed in
the group? I was just talking with Greg Whitworth about the different
levels again.

Cheers,
Jonathan


On Tue, Sep 10, 2019 at 4:17 PM Benjamin Poulain <bpoulain@apple.com> wrote:

> Hi Jonathan,
>
> One of the problems with GPU extensions is they would only enable one
> narrow type of acceleration.
>
> iPhone and iPad have a wide range of ML capabilities built into the hardware.
> For example, in addition to the GPU, the latest iPhone has a 3rd generation
> Neural-Engine and a new accelerator. Both of those provide significant
> advantages over the GPU depending on the neural networks being run.
>
> Native apps take advantage of those accelerators through CoreML.
> I believe it would be beneficial for the web to expose at least some of
> those capabilities through WebNN.
>
> Benjamin
>
> On Sep 8, 2019, at 8:50 PM, Jonathan Bingham <
> binghamj@google.com> wrote:
>
> Hi Dean,
>
> I'm not sure how clear things were in the meeting notes, or to others who
> attended the meeting itself. Let me try to summarize. There are multiple
> teams at Google that have been involved, including in TensorFlow and
> Chrome. So this is my personal summary, from conversations across the org.
>
> Our proposed path forward is to start by identifying a small number of
> WebGL/WebGPU extensions that will provide significant performance gains not
> already available, without tying them to broader operation or graph
> standardization efforts. We believe it should be possible to bring real
> performance improvement to the web sooner by starting small. Because these
> low level APIs would be focused, it would not be necessary to make a larger
> commitment to an external standard with the scope that ONNX has. And these
> APIs could ship relatively soon.
>
> Our concerns are partly about getting too ambitious too soon. After
> extensive experience with graphs and operations in TensorFlow, as well as
> multiple iterations of the NN API (which is a graph API and an inspiration
> for WebML), we have some doubts that these efforts will lead to a stable
> API suitable for the web, or mobile or desktop use, in the next couple of
> years. Within Chrome, we've seen similar challenges with Web Audio, which
> also includes a graph API that is vastly simpler, in a field that isn't
> evolving as quickly. Standardization at this intermediate level of
> abstraction is difficult. There's a strong preference in Chrome to focus on
> the lowest level standards.
>
> Currently the direction that the TensorFlow team is pursuing internally
> for performance optimization, in collaboration with hardware vendors, is
> MLIR, a multi-level intermediate representation based on LLVM. It's
> definitely premature to think of MLIR as a web standard though. We're all
> learning about this dynamic space.
>
> Perhaps in a couple years, after the first low level APIs have shipped,
> we'll be more confident in standardizing on graphs. Or perhaps tensor
> comprehensions or an IR will look like the right solution for the web.
>
> We believe that the explorations of the community group at the level of
> graphs and operation sets are really valuable for learning and identifying
> approaches to things like shared memory and custom operations, which will
> help in future standards efforts. We don't want to stop the great work, and
> we intend to continue to participate.
>
> I know I haven't said much about ONNX. That's intentional. Hopefully it's
> a little clearer now where we're coming from. We believe it's premature,
> for Google and for the Web, to standardize on operation sets or graphs. We
> very much want to find a way to bring ML to the web, with hardware
> acceleration. We see value in starting small and continuing to explore some
> of the more ambitious ideas, without yet concluding they're the path
> forward.
>
> Make sense? Feel free to ask if anything is still unclear. Also happy to
> talk individually.
>
> Cheers,
> Jonathan
>
> On Sun, Sep 8, 2019, 1:59 PM Dean Jackson <dino@apple.com> wrote:
>
>> Hi,
>>
>> Firstly, apologies that neither Ben nor I can make the teleconferences,
>> so we were unable to say this in person.
>>
>> We noticed that there was discussion about not aligning with ONNX on the
>> most recent call. This was slightly surprising since we (Apple) assumed
>> that the decision in
>> https://github.com/webmachinelearning/webnn/issues/17 was a resolution.
>>
>> While we didn't comment there, we would prefer to align with ONNX at the
>> moment. Can we stick with this resolution for a while before investigating
>> alternatives? What is the driving need for change right now? Unless I'm
>> mistaken, the decision was to start with a small subset of ONNX and then
>> see how compatible it is with JS frameworks. Is there new information?
>>
>> As Rafael pointed out in the meeting, ONNX has the advantage of being
>> neutral (although there was a question about its neutrality, which I don't
>> understand).
>>
>> Dean
>>
>>
>>
>>
>
Received on Tuesday, 10 September 2019 23:33:20 UTC