Re: Aligning with ONNX (from minutes of 5 Sep 2019 call) from James Darpinian on 2019-09-11 (public-webmachinelearning@w3.org from September 2019)

From: James Darpinian <jdarpinian@google.com>
Date: Wed, 11 Sep 2019 15:54:37 -0700
To: Benjamin Poulain <bpoulain@apple.com>
Cc: Jonathan Bingham <binghamj@google.com>, public-webmachinelearning@w3.org, Dean Jackson <dino@apple.com>
Message-ID: <CAORar-xOERgMC_yx_OdBk15c9=N0YF1Yh28b+LjPVfiHPmoRAQ@mail.gmail.com>
Yes, I agree that we will want an independent API in the future. We believe
that the field is evolving too quickly for us to pick a winner and set an
API in stone right now. We see WebGL/WebGPU extensions as a way to ship
something small, sooner, with low risk.

Meanwhile we will continue to observe the development of tools and
standards such as Core ML, ONNX, DirectML, Android NNAPI, MLIR, NNVM/TVM,
Tensor Comprehensions, etc. In the future one or more of those will likely
be a good platform to base a Web ML API standard on, and once we ship a
standalone Web ML API we will have the flexibility to remove the
WebGL/WebGPU extensions if we want.

On Wed, Sep 11, 2019 at 2:57 PM Benjamin Poulain <bpoulain@apple.com> wrote:

>
> On 10 Sep 2019, at 5:32 PM, James Darpinian <
> jdarpinian@google.com> wrote:
>
> Totally agreed that dedicated ML hardware is very compelling and we need
> to think about exposing it. A WebGL/WebGPU based API could still support
> non-GPU accelerators. We could definitely expose optimized CPU ML kernels
> (e.g. Apple BNNS) through SwiftShader. For ML accelerators, as long as they
> support sharing memory with the GPU or CPU and running ops without too much
> setup overhead we should be able to support them. I think Android NNAPI
> would work for this but I haven't investigated Core ML yet.
>
>
> I think there is a lot of value in exposing ML capabilities independently
> from the GPU APIs.
>
> Benjamin
>
> On Tue, Sep 10, 2019 at 4:19 PM Benjamin Poulain <bpoulain@apple.com>
> wrote:
>
>> Hi Jonathan,
>>
>> One of the problems with GPU extensions is they would only enable one
>> narrow type of acceleration.
>>
>> iPhone and iPad have a wide range of ML capabilities built into the hardware.
>> For example, in addition to the GPU the latest iPhone has a 3rd
>> generation Neural-Engine and a new accelerator. Both of those provide
>> significant advantages over the GPU depending on the neural networks being
>> run.
>>
>> Native apps take advantage of those accelerators through CoreML.
>> I believe it would be beneficial for the web to expose at least some of
>> those capabilities through WebNN.
>>
>> Benjamin
>>
>> On 8 Sep 2019, at 8:50 PM, Jonathan Bingham <
>> binghamj@google.com> wrote:
>>
>> Hi Dean,
>>
>> I'm not sure how clear things were in the meeting notes, or to others who
>> attended the meeting itself. Let me try to summarize. There are multiple
>> teams at Google that have been involved, including in TensorFlow and
>> Chrome. So this is my personal summary, from conversations across the org.
>>
>> Our proposed path forward is to start by identifying a small number of
>> WebGL/WebGPU extensions that will provide significant performance gains not
>> already available, without tying them to broader operation or graph
>> standardization efforts. We believe it should be possible to bring real
>> performance improvement to the web sooner by starting small. Because these
>> low level APIs would be focused, it would not be necessary to make a larger
>> commitment to an external standard with the scope that ONNX has. And these
>> APIs could ship relatively soon.
>>
>> Our concerns are partly about getting too ambitious too soon. After
>> extensive experience with graphs and operations in TensorFlow, as well as
>> multiple iterations of the NN API (which is a graph API and an inspiration
>> for WebML), we have some doubts that these efforts will lead to a stable
>> API suitable for the web, or mobile or desktop use, in the next couple of
>> years. Within Chrome, we've seen similar challenges with Web Audio, which
>> also includes a graph API that is vastly simpler, in a field that isn't
>> evolving as quickly. Standardization at this intermediate level of
>> abstraction is difficult. There's a strong preference in Chrome to focus on
>> the lowest level standards.
>>
>> Currently the direction that the TensorFlow team is pursuing internally
>> for performance optimization, in collaboration with hardware vendors, is
>> MLIR, a multi-level intermediate representation based on LLVM. It's
>> definitely premature to think of MLIR as a web standard though. We're all
>> learning about this dynamic space.
>>
>> Perhaps in a couple years, after the first low level APIs have shipped,
>> we'll be more confident in standardizing on graphs. Or perhaps tensor
>> comprehensions or an IR will look like the right solution for the web.
>>
>> We believe that the explorations of the community group at the level of
>> graphs and operation sets are really valuable for learning and identifying
>> approaches to things like shared memory and custom operations, which will
>> help in future standards efforts. We don't want to stop the great work, and
>> we intend to continue to participate.
>>
>> I know I haven't said much about ONNX. That's intentional. Hopefully it's
>> a little clearer now where we're coming from. We believe it's premature,
>> for Google and for the Web, to standardize on operation sets or graphs. We
>> very much want to find a way to bring ML to the web, with hardware
>> acceleration. We see value in starting small and continuing to explore some
>> of the more ambitious ideas, without yet concluding they're the path
>> forward.
>>
>> Make sense? Feel free to ask if anything is still unclear. Also happy to
>> talk individually.
>>
>> Cheers,
>> Jonathan
>>
>> On Sun, Sep 8, 2019, 1:59 PM Dean Jackson <dino@apple.com> wrote:
>>
>>> Hi,
>>>
>>> Firstly, apologies that neither Ben nor I can make the teleconferences,
>>> so we were unable to say this in person.
>>>
>>> We noticed that there was discussion about not aligning with ONNX on the
>>> most recent call. This was slightly surprising since we (Apple) assumed
>>> that the decision in
>>> https://github.com/webmachinelearning/webnn/issues/17 was a resolution.
>>>
>>> While we didn't comment there, we would prefer to align with ONNX at the
>>> moment. Can we stick with this resolution for a while before investigating
>>> alternatives? What is the driving need for change right now? Unless I'm
>>> mistaken, the decision was to start with a small subset of ONNX and then
>>> see how compatible it is with JS frameworks. Is there new information?
>>>
>>> As Rafael pointed out in the meeting, ONNX has the advantage of being
>>> neutral (although there was a question about its neutrality, which I don't
>>> understand).
>>>
>>> Dean
>>>
>>>
>>>
>>>
>>
>
Received on Wednesday, 11 September 2019 22:55:12 UTC