Re: Aligning with ONNX (from minutes of 5 Sep 2019 call)

Hi Dean,

I'm not sure how clear things were in the meeting notes, or to others who
attended the meeting itself. Let me try to summarize. Multiple teams at
Google have been involved, including TensorFlow and Chrome, so this is my
personal summary, drawn from conversations across the org.

Our proposed path forward is to start by identifying a small number of
WebGL/WebGPU extensions that will provide significant performance gains not
already available, without tying them to broader operation or graph
standardization efforts. We believe it should be possible to bring real
performance improvements to the web sooner by starting small. Because these
low-level APIs would be focused, it would not be necessary to make a larger
commitment to an external standard with the scope that ONNX has. And these
APIs could ship relatively soon.

Our concerns are partly about getting too ambitious too soon. After
extensive experience with graphs and operations in TensorFlow, as well as
multiple iterations of the NN API (which is a graph API and an inspiration
for WebML), we have some doubts that these efforts will lead to a stable
API suitable for the web, or mobile or desktop use, in the next couple of
years. Within Chrome, we've seen similar challenges with Web Audio, which
also includes a graph API that is vastly simpler, in a field that isn't
evolving as quickly. Standardization at this intermediate level of
abstraction is difficult. There's a strong preference in Chrome to focus on
the lowest-level standards.

Currently, the direction that the TensorFlow team is pursuing internally for
performance optimization, in collaboration with hardware vendors, is MLIR,
a multi-level intermediate representation based on LLVM. It's definitely
premature to think of MLIR as a web standard, though. We're all learning
about this dynamic space.

Perhaps in a couple of years, after the first low-level APIs have shipped,
we'll be more confident in standardizing on graphs. Or perhaps tensor
comprehensions or an IR will look like the right solution for the web.

We believe that the explorations of the community group at the level of
graphs and operation sets are really valuable for learning and identifying
approaches to things like shared memory and custom operations, which will
help in future standards efforts. We don't want to stop the great work, and
we intend to continue to participate.

I know I haven't said much about ONNX. That's intentional. Hopefully it's a
little clearer now where we're coming from. We believe it's premature, for
Google and for the Web, to standardize on operation sets or graphs. We very
much want to find a way to bring ML to the web, with hardware acceleration.
We see value in starting small and continuing to explore some of the more
ambitious ideas, without yet concluding they're the path forward.

Make sense? Feel free to ask if anything is still unclear. Also happy to
talk individually.

Cheers,
Jonathan

On Sun, Sep 8, 2019, 1:59 PM Dean Jackson <dino@apple.com> wrote:

> Hi,
>
> Firstly, apologies that neither Ben nor I can make the teleconferences, so
> were unable to say this in person.
>
> We noticed that there was discussion about not aligning with ONNX on the
> most recent call. This was slightly surprising since we (Apple) assumed
> that the decision in https://github.com/webmachinelearning/webnn/issues/17 was
> a resolution.
>
> While we didn't comment there, we would prefer to align with ONNX at the
> moment. Can we stick with this resolution for a while before investigating
> alternatives? What is the driving need for change right now? Unless I'm
> mistaken, the decision was to start with a small subset of ONNX and then
> see how compatible it is with JS frameworks. Is there new information?
>
> As Rafael pointed out in the meeting, ONNX has the advantage of being
> neutral (although there was a question about its neutrality, which I don't
> understand).
>
> Dean

Received on Monday, 9 September 2019 03:51:16 UTC