Re: Binary vs Text from Maciej Stachowiak on 2018-11-13 (public-gpu@w3.org from November 2018)

From: Maciej Stachowiak <mjs@apple.com>
Date: Mon, 12 Nov 2018 20:12:12 -0800
To: Josh Carpenter <joshcarpenter@google.com>
Cc: josh@joshgroves.com, Filip Pizlo <fpizlo@apple.com>, Jeff Gilbert <jgilbert@mozilla.com>, Kenneth Russell <kbr@google.com>, "Myles C. Maxfield" <mmaxfield@apple.com>, public-gpu@w3.org
Message-id: <D57B8818-4931-4B05-BAAF-244B9AC39D3E@apple.com>
> On Nov 12, 2018, at 5:48 PM, Josh Carpenter <joshcarpenter@google.com> wrote:
> 
> > Is there a way web authors could be polled to determine whether this is a concern that this group needs to address? This has been raised previously, but this ability is already heavily restricted by the output of many modern web toolchains. As others have stated before, source maps can still be used to include shader source code if the author prefers to do so.
> 
> If WebGL is any indication, the majority of developers and designers will be leveraging WebGPU through frameworks (à la Three.js, A-Frame, Babylon) and will never view the underlying source. In my day to day, WebGL = Three.js, for all intents and purposes. Assuming that's true, perhaps the question then shifts to, "what degree of readability (or other requirements) do the framework developers want?"

Our charter <https://gpuweb.github.io/admin/cg-charter.html> says that these are our target audiences:

 • Developers of 3D and game engines/tools that are producing Web content via transpilation. For example, Unity and the Unreal Engine use emscripten to compile content for the Web.
 • JavaScript framework developers who are building GPU libraries, intended to be used in Web content, but providing a higher-level API and hiding much of the low-level graphics and compute details from their users. For example, three.js.
 • Web developers who are competent in GPU technologies, and will want to use the GPU for the Web API directly to create content, rather than a higher-level framework.


So we definitely care a lot about the framework case. Three.js in particular relies on runtime shader generation, which is obviously more practical if WebGPU accepts a text-based language.
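
As a rough illustration of what runtime shader generation means here, the sketch below assembles shader source text from feature flags chosen at runtime. Everything in it is an assumption made for illustration: the MaterialFeatures type, the generateFragmentShader function, and the GLSL-like output are hypothetical and are not taken from Three.js or from any WebGPU proposal.

```typescript
// Hypothetical feature flags a framework might derive from a material at runtime.
interface MaterialFeatures {
  useTexture: boolean;
  lightCount: number;
}

// Assemble fragment shader source text from the flags. The GLSL-like output
// is illustrative only; it is not what Three.js actually generates.
function generateFragmentShader(features: MaterialFeatures): string {
  const lines: string[] = ["precision mediump float;"];
  if (features.useTexture) {
    lines.push("uniform sampler2D map;", "varying vec2 vUv;");
  }
  lines.push("void main() {");
  lines.push(
    features.useTexture
      ? "  vec4 color = texture2D(map, vUv);"
      : "  vec4 color = vec4(1.0);"
  );
  // Emit one placeholder statement per light, as an unrolled loop would.
  for (let i = 0; i < features.lightCount; i++) {
    lines.push(`  color.rgb *= 1.0; // placeholder for light ${i}`);
  }
  lines.push("  gl_FragColor = color;", "}");
  return lines.join("\n");
}
```

With a text-based input language, a framework can hand a string like this to the API directly; with a binary input format, the same framework would need some way to turn its generated text into that format on the client.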

Folks have suggested that alternative features could replace online compilation, but it’s not clear whether they would cover all the use cases.

Another suggestion is to provide a compiler from WHLSL (or another language) to WebSPIR-V that can run in the browser using either JavaScript or WebAssembly. To me, this latter option seems like an abdication of responsibility. It is also likely not great for performance or startup time.

Regards,
Maciej


> 
> On Mon, Nov 12, 2018 at 5:40 PM Joshua Groves <josh@joshgroves.com> wrote:
> > If you consider that web folks will want a textual view-source for shaders, then SPIR-V may end up being significantly more annoying.
> 
> Is there a way web authors could be polled to determine whether this is a concern that this group needs to address? This has been raised previously, but this ability is already heavily restricted by the output of many modern web toolchains. As others have stated before, source maps can still be used to include shader source code if the author prefers to do so.
> 
> Josh
> 
> On Wed, Nov 7, 2018 at 8:46 AM Filip Pizlo <fpizlo@apple.com> wrote:
> 
> 
> > On Nov 6, 2018, at 6:55 PM, Jeff Gilbert <jgilbert@mozilla.com> wrote:
> > 
> > I don't think it's necessarily helpful to think of this discussion as
> > predominately binary vs text.
> > 
> > I think there is a lot of value in a constrained, targeted ingestion
> > format, *and separately* I think SPIR-V is a natural choice for this
> > ingestion format.
> 
> WHLSL is a constrained ingestion format that happens to be human readable. 
> 
> I think it’s a hard judgement call whether SPIR-V or WHLSL is easier to ingest. If you consider that web folks will want a textual view-source for shaders, then SPIR-V may end up being significantly more annoying. Also, a lot depends on the as-yet-not-defined web security model for SPIR-V.
> 
> > 
> > SPIR-V's core format is very, very easy to parse, and lends itself
> > well to simple but robust parsing. Lifetimes are clearly expressed,
> > instruction invocations are very explicit, and ecosystem support is
> > already good. It's a dream format for ingestion.
> > 
> > Binning it with other (particularly older) binary formats is just
> > inaccurate.
> 
> Binning WHLSL with other (particularly older) text formats is just inaccurate. 
> 
> > Doing the initial parse gives you the structures
> > (functions, types, bindings) you want pretty immediately. By
> > construction, most unsafe constructs are impossible or trivially
> > validatable. (SSA, instruction requirements, unsafe types, pointers)
> 
> WHLSL is safe by construction as well!  Even more so since there is nothing that WebGPU will have to explicitly exclude via a separate security model document. 
> 
> > 
> > For what it's worth, text formats are technically binary formats
> > with a charset. I would rather consume a constrained,
> > rigidly-structured (SSA-like? s-expressions?) text-based assembly
> > than some binary formats I've worked with. (DER, ugh!)
> > 
> > Disentangling our ingestion format from the pressures of both
> > redundancies and elisions that are desirable in directly-authored
> > languages, simplifies things and actually prevents ambiguity. It
> > immediately frees the authoring language to change and evolve at a
> > faster rate, and tolerates more experimentation.
> > 
> > I would rather solve the compilation tool distribution use-case
> > without sacrificing simplicity and robustness in ingestion. A
> > authoring-to-ingestion language compiler in a JS library would let us
> > trivially share everything above the web-IR->host-IR translation,
> > including optimization passes.
> >> On Tue, Nov 6, 2018 at 3:16 PM Ken Russell <kbr@google.com> wrote:
> >> 
> >> Hi Myles,
> >> 
> >> Our viewpoint is based on the experience of using GLSL as WebGL's input language, and dealing with hundreds of bugs associated with parsing, validating, and passing a textual shading language through to underlying drivers.
> >> 
> >> Kai wrote this up at the beginning of the year in this Github issue: https://github.com/gpuweb/gpuweb/issues/44 , and there is a detailed bug list (which is still only a sampling of the associated bugs we fixed over the years) in this spreadsheet:
> >> https://docs.google.com/spreadsheets/d/1bjfZJcvGPI4M6Df5HC8BPQXbl847RpfsFKw6SI6_T30/edit#gid=0
> >> 
> >> Unlike what I said on the call, the main issues aren't really around the parsing of the input language or string handling. Both the preprocessor's and compiler's parsers in ANGLE's shader translator are autogenerated from grammars. Of more concern were situations where we had to semi-arbitrarily restrict the source language so that we wouldn't pass shaders through to the graphics driver which would crash its own shader compiler. Examples included having to restrict the "complexity" or "depth" of expression trees to avoid stack overflows in some drivers (this was added as an implementation-specific security workaround rather than to the spec), working around bugs in variable scoping and shadowing, defeating incorrect compiler optimizations, and more. Please take the time to read Kai's writeup and go through the spreadsheet.
> >> 
> >> The question will come up: would using a lower-level representation like SPIR-V for WebGPU's shaders really address these problems? I think it would. SPIR-V uses SSA form and simple numbers for variables, which will eliminate entire classes of bugs in mishandling of language-level identifiers, variables, and scopes. SPIR-V's primitives are lower level than those in a textual shader language, and if it turns out restrictions on shaders are still needed in WebGPU's environment spec in order to work around driver bugs, they'll be easier to define more precisely against SPIR-V than source text. Using SPIR-V as WebGPU's shader ingestion format would bring other advantages, including that it's based on years of experience developing a portable binary shader representation, and has been designed in conjunction with GPU vendors across the industry.
> >> 
> >> On the conference call I didn't mean to over-generalize the topic to "binary formats vs. text formats in the browser", so apologies if I misspoke.
> >> 
> >> -Ken
> >> 
> >> 
> >> 
> >>> On Mon, Nov 5, 2018 at 10:58 PM Myles C. Maxfield <mmaxfield@apple.com> wrote:
> >>> 
> >>> Hi!
> >>> 
> >>> When we were discussing WebGPU today, the issue of binary vs text was raised. We are confused at the viewpoint that binary languages on the Web are inherently safer and more portable than text ones. All of our browsers accept HTML, CSS, JavaScript, binary image formats, binary font files, GLSL, and WebAssembly, and so we don’t understand how our teams came to opposite conclusions given similar circumstances.
> >>> 
> >>> Can you describe the reasons for this viewpoint (as specifically as possible, preferably)? We’d like to better understand the reasoning.
> >>> 
> >>> Thanks,
> >>> Myles
> > 
> 
> 
> 
> -- 
> Josh Carpenter
> UX Lead for WebVR/AR, Daydream
> joshcarpenter@google.com
Received on Tuesday, 13 November 2018 04:12:41 UTC