Re: Binary vs Text from James Darpinian on 2018-11-07 (public-gpu@w3.org from November 2018)

From: James Darpinian <jdarpinian@google.com>
Date: Wed, 7 Nov 2018 10:14:42 -0800
To: mjs@apple.com
Cc: Kenneth Russell <kbr@google.com>, "Myles C. Maxfield" <mmaxfield@apple.com>, public-gpu@w3.org
Message-ID: <CAORar-wQrHzTaLGgHSMBUFCNd5a4=+r5MA32Erh1ajzUNEPhpw@mail.gmail.com>
> JavaScript and WebAssembly have comparable interoperability and security.

Yes, let's take the example of JavaScript vs. WebAssembly as it's the most
similar thing in the web platform today to what we're debating.

JavaScript today enjoys a high level of compatibility, at least for
features older than a year or two. However, it wasn’t always that way. It
took decades to get here. In the early years of JavaScript portability
issues were common. We don’t have decades to get WebGPU right, nor can we
afford to spend the same amount of engineering resources on our shader
compilers as we do on our JavaScript implementations to achieve that
compatibility.

On the security front, vulnerabilities in JavaScript implementations are
common
<https://bugs.chromium.org/p/project-zero/issues/list?can=1&q=lokihardt+-%22WebKit%3A+UXSS%22&colspec=ID+Status+Restrict+Reported+Vendor+Product+Finder+Summary&num=200>.
Many can be blamed on the complexity of implementing an evolving
human-readable language, and the aggressive optimizations you have to do
when you can’t assume a compiler has already processed the code. We rely on
sandboxing to protect users when JavaScript implementations fail.
Unfortunately GPU drivers constrain how effectively we can sandbox WebGPU
implementations.

Now looking at WebAssembly, it also enjoys a high level of compatibility,
which was achieved in less time with less engineering effort compared to
JavaScript. I think it is too early to fully judge the security of
WebAssembly implementations, but while there have been vulnerabilities,
Project Zero has this to say
<https://googleprojectzero.blogspot.com/2018/08/the-problems-and-promise-of-webassembly.html>:
“compared to other recent browser features, surprisingly few
vulnerabilities have been reported in [WebAssembly]. This is likely due to
the simplicity of the current design [...]”.

Binary vs. text is probably not the right thing to focus on here. The
simplicity of WebAssembly compared to JavaScript, and SPIR-V compared to
HLSL, is not due to their use of a binary format. Of course, both have text
formats that we could use if there were any advantage to them. The
difference is that implementing a programming language with strong human
readability and writability constraints is more complex than implementing a
compiler intermediate language without those constraints.
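
To make that difference concrete, here is a rough sketch in Python. It is only an illustration under stated assumptions: the names (expr_depth, Instr, validate_ssa) are hypothetical, the IR is a toy straight-line one, and none of this is ANGLE's code or SPIR-V's actual validation rules. It contrasts the kind of depth check a front end needs for nested source expressions with the single linear pass that suffices for a flat, SSA-style instruction list using numeric ids.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union

# Hypothetical nested expression tree, as a text-language front end might build:
# either a variable name, or a tuple (op, lhs, rhs).
Expr = Union[str, Tuple[str, "Expr", "Expr"]]


def expr_depth(e: Expr) -> int:
    """Return the nesting depth of an expression tree.

    Walks the tree with an explicit stack, so the check itself cannot
    overflow the call stack on a maliciously deep input.
    """
    max_depth = 0
    stack = [(e, 1)]
    while stack:
        node, depth = stack.pop()
        max_depth = max(max_depth, depth)
        if isinstance(node, tuple):
            _, lhs, rhs = node
            stack.append((lhs, depth + 1))
            stack.append((rhs, depth + 1))
    return max_depth


@dataclass(frozen=True)
class Instr:
    """One instruction of a toy straight-line SSA IR (hypothetical)."""
    result: int                 # numeric result id, assigned exactly once
    op: str
    operands: Tuple[int, ...]   # ids of earlier results or inputs


def validate_ssa(instrs: List[Instr], inputs: frozenset = frozenset()) -> bool:
    """Check single assignment and definition-before-use in one linear pass.

    There are no names, scopes, or shadowing to resolve, and no nesting
    to bound: every operand is just a number that must already be defined.
    """
    defined = set(inputs)
    for ins in instrs:
        if ins.result in defined:
            return False  # id assigned more than once
        if any(o not in defined for o in ins.operands):
            return False  # operand used before it is defined
        defined.add(ins.result)
    return True
```

For example, validate_ssa([Instr(2, "add", (0, 1))], frozenset({0, 1})) accepts the program, while an operand id that was never defined is rejected. Any limit applied to the result of expr_depth would be an implementation-chosen threshold, not something taken from a spec.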

On Tue, Nov 6, 2018 at 9:58 PM Maciej Stachowiak <mjs@apple.com> wrote:

>
>
> On Nov 6, 2018, at 3:15 PM, Ken Russell <kbr@google.com> wrote:
>
> Hi Myles,
>
> Our viewpoint is based on the experience of using GLSL as WebGL's input
> language, and dealing with hundreds of bugs associated with parsing,
> validating, and passing a textual shading language through to underlying
> drivers.
>
> Kai wrote this up at the beginning of the year in this Github issue:
> https://github.com/gpuweb/gpuweb/issues/44 , and there is a detailed bug
> list (which is still only a sampling of the associated bugs we fixed over
> the years) in this spreadsheet:
>
> https://docs.google.com/spreadsheets/d/1bjfZJcvGPI4M6Df5HC8BPQXbl847RpfsFKw6SI6_T30/edit#gid=0
>
> Unlike what I said on the call, the main issues aren't really around the
> parsing of the input language or string handling. Both the preprocessor's
> and compiler's parsers in ANGLE's shader translator are autogenerated from
> grammars. Of more concern were situations where we had to semi-arbitrarily
> restrict the source language so that we wouldn't pass shaders through to
> the graphics driver which would crash its own shader compiler. Examples
> included having to restrict the "complexity" or "depth" of expression
> trees
> <https://chromium.googlesource.com/angle/angle/+/eb1a010f0f996b3742fd34b92ffaf9014c943528>
> to avoid stack overflows in some drivers (this was added as an
> implementation-specific security workaround rather than to the spec),
> working around bugs in variable scoping and shadowing
> <https://chromium.googlesource.com/angle/angle/+/855d964bd0d05f6b2cb303f625506cf53d37e94f>,
> defeating incorrect compiler optimizations, and more. Please take the time
> to read Kai's writeup and go through the spreadsheet.
>
>
> It seems to me these issues aren’t about the language that is input to the
> WebGPU (or WebGL) implementation at all, but rather what it outputs and
> what drivers ingest.
>
> However, what languages we output (and what drivers then ingest) will be
> the same regardless of the source language. What the CG has been discussing
> is these two system designs:
>
> (A) [ WHLSL source ] —> [ WebGPU implementation ] —> [  One of: Vulkan
> SPIR-V, MSL, HLSL, or DXIL ] —> [ Driver ]
> (B) [ WebGPU SPIR-V source ] —> [ WebGPU implementation ] —> [ One of:
> Vulkan SPIR-V, MSL, HLSL, or DXIL ] —> [ Driver ]
>
>
> Regardless of the input language to the WebGPU implementation, the driver
> will be presented with the same possible range of languages, which include
> both text-based and binary languages. If the problem is not with parsing or
> processing at the time of ingesting the source language, but rather the
> results of sending a text-based language to a driver, then this information
> has no impact on what we choose as a source language. Conflating the input
> to WebGPU with the input to the driver is an unsound argument.
>
> I feel like others on this thread have made the same mix-up, so let’s all
> try to be clear about the difference between the input language, and what
> is fed to the driver.
>
>
> Note also that I distinguish “WebGPU SPIR-V” from “Vulkan SPIR-V”. We’ve
> been speaking of SPIR-V as if it’s a single language, but really it’s a
> language family with a bunch of parameters and variations that can be
> defined by an execution environment. The SPIR-V accepted by Vulkan is not
> the same as the set accepted by OpenCL, for example. Similarly, the initial
> informal document describing how SPIR-V could be handled safely on the web
> includes both validation steps (which impose restrictions by rejecting some
> shaders) and transforms (which translate some constructs into others, e.g.,
> adding bounds checks in certain places). Hopefully this is already clear to
> everyone, but I wanted to emphasize the point.  It’s not clear to me at
> this time how this would affect already existing Vulkan SPIR-V drivers
> since the WebGPU dialect of SPIR-V is not yet fully defined.
>
>
>
>
> The question will come up: would using a lower-level representation like
> SPIR-V for WebGPU's shaders really address these problems? I think it
> would. SPIR-V uses SSA form and simple numbers for variables, which will
> eliminate entire classes of bugs in mishandling of language-level
> identifiers, variables, and scopes.
>
>
> This seems like a very speculative hypothesis, and it does not match my
> intuition, when thinking about the system as a whole.
>
> These issues won’t affect implementations using Vulkan under the covers in
> any case, since they will be fed Vulkan SPIR-V and will not know about
> scopes in the source language, whatever it is. But they will affect a
> Metal-based implementation regardless, since WebGPU SPIR-V would need to be
> transformed to text-based MSL, in the process introducing identifiers,
> variables and scopes. It might be that the MSL looks like SSA form, which
> simplifies some issues. But a text-based input format like WHLSL can also
> be transformed into SSA-looking MSL, if that was for some reason more
> robust or more convenient.
>
> Now, in practice, Metal is unlikely to have the same kinds of issues as
> OpenGL drivers, since it decouples parsing from the individual driver. But
> if it did, then the result of compiling WHLSL to MSL can use all the same
> techniques as compiling WHLSL to SPIR-V, and then compiling that to MSL.
>
>
>
>
> SPIR-V's primitives are lower level than those in a textual shader
> language,
>
>
> SSA form can express basically all the same things that an AST can, so the
> lower levelness doesn’t really restrict what you can throw at, say, a Metal
> driver.
>
> and if it turns out restrictions on shaders are still needed in WebGPU's
> environment spec in order to work around driver bugs, they'll be easier to
> define more precisely against SPIR-V than source text.
>
>
> I’m not sure this is true. This would be affected more by the precision of
> the spec than the choice of wire format. And WHLSL’s draft specification is
> already much more precise than the SPIR-V spec, closer to the level
> WebAssembly is at.
>
>
> Using SPIR-V as WebGPU's shader ingestion format would bring other
> advantages, including that it's based on years of experience developing a
> portable binary shader representation, and has been designed in conjunction
> with GPU vendors across the industry.
>
>
> But it will also have significant disadvantages, like requiring the
> serving of large piles of JavaScript to do online compilation for those use
> cases that require it.
>
>
> On the conference call I didn't mean to over-generalize the topic to
> "binary formats vs. text formats in the browser", so apologies if I
> misspoke.
>
>
> It’s good that you agree the problems of GLSL don’t generalize to all text
> formats. For the record though, I would like to cite something I wrote
> offline, in case anyone does feel this way:
>
> -----------
> I notice you present GLSL as the only example where there was a series of
> problems like this. And you did not address Myles’s point that browsers
> ingest many text and binary formats, and at least in our experience, we do
> not find the binary formats to be more robust or more secure. There are
> three possible conclusions to draw from the experience with GLSL:
>
> (1) GLSL specifically had unique problems with robustness and security
> (due to driver design, insufficiently precise specification, etc).
> (2) All text-based shader languages for the web would have the same
> problem as GLSL, but binary shader languages for the web would not.
> (3) All text-based languages of any kind for the web would have the same
> problem as GLSL, but binary formats for the web would not.
>
> We have evidence that statement (3) is false: looking at the whole range
> of languages and formats for the web, binary formats are not historically
> more robust or more secure on the whole. Some binary formats that used to
> be pervasive on the web, such as Java and SWF, have been abandoned
> precisely because their implementations are not robust. PDF requires
> sandboxing just as much as HTML. JavaScript and WebAssembly have comparable
> interoperability and security. Comparing head-to-head examples from the same
> categories, there is no evidence that binary formats are more robust across
> the board.
>
> We have some evidence that at least statement (1) is true, based on your
> bug list above.
>
> So the remaining potential point of dispute is statement (2). Based on
> GLSL, you extrapolate to all text-based shader languages. Based on other
> text and binary languages for the web, we extrapolate to text and binary
> shader languages. Why do you think (1) has more bearing on a new shader
> language than (3)? It seems to me that the difference between GLSL and more
> successful text-based formats for the web is not due to being a shader
> language. Rather, it’s due to factors specific to GLSL, such as
> insufficiently precise specification, and use of a preprocessor. I don’t
> see why you would extrapolate to languages where these factors don’t apply.
> -----------
>
> It seems like, in the latest discussion, the cited problems aren’t with
> processing of a text language as input, but rather in feeding a text-based
> language to drivers that all have their own (potentially inconsistent and
> quirky) implementations. I’m glad! This means we don’t need to debate text
> vs binary in the abstract.
>
> But again, our choice of source language has absolutely no effect on
> whether we face these problems; we still have to work with the same drivers.
>
>
>
>
> -Ken
>
>
>
> On Mon, Nov 5, 2018 at 10:58 PM Myles C. Maxfield <mmaxfield@apple.com>
> wrote:
>
>> Hi!
>>
>> When we were discussing WebGPU today, the issue of binary vs text was
>> raised. We are confused at the viewpoint that binary languages on the Web
>> are inherently safer and more portable than text ones. All of our browsers
>> accept HTML, CSS, JavaScript, binary image formats, binary font files,
>> GLSL, and WebAssembly, and so we don’t understand how our teams came to
>> opposite conclusions given similar circumstances.
>>
>> Can you describe the reasons for this viewpoint (as specifically as
>> possible, preferably)? We’d like to better understand the reasoning.
>>
>> Thanks,
>> Myles
>>
>
>
Received on Wednesday, 7 November 2018 18:17:34 UTC