Re: Uniformity analysis for the WebGPU Shading Language. from Robin Morisset on 2019-07-11 (public-gpu@w3.org from July 2019)

From: Robin Morisset <rmorisset@apple.com>
Date: Thu, 11 Jul 2019 15:28:30 -0700
To: Ken Russell <kbr@google.com>
Cc: David Neto <dneto@google.com>, Doug Moen <doug@moens.org>, public-gpu <public-gpu@w3.org>
Message-id: <60895CB4-B862-4AC3-BF0B-121F0E19D1FC@apple.com>
On Jul 11, 2019, at 14:55, Ken Russell <kbr@google.com> wrote:
> 
> Very nice work Robin formalizing these rules.

Thanks

> If a shader passes these constraints, is it likely that all underlying shader compilers and/or SPIR-V drivers will accept them, too? (After translation from WHLSL to the lower representation, of course.) fxc in particular apparently does some very complex uniformity analysis. I assume that it would only potentially accept some shaders that this analysis would reject, and not the other way around?

I would hope so, but have not tested it so far. In particular I have been told that the uniformity analysis from fxc is both complex and poorly documented, making any proof that it will accept any shader that passes these constraints pretty difficult.

> 
> -Ken
> 
> 
> On Tue, Jul 9, 2019 at 3:12 PM David Neto <dneto@google.com <mailto:dneto@google.com>> wrote:
> 
> The Uniform and UniformId decorations are for the programmer to assert (promise) that certain values are actually uniform when it is infeasible or unreasonable for the receiving implementation to do so.
> 
> Robin's analysis is a low-complexity (highly feasible) analysis.  It is intentionally conservative, which is the direction the group had agreed to.
> 
> So really they are for different usage patterns:  
> - General SPIR-V (e.g. for Vulkan) has the decorations for the scenario where the app is highly trusted by the driver.  It favours high expressibility.
> - WebGPU uniformity analysis is meant for low-trust or "big guardrails" execution.   This favours super-consistent behaviours across implementations, at the expense of what you are allowed to express.
> 
> 
> In any case, in Robin's analysis verification and inference have the same complexity. (I'd bet beers that they are exactly the same big-Oh complexity)

I have a proof that it is indeed linear-time, and inference can hardly be faster than that (since it needs to read the program).
So the exact same Big-O complexity, although my analysis probably has a somewhat higher constant.

> 
> cheers,
> david
> 
> 
> 
> 
> On Tue, Jul 9, 2019 at 8:45 AM Doug Moen <doug@moens.org <mailto:doug@moens.org>> wrote:
> Robin said: "Everyone: does this kind of analysis sound acceptable? From having tried, I doubt a uniformity inference analysis can be made much simpler and remain usable. The only other option I could see would be adding a ton of uniformity annotations to the language, and then having the browser only verify them instead of trying to infer them."
> 
> Also, Robin's uniformity inference code "should be adaptable to WebSPIRV with a moderate amount of work if they require structured control flow".
> 
> It seems to me that the SPIR-V ecosystem already has an established way of doing this. SPIR-V has Uniform and UniformId decorations (section 3.20 of SPIR-V spec), and the SPIR-V Validator tool checks that decorations are correctly applied. SPIR-V supports multiple execution environments: GLSL450, OpenCL and Vulkan, with different validity checks for different environments.
> 
> If the decision is made to standardize on WebSPIR-V instead of WHLSL, does it make sense to require uniformity decorations, and then have the browser only verify them instead of trying to infer them? The SPIR-V Validator tool could be extended to enforce correct usage of Uniform and UniformId decorations for the WebGPU execution environment.
> 
> I'm not really a SPIR-V expert, so I'm asking if this makes sense.
> 
> Doug Moen.
> 
> On Fri, Jul 5, 2019, at 8:51 PM, Robin Morisset wrote:
>> (Sending this email again, as it appears to have hit some error the previous time; if you already received it on Monday, it has not changed one bit).
>> 
>> Hello,
>> 
>> I finally finished writing a new version of the uniformity inference algorithm.
>> It is presented with both formal rules and pseudo-code that roughly follows C++ syntax, and should be understandable and implementable by only looking at the pseudo-code.
>> 
>> Quick reminder: the goal of this is to be able to say “this control-barrier or derivative is in uniform control flow so it is well-defined”, or “this control-barrier or derivative is in possibly non-uniform control-flow so the program should be rejected” (as we cannot ensure that the underlying platform only does safe things in that case).
>> 
>> Compared with the old uniformity analysis I added last fall to WHLSL and removed in the winter, this one has several strengths:
>> - Supports function calls, without an exponential complexity explosion
>> - Should be linear-time
>> - Has none of the non-determinism that was inherent in the previous analysis, making the implementation strategy a lot more explicit
>> - The primary presentation is pseudo-code which should hopefully be more readable to the committee.
>> - It is presented as its own validation pass, instead of being interleaved with the type checking of WHLSL.
>> - Makes it fairly easy to provide informative error messages
>> 
>> It is written for a subset of WHLSL (with for loops, branches, assignments, function calls, pointers, arrays and variables, but for example I omitted structures, while loops, switches, etc..). It should trivially be extendable to the rest of WHLSL if we decide to adopt it. It should be adaptable to WebSPIRV with a moderate amount of work if they require structured control flow, that is well-nested constructs like for/if instead of a soup of goto. (David said they would, the SPIRV spec seems to only require it in some cases).
>> 
>> Questions:
>> - David: does WebSPIRV requires all of its control-flow to be structured (no arbitrary goto) ?
>> - Everyone: does this kind of analysis sound acceptable? From having tried, I doubt a uniformity inference analysis can be made much simpler and remain usable. The only other option I could see would be adding a ton of uniformity annotations to the language, and then having the browser only verify them instead of trying to infer them.
>> I’d like to add a discussion of the uniformity problem and this analysis to next week’s meeting.
>> 
>> Idea
>> The general idea is to go through each function once, starting from the leaves of the call-graph, and for each one compute the conditions under which it is valid and under which its return is uniform. This computation is achieved through building a set of implications of the form “if x must be uniform, then y must be uniform”. I treat these implications as edges in a graph whose nodes correspond to variable names or points in the control-flow. Then solving this set of constraints is a simple DFS, from the nodes that trivially must always be uniform (e.g. the control-flow at the site of a call to control barrier), and we succeed if we don’t reach a node that trivially must never be uniform (e.g. the threadID).
>> In the pseudo-code below, I use integers to represent the nodes, and have two integers reserved for the special nodes “mustBeUniform” and “cannotBeUniform”.
>> A node x that trivially must always be uniform has mustBeUniform => x in the graph, and similarly a node y that trivially never must be uniform has  y => cannotBeUniform.
>> 
>> Examples
>> Let’s go through a few examples before listing the actual rules.
>> 
>> If (tid)
>> control_barrier();
>> tid is a variable which is guaranteed to not be uniform. This is represented by an implication tid => cannotBeUniform
>> The branch causes the control-flow inside the branch to depend on both the control-flow out of the loop and on the condition.
>> So if we note the control-flow out of the loop cf, and the control flow inside the loop cf2, we have cf2 => cf and cf2 => tid
>> Finally, control_barrier() is a function that requires the control-flow at each of its call points to be uniform, so mustBeUniform => cf2.
>> We have a path mustBeUniform => cf2 => tid => cannotBeUniform, so this program fragment is invalid. And if we want to provide a nice error message, we merely need to annotate each implication with its origin, and then we can say something like “the control-flow on line … must be uniform because of control_barrier() so tid on line … must be uniform, which is impossible”
>> 
>> {
>> if (tid)
>> return 42;
>> else
>> {}
>> control_barrier()
>> return 13;
>> }
>> The control-flow after a branch being uniform by default implies that the control-flow at the end of each side of the branch is uniform. So this case is very similar to the previous one: if we call the control-flow inside the branches cf2 and the control-flow after the branch cf3, we have mustBeUniform => cf3 => cf2 => tid => cannotBeUniform.
>> 
>> {
>> if (tid)
>> x = 42;
>> else
>> x = 13;
>> control_barrier()
>> return x;
>> }
>> To correctly deal with this case, we have a rule saying that if the control-flow can exit a statement in a single way, then the control-flow uniformity at the end of this statement is the same as the control-flow uniformity at its entrance. So while the uniformity in the branch depends on tid, the uniformity after the branch does not (as there is no early return/break/continue in the branches), and we just have mustBeUniform => cf and tid => cannotBeUniform, but no path going from mustBeUniform to cannotBeUniform (assuming that the control-flow at the entry of that block has no reason to be non-uniform). So this example will be correctly declared valid.
>> 
>> Things get a bit tricky with loops but follow the same kind of rules:
>> for (int i =0; i < 42; ++i) {
>> control_barrier();
>> if (tid)
>> break;
>> else
>> {}
>> }
>> In a loop, the control-flow at the entry of the body of the loop, that we will call cf2 here depends on the control-flow at the end of the body of the loop, that we will call cf3 here (as well as on the control-flow around the loop, and the condition of the loop). So cf2 => {cf3, cf, i}.
>> By the same reasoning as previously, mustBeUniform => cf2 (because of control_barrier) and cf3 => tid => cannotBeUniform (because of the branch, and control-flow does not reconverge after it because of the break). So this program is correctly called as invalid because of the path mustBeUniform => cf2 => cf3 => tid => cannotBeUniform.
>> 
>> Finally, let’s talk a bit about functions:
>> int foo(int x) {
>> int y = x + 3;
>> return y;
>> }
>> void bar(int z) {
>> if (foo(z) == 42)
>> control_barrier()
>> }
>> bar() calls foo(), so foo will be validated first.
>> It has a return of a single variable y, so its return uniformity is that of y. And y depends on x, so y => x, which is an argument. So we will remember that its return uniformity => the uniformity of its argument.
>> Then we start validating bar(). Because of the branch and the control-barrier, we have mustBeUniform => cf2 => {returnOfTheCallToFoo, cf} (where cf is the control-flow when the function is called, and cf2 is the control-flow within the if branch). And since we already validated foo(), we know that returnOfTheCallToFoo => {z, cf}, as z is the argument of foo().
>> So we mark this code as correct, but annotate bar() with the fact that it can only be called with a uniform argument and in uniform control-flow.
>> 
>> Rules
>> I put the actual rules in the following google docs to keep this email from being too long: https://docs.google.com/document/d/1dXselxEZh6VZYZi7qyE5ShEcpoba2n_IHLaMTQ3YIIQ/edit?usp=sharing <https://docs.google.com/document/d/1dXselxEZh6VZYZi7qyE5ShEcpoba2n_IHLaMTQ3YIIQ/edit?usp=sharing>
>> It should be accessible to everyone in commenting mode, feel free to either comment directly there or by answering this email.
>> 
>> Best regards,
>> Robin
>
Received on Thursday, 11 July 2019 22:29:28 UTC