- From: David Neto <dneto@google.com>
- Date: Mon, 12 Nov 2018 19:18:26 -0500
- To: Maciej Stachowiak <mjs@apple.com>
- Cc: James Darpinian <jdarpinian@google.com>, Filip Pizlo <fpizlo@apple.com>, Kai Ninomiya <kainino@google.com>, "Myles C. Maxfield" <mmaxfield@apple.com>, Jeff Gilbert <jgilbert@mozilla.com>, Kenneth Russell <kbr@google.com>, public-gpu <public-gpu@w3.org>
- Message-ID: <CAPmVsJVNNS8hu=oVuM=JmHA982M_=E41wi07v4agPxHxCPxeDA@mail.gmail.com>
Hi.

I'm very late to this thread (sorry). I have not caught up on it (double-sorry).

At the moment WHLSL is unimplementable on the GPUs I know, due to oversimplification in WHLSL's memory model. I have filed 7 issues against the WHLSL spec (#247 through #253). Some are more fundamental than others.

thanks
david

On Mon, Nov 12, 2018 at 6:54 PM Maciej Stachowiak <mjs@apple.com> wrote:
> > > On Nov 12, 2018, at 2:41 PM, James Darpinian <jdarpinian@google.com> > wrote: > > > Too much complexity is bad for both humans writing the language and > software consuming it. > > You are conflating two different kinds of complexity. Features that make > reading or writing the language less complex for humans may make the > implementation more complex and vice versa. > > > Sure, this may sometimes be true. Other times these two notions of > complexity are aligned. > > Unfortunately the evidence that you request can only really be gathered by > implementation experience. If we continue down the path of implementing > both SPIR-V and WHLSL ingestion, I doubt that will help us come to > agreement. The evidence we gather is still subject to interpretation which > we will likely still disagree on, and the more we build the more we will > have to throw away, which we will naturally be reluctant to do. > > > I share your concern. I’m not sure how we get out of this impasse. One > thing we could try is to lay out up front what sort of evidence, if > presented, would lead each side to change their position. > > However, we have a lot of evidence already available to us from previous > implementers of modern graphics APIs, who have universally chosen to > provide ingestion formats that are different from their shading languages. > > > To the extent that this is true, I don’t place a lot of weight on it. The > web is a different environment, and often the right choice for the web is > different. 
If you choose web formats as the reference class rather than > shader formats for modern graphics APIs, the evidence is strongly in favor > of text based formats. I think “web-based languages” is a better choice of > reference, because almost every modern technology has required significant > rethinking and adaptation for the web, and many of the lessons learned are > universal. > > > It’s also worth noting that DX12 and Metal both have not chosen to make an > exclusive ingestion format that’s different from the shading language. It’s > possible to use the actual shading language at runtime. In the case of > Metal, the binary format is not even a compile target for third parties; > the only official input point is Metal. It’s just that apps are allowed > to bundle a precompiled binary shader. Failing to directly handle a > human-authorable format at all would be the more unusual choice. > > > Regards, > Maciej > > > On Thu, Nov 8, 2018 at 11:59 PM Maciej Stachowiak <mjs@apple.com> wrote: > >> >> >> On Nov 8, 2018, at 2:51 PM, James Darpinian <jdarpinian@google.com> >> wrote: >> >> > Specifically, I don’t agree that the ingestion format can or should be >> “non-evolving” >> >> Let's put that question aside for now. I'd like to find some things we >> can all agree on. >> >> >> It’s good to find things we can agree on. It’s also important to be clear >> about what we don’t yet agree on. I’ll try to do both. >> >> Can we agree that the ingestion format and the shading language have >> different requirements that sometimes conflict, >> >> >> Depends on what you mean by “sometimes”. I think I was pretty explicit >> about my position, but to state it again: >> >> - I agree that it’s possible in theory that we could find such a >> conflict. >> - I don’t agree that we have already found one. 
>> - I agree that if we find a conflict, this may push us to use different >> languages for these things, if the best available compromise between the >> requirements is more harmful on net than the harm of having two separate >> languages. >> - I note that even for a single purpose of a language, there may be >> conflicting requirements that call for tradeoffs to be made. >> >> >> and in particular HLSL compatibility vs. simplicity is one of those >> conflicts? >> >> >> I don’t fully agree with this. To elaborate: >> >> * A good level of simplicity is a goal for both a human-writable shader >> language and an ingestion format. There’s a minimum level of complexity >> set by the requirements of the domain (i.e. a shader language/format has to >> have the expressiveness and capabilities needed for shaders). Too much >> complexity is bad for both humans writing the language and software >> consuming it. >> >> * Perfect HLSL compatibility is likely not achievable for a >> human-writable shader language for the web, because regular HLSL doesn’t >> have the right safety properties. The question is how far to go in that >> direction. Being at least superficially similar is helpful for shader >> authors. Being real-world compatible with at least some HLSL shaders is >> even nicer, if it’s practical. >> >> * There is indeed some tradeoff between more HLSL compatibility and more >> complexity. But more complexity is a downside for humans too. So this >> tradeoff exists before you even consider the needs of software consuming >> the language. I suspect the best range in this tradeoff space is also a >> good spot for software ingestion needs. But I could be convinced otherwise >> by evidence. >> >> >> I guess there are some factual questions that could shed light on the >> matter: >> - Does WHLSL have good enough HLSL compatibility to allow any useful >> shaders at all to be brought over, or only enough for vague familiarity? 
>> - Can more compatibility be added without: >> - Violating web safety requirements? >> - Adding a level of complexity that’s bad for authors? >> - Making the language too hard to process safely and robustly? >> - If more compatibility is added, will that actually allow more real >> existing shaders to run, or would it just add a bit more familiarity? >> >> I don’t know enough about HLSL or the world of HLSL shaders out there to >> answer these questions myself. >> >> Regards, >> Maciej >> >> >> >> On Thu, Nov 8, 2018 at 1:52 PM Maciej Stachowiak <mjs@apple.com> wrote: >> >>> >>> >>> On Nov 8, 2018, at 1:09 PM, James Darpinian <jdarpinian@google.com> >>> wrote: >>> >>> > > Would you be interested in a non-evolving AST-level ingestion format? >>> > Yes, if that format is text on the wire, since that is the most >>> efficient and simple way to express an AST format. >>> >>> Perhaps there's something we can agree on here then. Can we agree that >>> the ingestion format and the shading language have different requirements >>> that sometimes conflict, e.g. compatibility with existing HLSL vs. >>> simplicity, >>> >>> >>> I agree that it *may* be true, but not that it has been shown to be >>> true on this thread so far. Specifically, I don’t agree that the ingestion >>> format can or should be “non-evolving”. It should probably evolve more >>> slowly than other web languages, and likely will regardless, due to the >>> nature of the domain. But that’s about it. >>> >>> >>> and we should, as a group, investigate making the ingestion format >>> different from the shading language to better satisfy both sets of >>> requirements? >>> >>> >>> I think we are already investigating it in that we’re considering a web >>> dialect of SPIR-V as one of the ingestion formats, and no one thinks it’s a >>> human-writable shader language. >>> >>> Whether we ultimately decide that the ingestion format is different from >>> the human-writable format remains to be seen. 
In my mind, it depends on if >>> we find that they actually have conflicting requirements, and that the >>> compromises necessary to satisfy both are a higher cost than having two >>> formats. >>> >>> I tend to think a single text-based language can both be an adequate >>> compiler target for other languages, still nice to write directly, and >>> secure and robust enough to use as a wire format on the web, so I’m not yet >>> convinced we need two formats. >>> >>> Regards, >>> Maciej >>> >>> >>> On Thu, Nov 8, 2018 at 8:45 AM Filip Pizlo <fpizlo@apple.com> wrote: >>> >>>> >>>> >>>> On Nov 7, 2018, at 10:57 PM, Kai Ninomiya <kainino@google.com> wrote: >>>> >>>> Maciej: You're right that comparing WHLSL with JavaScript is not a fair >>>> analogy. I mistook your statement "The evidence from WebAssembly vs >>>> JavaScript suggests this probably won’t be true" to be trying to make >>>> that analogy, but I see now that it was about a more specific point. I >>>> apologize for digging at this rathole. >>>> >>>> Filip: WebAssembly is a little hard to compare with SPIR-V since it's >>>> not SSA as you pointed out. WHLSL may be comparable to WebAssembly in that >>>> it is, in essence, an AST-level language. However, WHLSL is most definitely >>>> not at the level of WebAssembly when it comes to actual language >>>> complexity, if we are going to support existing HLSL code, >>>> >>>> >>>> I’m not sure that is true. Like WebAssembly, WHLSL just contains the >>>> low level features you need to build other things out of. >>>> >>>> The only manner in which WHLSL feels more complex to me is the addition >>>> of: >>>> >>>> - GPU style concurrency, which has more quirks than CPU style. >>>> >>>> - API for doing graphics things. WebAssembly is only concerned with the >>>> language and it has basically no api exposed to the wasm program. WHLSL has >>>> lots of spec-mandated functions exposed to the WHLSL program. 
>>>> >>>> So, I don’t think that WHLSL is more complex except where it absolutely >>>> has to be to do graphics. SPIR-V also has these additional complexities. >>>> >>>> and especially if we are going to add additional features (e.g. >>>> templates/generics or operator overloading) to the language. >>>> >>>> >>>> We aren’t proposing to add templates to WHLSL at this time. I think >>>> that when debating about WHLSL versus other languages, we should focus on >>>> what is being proposed rather than what might be proposed. I’m not a fan of >>>> critiquing something that might be proposed but hasn’t been proposed, since >>>> such a critique has no limiting principle - you could make up whatever you >>>> think WHLSL might have and point out that you don’t like it. >>>> >>>> WebAssembly does not need updates when C++ gains new language features, >>>> >>>> >>>> That’s not really true! WebAssembly has to evolve to support some new >>>> features like threads and maybe simd. >>>> >>>> and I think this is a strength of both WebAssembly and SPIR-V. >>>> >>>> >>>> Both of them have been revved with new stuff in the past. Both of them >>>> will probably be revved with new stuff in the future. >>>> >>>> >>>> Would you be interested in a non-evolving AST-level ingestion format? >>>> >>>> >>>> Yes, if that format is text on the wire, since that is the most >>>> efficient and simple way to express an AST format. One of the lessons I >>>> learned from wasm is that binary serialization of ASTs is really hard, and >>>> considering the time it took to reach consensus on the technique wasm ended >>>> up using, I think that it’s just simpler to use a text format. >>>> >>>> Specifically: >>>> >>>> - text formats basically mean using delimiters (like { and }) around >>>> blocks of code. If you go binary you either have to invent some other >>>> delimiter or use block headers that tell the length. From a parsing >>>> standpoint, binary is just not any better than text. 
>>>> >>>> - text formats are trivial to introspect. There is no need for a >>>> separate text encoding used for View Source. >>>> >>>> I think that any argument in favor of binary has to be strong enough to >>>> counterbalance text’s benefits for view source. >>>> >>>> Maybe we should discuss it. (Although, IMO, existing HLSL is already >>>> too complex to use as a WASM-level AST-style format; Inventing a new format >>>> or repurposing WASM would be painful because it gets us neither an existing >>>> tool ecosystem nor an existing application ecosystem.) >>>> >>>> >>>> WHLSL (i.e. WSL at the time) started out as more of the thing you want, >>>> since it didn’t initially have all the stuff necessary to support all of >>>> HLSL. We removed generics to make the language even simpler. >>>> >>>> In the last call, we talked about going for full HLSL compatibility. >>>> That’s making WHLSL less like the thing that you want. For example, WHLSL >>>> currently avoids some complexity by having less of the lvalue magic that C >>>> has and by having a more restrictive parser. WHLSL also uses operator >>>> overloading to make many primitive operations (like +) exist outside the >>>> language itself - the language just views + as a function call. >>>> >>>> Personally, I’d be happy with a text shader format that goes for >>>> extreme simplicity. You could imagine making some additional >>>> simplifications, like requiring that all variables are declared at the top >>>> of function. Maybe there is even more that can be done to reduce >>>> complexity. My position is that these are the good things we want in a web >>>> shader format: >>>> >>>> 1) text >>>> 2) security >>>> 3) simplicity >>>> 4) compiler target >>>> 5) similar level of abstraction to SPIR-V >>>> >>>> WHLSL currently satisfies 1, 2, 4, and 5 but may be diverging from 3 >>>> because of the desire for full HLSL compat. 
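As a rough illustration of the remark above that the language "just views + as a function call", here is a minimal Python sketch that rewrites infix arithmetic into explicit calls. It borrows Python's own expression grammar as a stand-in for a shader grammar, and the `operator+` style names and the `desugar` helper are assumptions made for this example, not something taken from the WHLSL spec.

```python
import ast

# Hypothetical mapping from infix operators to function names.
_OP_NAMES = {
    ast.Add: "operator+",
    ast.Sub: "operator-",
    ast.Mult: "operator*",
    ast.Div: "operator/",
}


def desugar(expr: str) -> str:
    """Rewrite an infix expression so every operator becomes a call."""

    def walk(node):
        if isinstance(node, ast.BinOp):
            name = _OP_NAMES.get(type(node.op))
            if name is None:
                raise ValueError("unsupported operator")
            return f"{name}({walk(node.left)}, {walk(node.right)})"
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return repr(node.value)
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    # mode="eval" parses a single expression; .body is its root node.
    return walk(ast.parse(expr, mode="eval").body)


example = desugar("a + b * c")
```

Once every operator is a call, the core language only needs overload resolution for functions, which is one way the primitive operations can live outside the language proper.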
>>>> >>>> You could even imagine this: >>>> >>>> - WHLSL is like a kernel language (not in the sense of numerical kernel >>>> but in the sense of just having the core functionality) and doesn’t evolve >>>> much. >>>> - some other HLSL flavor has All The Features. >>>> - programmers can use WHLSL directly or they can use it as a compiler >>>> target. >>>> >>>> >>>> > Before SSA, folks used IRs with numbered temporaries like 3AC. >>>> >>>> IMO, 3AC is more like SSA than like AST when it comes to most issues, >>>> such as applying code transformations. >>>> >>>> >>>> I agree. >>>> >>>> Regardless, I agree that coming up with new variable names is not >>>> particularly problematic. >>>> >>>> On Wed, Nov 7, 2018 at 2:42 PM Filip Pizlo <fpizlo@apple.com> wrote: >>>> >>>>> >>>>> >>>>> On Nov 7, 2018, at 5:15 PM, Kai Ninomiya <kainino@google.com> wrote: >>>>> >>>>> > OpLifetimeStart and OpLifetimeEnd are instructions in the SPIR-V >>>>> language, which presumably means that lifetimes are not clearly expressed >>>>> with those instructions. Even with the addition of those instructions, they >>>>> can’t be trusted because they have to be validated, which means they could >>>>> lie. >>>>> >>>>> According to your links, OpLifetimeStart/OpLifetimeEnd are only valid >>>>> with Kernel capability (i.e. OpenCL). I would guess this is related to >>>>> physical pointers. >>>>> >>>>> > Things like plumbing bounds around with other objects would require >>>>> rewriting functions and variables and operations on those variables. It >>>>> would require generating new SSA IDs or possibly regenerating / reassigning >>>>> them >>>>> >>>>> Generating and reassigning SSA IDs is extremely simple compared with >>>>> non-SSA IDs. This is why SSA is used in modern compilers to begin with. >>>>> >>>>> >>>>> Before SSA, folks used IRs with numbered temporaries like 3AC. The >>>>> thing that SSA brings to the table is that it makes it easy to find the >>>>> definition of a variable given its use. 
That’s why compilers use it. If all >>>>> they wanted was an easy way to generate IDs then it’s just as easy to do >>>>> without SSA as with SSA. >>>>> >>>>> That said, I think both of you guys have a point: >>>>> >>>>> - It’s true that editing SPIR-V to insert checks will mean that you’re >>>>> not simply passing a SPIR-V blob through. You’re going to have to decode it >>>>> to an SSA object graph and then encode that graph back to a blob. >>>>> >>>>> - It’s true that SPIR-V’s use of 32-bit variable IDs makes generating >>>>> new ones straightforward. >>>>> >>>>> But I should note that since WebHLSL is not a higher order language, >>>>> generating new variable names is pretty easy. Any name not already used is >>>>> appropriate, which isn’t significantly different from finding a spare >>>>> 32-bit variable ID. >>>>> >>>>> >>>>> > The evidence from WebAssembly vs JavaScript suggests this probably >>>>> won’t be true (if by “easier” you mean either “faster” or “simpler to code >>>>> correctly”). >>>>> >>>>> It sounds like you are claiming that the JavaScript parser/code >>>>> generator is not more complex than the WASM parser/code generator. Is this >>>>> correct? Can you provide evidence for this claim? >>>>> >>>>> >>>>> Depends on what you mean by complexity. And it depends on a lot of >>>>> things that are not really inherent to the languages. And it depends on >>>>> whether you account for the handicap in JS due to JS being a more complex >>>>> language in ways that have nothing to do with binary versus text. >>>>> >>>>> Without a doubt, parsing JavaScript is de facto more code than parsing >>>>> WebAssembly. This happens mostly because those parsers have been hyper >>>>> optimized over a long time (decade or more in some cases, like the one in >>>>> JSC). Maybe it’s also more code to parse JS even if you didn’t do those >>>>> optimizations, but I’m not sure we have an easy way of knowing just by >>>>> looking at an existing JS parser or wasm parser. 
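The comparison made earlier in this message, between finding a spare 32-bit ID and picking a name that is not already used, can be sketched in a few lines of Python. Both helpers are hypothetical names for this example. The only outside fact relied on is that a SPIR-V module header carries a bound that every ID in the module is below.

```python
def fresh_id(bound: int) -> tuple[int, int]:
    """SPIR-V style: all existing IDs are < bound, so bound itself is free.

    Returns the new ID and the updated bound.
    """
    return bound, bound + 1


def fresh_name(used: set[str], stem: str = "tmp") -> str:
    """Text style: try stem0, stem1, ... until one is not in use."""
    n = 0
    while f"{stem}{n}" in used:
        n += 1
    name = f"{stem}{n}"
    used.add(name)  # record it so the next call cannot return it again
    return name


names = {"tmp0", "x"}
first = fresh_name(names)
second = fresh_name(names)
```

Either way the bookkeeping is a counter plus, for text, a set of names already in scope. The sketch ignores nested scopes, which a real rewriter would have to respect.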
>>>>> >>>>> What is sure is that JavaScript has better startup time than >>>>> WebAssembly. See: >>>>> https://pspdfkit.com/blog/2018/a-real-world-webassembly-benchmark/ >>>>> >>>>> So if “complexity” is about time then I don’t think that WebAssembly >>>>> wins. >>>>> >>>>> Looks like this varies by browser and it also looks like cases where >>>>> one language is faster to start than the other have more to do with the >>>>> compiler backend than parsing. >>>>> >>>>> If by complexity you mean bugs, then WebAssembly parsing has bugs as >>>>> does JS. JS parsing has fewer bugs for us, but that may have to do more with >>>>> JS being very mature. It may also be because parsing text is easier to get >>>>> right. >>>>> >>>>> If by complexity you mean amount of code or difficulty of code after >>>>> the parser but before the backend, then it’s unclear. WebAssembly and >>>>> JavaScript both have some quirks that implementations have to deal with >>>>> before emitting code to the backend. JSC does weird stuff to JS before >>>>> emitting bytecode and it has significant complexity in how it interprets >>>>> wasm to produce B3 IR. Also, WebAssembly opted against SSA - it’s more of >>>>> an AST serialization disguised as a stack-based bytecode than SSA. I think >>>>> wasm opted for that because dealing with something AST-like as an input was >>>>> thought to be easier than dealing with SSA as an input. >>>>> >>>>> -Filip >>>>> >>>>> >>>>> On Wed, Nov 7, 2018 at 2:11 PM Myles C. Maxfield <mmaxfield@apple.com> >>>>> wrote: >>>>> >>>>>> On Nov 6, 2018, at 3:55 PM, Jeff Gilbert <jgilbert@mozilla.com> >>>>>> wrote: >>>>>> >>>>>> I don't think it's necessarily helpful to think of this discussion as >>>>>> predominately binary vs text. >>>>>> >>>>>> I think there is a lot of value in a constrained, targeted ingestion >>>>>> format, *and separately* I think SPIR-V is a natural choice for this >>>>>> ingestion format. 
>>>>>> >>>>>> SPIR-V's core format is very, very easy to parse, >>>>>> >>>>>> >>>>>> SPIR-V is a sequence of 32-bit words, so you’re right that it’s easy >>>>>> to read a sequence of 32-bit words. >>>>>> >>>>>> However, a Web browser’s job is to understand any possible sequence >>>>>> of inputs. What should a browser do when it encounters two >>>>>> OpEntryPoint >>>>>> <https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#OpEntryPoint> instructions >>>>>> that happen to have the same name but different execution models? What >>>>>> happens when an ArrayStride >>>>>> <https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#Decoration> decoration >>>>>> is set to 17 bytes? What happens when both SpecId and BuiltIn decorations >>>>>> are applied to the same value >>>>>> <https://www.khronos.org/registry/spir-v/specs/unified1/SPIRV.html#_a_id_shadervalidation_a_validation_rules_for_shader_a_href_capability_capabilities_a>? >>>>>> SPIR-V today is clearly not a dream for ingestion. It is more difficult for >>>>>> a browser to understand a SPIR-V program than a WHLSL program. >>>>>> >>>>>> and lends itself >>>>>> well to simple but robust parsing. Lifetimes are clearly expressed, >>>>>> instruction invocations are very explicit, and ecosystem support is >>>>>> already good. It's a dream format for ingestion. >>>>>> >>>>>> Binning it with other (particularly older) binary formats is just >>>>>> inaccurate. Doing the initial parse gives you the structures >>>>>> (functions, types, bindings) you want pretty immediately. By >>>>>> construction, most unsafe constructs are impossible or trivially >>>>>> validatable. (SSA, instruction requirements, unsafe types, pointers) >>>>>> >>>>>> For what it's worth, text formats are technically binary formats >>>>>> with a charset. I would rather consume a constrained, >>>>>> rigidly-structured (SSA-like? s-expressions?) text-based assembly >>>>>> than some binary formats I've worked with. (DER, ugh!) 
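Both sides of the exchange above can be seen in a small Python sketch: splitting a SPIR-V blob into instructions really is short, and a module that decorates something with an ArrayStride of 17 passes that structural step untouched. The sketch assumes a little-endian blob and uses the layout from the SPIR-V specification (a five-word header, then instructions whose first word holds the word count in the high 16 bits and the opcode in the low 16 bits). `split_instructions` and `example_blob` are names invented for this example.

```python
import struct

SPIRV_MAGIC = 0x07230203
HEADER_WORDS = 5      # magic, version, generator, bound, schema
OP_DECORATE = 71      # OpDecorate opcode
ARRAY_STRIDE = 6      # Decoration::ArrayStride


def split_instructions(blob: bytes):
    """Split a little-endian SPIR-V blob into (opcode, operands) tuples."""
    if len(blob) % 4 != 0:
        raise ValueError("length is not a multiple of 4 bytes")
    words = struct.unpack("<%dI" % (len(blob) // 4), blob)
    if len(words) < HEADER_WORDS or words[0] != SPIRV_MAGIC:
        raise ValueError("missing or bad SPIR-V header")
    out = []
    i = HEADER_WORDS
    while i < len(words):
        word_count = words[i] >> 16   # includes the first word itself
        opcode = words[i] & 0xFFFF
        if word_count == 0 or i + word_count > len(words):
            raise ValueError("bad word count at word %d" % i)
        out.append((opcode, words[i + 1 : i + word_count]))
        i += word_count
    return out


# Header (version 1.3, bound 2) followed by: OpDecorate %1 ArrayStride 17
example_blob = struct.pack(
    "<9I",
    SPIRV_MAGIC, 0x00010300, 0, 2, 0,
    (4 << 16) | OP_DECORATE, 1, ARRAY_STRIDE, 17,
)
parsed = split_instructions(example_blob)
```

The structural pass only rejects malformed word counts. Whether a stride of 17 is acceptable is a semantic question that a separate validation layer would have to answer, which is the part the two positions above disagree about.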
>>>>>> >>>>>> Disentangling our ingestion format from the pressures of both >>>>>> redundancies and elisions that are desirable in directly-authored >>>>>> languages simplifies things and actually prevents ambiguity. It >>>>>> immediately frees the authoring language to change and evolve at a >>>>>> faster rate, and tolerates more experimentation. >>>>>> >>>>>> I would rather solve the compilation tool distribution use-case >>>>>> without sacrificing simplicity and robustness in ingestion. An >>>>>> authoring-to-ingestion language compiler in a JS library would let us >>>>>> trivially share everything above the web-IR->host-IR translation, >>>>>> including optimization passes. >>>>>> On Tue, Nov 6, 2018 at 3:16 PM Ken Russell <kbr@google.com> wrote: >>>>>> >>>>>> >>>>>> Hi Myles, >>>>>> >>>>>> Our viewpoint is based on the experience of using GLSL as WebGL's >>>>>> input language, and dealing with hundreds of bugs associated with parsing, >>>>>> validating, and passing a textual shading language through to underlying >>>>>> drivers. >>>>>> >>>>>> Kai wrote this up at the beginning of the year in this Github issue: >>>>>> https://github.com/gpuweb/gpuweb/issues/44 , and there is a detailed >>>>>> bug list (which is still only a sampling of the associated bugs we fixed >>>>>> over the years) in this spreadsheet: >>>>>> >>>>>> https://docs.google.com/spreadsheets/d/1bjfZJcvGPI4M6Df5HC8BPQXbl847RpfsFKw6SI6_T30/edit#gid=0 >>>>>> >>>>>> Unlike what I said on the call, the main issues aren't really around >>>>>> the parsing of the input language or string handling. Both the >>>>>> preprocessor's and compiler's parsers in ANGLE's shader translator are >>>>>> autogenerated from grammars. Of more concern were situations where we had >>>>>> to semi-arbitrarily restrict the source language so that we wouldn't pass >>>>>> shaders through to the graphics driver which would crash its own shader >>>>>> compiler. 
Examples included having to restrict the "complexity" or "depth" >>>>>> of expression trees to avoid stack overflows in some drivers (this was >>>>>> added as an implementation-specific security workaround rather than to the >>>>>> spec), working around bugs in variable scoping and shadowing, defeating >>>>>> incorrect compiler optimizations, and more. Please take the time to read >>>>>> Kai's writeup and go through the spreadsheet. >>>>>> >>>>>> The question will come up: would using a lower-level representation >>>>>> like SPIR-V for WebGPU's shaders really address these problems? I think it >>>>>> would. SPIR-V uses SSA form and simple numbers for variables, which will >>>>>> eliminate entire classes of bugs in mishandling of language-level >>>>>> identifiers, variables, and scopes. SPIR-V's primitives are lower level >>>>>> than those in a textual shader language, and if it turns out restrictions >>>>>> on shaders are still needed in WebGPU's environment spec in order to work >>>>>> around driver bugs, they'll be easier to define more precisely against >>>>>> SPIR-V than source text. Using SPIR-V as WebGPU's shader ingestion format >>>>>> would bring other advantages, including that it's based on years of >>>>>> experience developing a portable binary shader representation, and has been >>>>>> designed in conjunction with GPU vendors across the industry. >>>>>> >>>>>> On the conference call I didn't mean to over-generalize the topic to >>>>>> "binary formats vs. text formats in the browser", so apologies if I >>>>>> misspoke. >>>>>> >>>>>> -Ken >>>>>> >>>>>> >>>>>> >>>>>> On Mon, Nov 5, 2018 at 10:58 PM Myles C. Maxfield < >>>>>> mmaxfield@apple.com> wrote: >>>>>> >>>>>> >>>>>> Hi! >>>>>> >>>>>> When we were discussing WebGPU today, the issue of binary vs text was >>>>>> raised. We are confused at the viewpoint that binary languages on the Web >>>>>> are inherently safer and more portable than text ones. 
All of our browsers >>>>>> accept HTML, CSS, JavaScript, binary image formats, binary font files, >>>>>> GLSL, and WebAssembly, and so we don’t understand how our teams came to >>>>>> opposite conclusions given similar circumstances. >>>>>> >>>>>> Can you describe the reasons for this viewpoint (as specifically as >>>>>> possible, preferably)? We’d like to better understand the reasoning. >>>>>> >>>>>> Thanks, >>>>>> Myles >>>>>> >>>>>> >>>>>> >>> >> >
Received on Tuesday, 13 November 2018 00:19:07 UTC