- From: Corentin Wallez <cwallez@google.com>
- Date: Mon, 30 Oct 2017 15:46:58 -0400
- To: public-gpu <public-gpu@w3.org>
- Message-ID: <CAGdfWNNSLcD-=8xSh3yUQSxCD-1seQZRZv-5kef4xc60zSopUA@mail.gmail.com>
GPU Web 2017-10-25

Chair: Corentin
Scribe: Ken
Location: Google Hangout

Minutes from last meeting <https://docs.google.com/document/d/1N_chcF7HscK_ZaiNEGqzYR9JUgRCaxtsbE0p3wXa4Jw/>

TL;DR

- Status updates
  - Apple: to ideas, making a prototype library on top of Vulkan, soon Metal
  - Google: SPIR-V UB investigation, shaderc in WASM and updating nxt-chromium
  - Mozilla: Refactoring internals of gfx-rs
- SPIR-V undefined behaviors (https://github.com/gpuweb/gpuweb/issues/34):
  - Important to distinguish undefined behavior and undefined values.
  - SPIR-V has few undefined behaviors, a bit more undefined values in math builtins.
  - Concern about mismatched typed loads: logical addressing mode forbids them
- Trap behavior for shader (https://github.com/gpuweb/gpuweb/issues/35)
  - Benchmarks show trap and clamping are close
  - Concern that trap would be more expensive in a deeply nested call stack. Concern that trapping doesn’t make use of hardware robust resource access.
  - Devs would like a trapping mechanism for debugging UB.
  - Consensus that deciding exact behavior should be deferred post-MVP.
- Binary vs. text (or low vs. high level)
  - View source: discussion of whether view-source is important today with WASM and JS minifiers, and how readable SPIR-V can be.
  - Optimization opportunities: discussion whether high-level would allow more optimizations and that SPIR-V is still high level.
  - Interoperability:
    - Discussion that a problem with GLSL/HLSL is that it is hard to write interoperable implementations (ANGLE doesn’t help).
    - Suggestion that SPIR-V helps because it is much simpler and tighter.
    - Discussion that having a monoculture on a library would help interoperability (like ANGLE) but is bad for the Web in general.

Tentative agenda

- Administrative stuff (if any)
- Individual design and prototype status
- SPIR-V UB
- Shader “trap” mechanism
- Binary vs. text (or low vs. high level)
- Agenda for next meeting

Attendance

- Apple
  - Dean Jackson
  - JF Bastien
  - Myles C. Maxfield
- Google
  - Corentin Wallez
  - David Neto
  - Kai Ninomiya
  - Ken Russell
  - Ricardo Cabello
- Microsoft
  - Chas Boyd
  - Rafael Cintron
- Mozilla
  - Dzmitry Malyshau
  - Jeff Gilbert
- Yandex
  - Kirill Dmitrenko
- ZSpace
  - Doug Twilleager
- Joshua Groves
- Markus Siglreithmaier
- Tyler Larson

Administrative items

- DJ: Haven’t heard back from the W3C about waiving the registration fees for a TPAC meeting

Individual design and prototype status

- Apple:
  - MM: haven’t done any work on the shading language impl. Have been working on a demonstration of an API that would have the form we’re most comfortable with.
    - Automatic barrier insertion, …
  - At a place comfortable showing it to the world
  - MM: going to upload it to the WebKit repo today
  - Just needed to write it to make sure it’s implementable on Vulkan
  - CW: is it a standalone API?
  - MM: it’s a standalone library. C++ API, not JavaScript API. Not a proposal for what to ship; just a demonstration.
  - Implemented on top of Vulkan. Will implement on top of Metal in the coming weeks. Going to work on shading language next.
- Google:
  - CW: investigated undefined behavior in SPIR-V.
  - Kai has compiled shaderc to WebAssembly
    - KN: for example, can turn GLSL into SPIR-V from the web
  - Corentin has worked on NXT and Chromium prototype integration
- Microsoft
  - RC: no updates
- Mozilla:
  - DM: internal refactoring about queue factories and interaction with command pools
  - Changed how binds to Metal C APIs; autorelease, etc.
  - Updated our WebGPU prototype to newest gfx-rs hardware abstraction layer (HAL): links to various repos already posted.

SPIR-V UB

https://github.com/gpuweb/gpuweb/issues/34

- CW: think we understand the constraints of robust buffer access
  - Not looking into data races in shader execution
  - Not looking into mathematical precision of various operations
- CW: aside from that there are a few examples of undefined behavior in SPIR-V
  - There’s an operator to create an “unspecified result” of an operation
- JG: important to distinguish undefined behavior and undefined values.
- JG: think it’s not safe. Usually talking about lack of safety of undefined behavior, up to and including program termination. Don’t conflate with taking a differing path through the program due to differing values being computed.
- CW: correct. This is an undefined value, think we should remove it because we can. But not as important as other kinds.
- CW: Kinds of undefined behaviors in SPIR-V:
  - Create a variable without an initializer. Can forbid this.
  - Indexing vectors with arbitrary indices; e.g. vec4 with arbitrary indices. These could compromise security. Solution: provide robust resource kinds of guarantees.
  - Some kinds of math operations, in particular, that return undefined values for certain inputs.
    - Can choose to do whatever. More undefined value than undefined behavior.
- CW: conclusion: not a lot to be done to fix up these undefined behaviors.
- JFB: question about things like division by zero. Allowed to terminate the program?
- CW: can not terminate the program. The value is undefined.
- JFB: group needs to decide whether undefined values are OK, or terminate the program. Need to define whether it’s OK to only sometimes terminate.
- CW: undefined behavior we need to shield the user from for sure. Undefined values it’s not 100% clear.
- MM: couldn’t find description in the SPIR-V spec of what happens if you store one type and load a different type.
- KN: pointers are typed and you can’t do that statically.
- CW/DN: if your memory was typed as vec3 arrays and you tried to get a pointer to the fourth component, that would be out of bounds and should have been clamped by the robust buffer access.
- MM: talking about single Store/Load instructions.
- DN: can’t change the pointee type.
- CW: part of the logical addressing mode. Can only point to smaller, but still whole, elements of a structure.
- DN: logical addressing mode states how pointers can be created. Pointer-cast operator. Load/store talk about matching pointee type to value type.
- MM: can you link to validation rules?
- DN: sure.
  - Logical addressing mode rules: https://www.khronos.org/registry/spir-v/specs/1.2/SPIRV.html#_universal_validation_rules
    - Specifically it bans (by omission) a cast between pointer types.
  - Also, in the GLSL or Simple memory model (OpMemoryModel instruction), aliasing is disallowed by default. Reference https://www.khronos.org/registry/spir-v/specs/1.2/SPIRV.html#_a_id_aliasingsection_a_aliasing
    - Therefore all OpVariables reference memory which is not aliased. I.e. storage backing variables is disjoint. This is arranged by the implementation.
    - So memory is strongly typed.
  - The OpLoad instruction: https://www.khronos.org/registry/spir-v/specs/1.2/SPIRV.html#OpLoad has the validation rule “Pointer is the pointer to load through. Its type must be an OpTypePointer whose Type operand is the same as Result Type.”
  - The OpStore instruction: https://www.khronos.org/registry/spir-v/specs/1.2/SPIRV.html#OpStore has the validation rule “Pointer is the pointer to store through. Its type must be an OpTypePointer whose Type operand is the same as the type of Object.”
- RC: in one previous conversation Myles brought up an issue. What’s the status?
- CW: that was resolved to be a spec bug and has been fixed. Will be pushed out.

Shader “trap” mechanism

https://github.com/gpuweb/gpuweb/issues/35

- MM: would like to clarify some points:
  - Not wedded to any particular solution.
  - Think we can all agree portability is valuable. Perhaps we have differing degrees of how far we want to go.
  - If we can get both portability and performance, hopefully we can all agree on it.
    - (Either clamping or trapping.)
  - If in some cases one is better and in some the other is better, can have the discussion of whether one or the other is allowed.
  - Tried running some little benchmark programs.
    - Ran on slowest piece of hardware I (Myles) could find.
    - In every case we tried the trap solution, (basically an if-statement and return), that ended up being faster.
    - If the tests are not representative, please suggest how they could be improved.
- JG: are you testing valid or invalid data?
- MM: valid.
- CW: we don’t care so much about invalid operations.
- CW: looking at the numbers, clamping vs. trapping seems fairly close.
- KD: let’s look at the code first. Trap should be significantly faster. First counter that goes out of bounds, shader returns. Clamping will run all iterations. Why are the numbers even close?
- MM: was only testing valid data. None of the early returns were ever hit.
- KD: so for example all of the buffers were big enough?
- CW: clamp and trap look very close for a trap that’s inside a top-level function. If you trap inside a deeply nested call stack how do you return from it?
- MM / CW: check flags that are set.
- CW: while benchmark shows that trapping is faster than clamping
- JG: if we allow either clamping or trapping, we can choose which one we want.
- MM: if we can find that one solution is always or almost always better, we can get some portability for free.
- CW: trapping on all undefined behaviors would indeed be a portability win.
- JG: we don’t necessarily need the data in order to move forward, unless we want to standardize on one solution.
- CW: if we can settle on only one behavior, it’s a portability win. If not, allow either.
- JG: portability win for bad programs. Not as important as maintaining correctness.
- KD: trap mechanism would probably also allow us to record where the trap occurred and return that information to the developer in debug mode.
- CW: yes. But that could be a debug mode where the shaders are instrumented. Probably should not be done in production.
- DM: trap solution would not allow us to use robust buffer access on other APIs. Would be nice to get better performance.
- RC: tend to agree with Dzmitry on this. “The fastest code is the code which doesn’t run.”
- MM: need to benchmark against hardware that has robust buffer access.
- CW: will be interesting to choose one or the other for V1. May be a rabbit hole for MVP. Don’t want to send Myles off on a fruitless benchmarking exercise. Defer post MVP?
- MM: the issue Apple is concerned with is portability in general. Think this is important for MVP. Still, happy to defer this one point.
- CW: Apple’s work on shading languages might be more important at this point.
- KD: benchmarking results might be very different on different hardware. Also, in ~few years maybe all hardware will support robust buffer access.
- MM: quick rebuttal: we can’t design APIs for hardware that doesn’t exist.
- DM: can’t disable robust resource access on D3D12 for example.
- CW: true for vertex/index buffers. Not sure about UAVs.
- RC: what’s not checked is accesses off the end of the root descriptor tables. We’d need to check those in WebGPU. Prefer to let the API do the checks for you.
- CB: plan is to loosen up these restrictions in future shading languages, but agree that for web standards it’s important to rely on the hardware’s support.

Binary vs. text (or low vs. high level)

- CW / KN: let’s go through Kai’s email and go through the things we can agree upon.
- KN: probably a few points of agreement:
  - Compatibility with current content/examples using HLSL/GLSL
  - Probably don’t want to spend the time writing an entirely new thing before we can do anything else
    - However, perhaps modifications to HLSL can be done quickly
- MM: full backwards compatibility is not a requirement for either SPIR-V or HLSL.
- KN: agree. But most math that people copy/paste from web examples should work fine.
- KN: not sure what else we all agree on :)
- KN: start with view-source?
  - WebAssembly folks have given the feedback that a lot of vocal web developers complain you can’t view the source of a WebAssembly program.
  - Thinks maybe this is an issue with WebAssembly.
  - Web graphics developers are more concerned with performance
  - So if there is a tradeoff, should probably choose performance.
- CW: does anyone expect to be able to download a WebAssembly module and see the original sources?
- KN: no, but they “prefer” JavaScript because they can see the sources.
- DJ: developers of WebAssembly found they needed a human readable source even just for tests.
- DJ: “more concerned with performance” – does that also mean downloading the shader?
- KN: yes. Download, compile, etc. If there’s a tradeoff there then think we have to take the performant option.
- DJ: what about if it’s much faster on one platform? for example one that ships an HLSL compiler?
- CW: different topic than view-source.
- DJ: talking about performance.
- KN: think we basically mean, on all platforms.
- CB: two of these options might have different characteristics. For example, first frame performance vs. higher framerate.
- CW: not sure why one would have a higher frame rate.
- CB: sometimes have a platform-specific compiler optimization.
- DJ: we’ve found that sometimes we can optimize JavaScript better than the lower-level IR. Can do some optimizations because we understand more about the program.
- CB: we hear this from driver vendors too.
- CW: is it a goal for the NVIDIA driver to optimize Unreal Engine’s shaders on the web?
- CW: let’s talk about optimization opportunities.
  - SPIR-V vs. HLSL: HLSL is higher-level, therefore more optimization opportunities in backends.
  - CW disagrees.
- KN: my second point was going to be: SPIR-V is not that low level.
  - 1. It’s more view-source-able than WebAssembly
  - 2. As far as we know it’s not any less optimizable than a high-level language.
  - Would be beneficial for some people who are arguing against low-level to take another look at SPIR-V.
  - Also would be valuable to show decompiling from SPIR-V to GLSL/HLSL. It’s intended for semantics-preserving decompilation assuming the debug info is there.
  - KN’s two options in the email were IR vs. SL. Argument is that they’re pretty much both at the same “level” i.e., “high-level”.
  - Tool does this conversion pretty well. Even if no debug info. Preserves structure like loops, function calls, etc. Modulo inlining.
- JG: that’s the sort of code you’ll ship anyway. We effectively don’t have debug info for JavaScript on the web today because everyone minimizes their code. All names, context etc. is stripped. So SPIR-V would be able to reconstruct at the same basic level as JavaScript view-source on the web today.
- CW/KN: agree
  - Example tool is SPIRV-Cross, with outputs in GLSL, HLSL, (experimental) MSL
    - https://github.com/KhronosGroup/SPIRV-Cross
  - Example outputs are in its “reference” test set: https://github.com/KhronosGroup/SPIRV-Cross/tree/master/reference
    - E.g. GLSL input https://github.com/KhronosGroup/SPIRV-Cross/blob/master/shaders/vert/ocean.vert
    - Becomes GLSL output https://github.com/KhronosGroup/SPIRV-Cross/blob/master/reference/shaders/vert/ocean.vert
- MM: saying that it doesn’t work in JavaScript doesn’t mean that view-source is useless.
- JG/KN: agree.
  - But there’s no strong pressure to do better than we already have.
- DJ: you could argue that if it’s what people are shipping and what people want then we should just ship that format.
- CW: that’s fair, but that’s not necessarily what people want. WebAssembly is a tradeoff based on size/performance/etc. Argue that IR is better. Every GLSL compiler has bugs. Half of Corentin’s job has been to write compiler passes to work around GLSL compiler bugs. And this is a language that has a working group, conformance tests, and reference implementation.
  - SPIR-V is well defined, clear spec. Easy to make interoperable implementations from it. And that’s what the web is about: interoperability.
- DJ: it’s a great point. One of the points of accepting SPIR-V would be accepting the SPIR-V compiler itself.
- JG: SPIR-V being a binary format and having a well-defined spec makes it a better ingestion target. The funnel of what we ingest will be smaller.
- CB: you’re going to take the SPIR-V source and compile it into all browsers?
- JG: yes.
- CB: so the interoperable behavior will be enforced by compiling the same SPIR-V compiler into browsers?
- JG: would like to avoid having a monoculture.
- MM: why will not SPIR-V have the same problems as high-level languages?
- CW: because things are better defined in the binary format. Things like initializer expressions (or lack thereof) in HLSL. Scoping rules in if and switch statements.
- JG: Some things are easier to parse than others. For example C has a monstrous grammar and is extremely easy to make errors in.
- DJ: it’s not our job just to make our lives easier. It’s our job to make web developers’ lives easier.
- JG: agree. But it’ll be easier to follow through on our promises to developers if we choose an easier compilation target.
- KN: agree. We’ll reach something sufficiently stable sooner, and maybe, ever. If the project’s simpler, it’s more likely that we’ll get it right.
- CB: it seems that the way people are getting uniform behavior is that everyone uses ANGLE.
- CW: ANGLE has tons of bugs too. Have been fuzzing it.
- JG: not having a monoculture in WebGL – multiple implementations – has pushed everyone forward. Has materially pushed ANGLE forward too.
- CB: but what if we’ve found and fixed all of the bugs in ANGLE?
- CW: what about multiple versions of ANGLE in different browsers?
- CB: not sure what expected release cadence is.
- CW: based on the releases of the browsers.
- JG: working on 6-week development cycle.
- DJ: we should follow up on Kai’s email. Also a Github issue about accepting HLSL as a proposal.
- CW: putting the cart before the horse.
- JG: we’ll discuss on the mailing list.
- DJ: would like to see what developers would prefer. This is as important to me as view-source.
- CW: let’s discuss on the mailing list.

Agenda for next meeting

- Been stuck on memory barriers for a while?
- JG / MM: “anything else” than this.
- MM: one interesting bit when Myles publishes Apple’s API today to WebKit: how to do buffer and texture updates. Has to be some way to schedule a buffer update (from CPU to GPU) so that it doesn’t get updated while it’s in use by the GPU.
- CW: let’s talk about data upload/download. Something we talked about a while ago. Keep it light.
- RC: think we should continue this discussion too in the future though. Had some things to say.
- MM: agree, can’t ship anything until this is resolved.
- CW: memory barriers and shading languages are two big topics. Can’t ship without resolution.
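
Illustration of clamping vs. trapping

For readers following the Shader “trap” mechanism discussion, here is a minimal sketch of the two strategies as plain Python standing in for shader code. It is not taken from the meeting or from any proposal: all function names are hypothetical, the buffer is assumed to be non-empty, and the "trapped" flag models the "check flags that are set" idea for returning out of nested calls.

```python
# Hypothetical illustration only; none of these names come from a WebGPU proposal.

def load_clamped(buffer, index):
    """Clamp strategy: redirect an out-of-bounds index to the nearest valid one.

    Assumes the buffer is non-empty.
    """
    clamped = min(max(index, 0), len(buffer) - 1)
    return buffer[clamped]


def load_trapping(buffer, index):
    """Trap strategy: return (value, trapped) so callers can check the flag."""
    if 0 <= index < len(buffer):
        return buffer[index], False
    return 0, True


def sum_clamped(buffer, indices):
    """Runs every iteration, even when some indices are out of bounds."""
    return sum(load_clamped(buffer, i) for i in indices)


def sum_trapping(buffer, indices):
    """Returns early at the first out-of-bounds access, propagating the flag."""
    total = 0
    for i in indices:
        value, trapped = load_trapping(buffer, i)
        if trapped:
            return None, True  # the invocation stops; no result is produced
        total += value
    return total, False


if __name__ == "__main__":
    data = [10, 20, 30]
    print(sum_clamped(data, [0, 1, 5]))   # index 5 is clamped to 2
    print(sum_trapping(data, [0, 1, 5]))  # traps at index 5
    print(sum_trapping(data, [0, 1, 2]))  # all indices valid, no trap
```

On valid data both variants do the same work apart from the bounds check, which matches MM’s remark that none of the early returns were hit in the benchmarks; the difference KD describes only appears once an index goes out of bounds.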
Received on Monday, 30 October 2017 19:47:50 UTC