- From: Corentin Wallez <cwallez@google.com>
- Date: Mon, 12 Feb 2018 14:41:43 -0500
- To: public-gpu <public-gpu@w3.org>
- Message-ID: <CAGdfWNM+4ZuH-wbbDNGHHgGeZra7RYdQ7sO1_cYeqWKcXKtUqg@mail.gmail.com>
For context, this is a reply to Dzmitry's Feasibility of low-level GPU access on the Web <http://kvark.github.io/web/gpu/2018/02/10/low-level-gpu-web.html> in a slightly less public forum.

Dzmitry,

First of all, congrats on reaching #1 on Hacker News with this post :) It is an interesting view into your thoughts on the subject. Below are a bunch of comments.

*APIs as local maxima of a design space*

Vulkan isn't exactly a local maximum, see the various KHR_maintenance extensions and, for example, loadOp/storeOp being in VkRenderPassCreateInfo instead of VkRenderPassBeginInfo (a short sketch below makes this concrete). That said, it is pretty close to a local maximum for what it tries to achieve: providing developers with console-like access to the hardware over a broad range of GPUs. Metal itself is a local maximum (or close to it) for an easy-to-use API that's extremely efficient on Apple GPUs and runs well on AMD/Intel/Nvidia GPUs. NXT's goals are different, as it tries to be portable, efficient, and unbiased across D3D12, Metal, and Vulkan. In addition, it tries to be efficient to validate by design and, where possible, to avoid exposing too many flags to application developers. That's a lot of constraints, and NXT is the first step of the gradient descent that would lead to a local maximum for these goals. It has had maybe less than one engineer-year of work overall, so of course it isn't as refined as the native APIs.

To keep the mathematical metaphor, the cost function used to evaluate the APIs depends on what your goals are. D3D12, Metal, and Vulkan are local maxima for their respective cost functions, but none of them is particularly good when evaluated with the WebGPU cost function, for example because WebGPU needs to be unbiased towards the native APIs.

*Being unbiased is a key property of WebGPU*

Being unbiased is key here. The reason we are at the W3C is so that we can have all browser vendors at the table to design WebGPU. It turns out that this brings the owners, or major stakeholders, of all the native APIs into the discussion of what WebGPU should look like. Obviously, if someone suggests something too biased towards one native API, it will get pushback from the vendors of the other native APIs. We'll all have to do an exercise in compromising if we want to ship a WebGPU v1.

Being unbiased isn't only important for political reasons. We want the translation from WebGPU to the native APIs to be as thin as possible to avoid expensive emulation. Each native API has features that would be expensive to emulate on the others. Examples are D3D12's descriptor heaps, Metal's implicit barriers, and Vulkan's memory aliasing (esp. for host-visible resources). The set of unbiased designs is far from the existing native APIs and, to keep with the mathematical metaphor, would probably be somewhere around their barycenter. A WebVulkan would be far from the barycenter, and that's why we don't believe it can happen.

Google recognized this "unbiased" constraint early on and presented NXT as a first approximation of where the barycenter is. This was in January 2017, even before the W3C group was started. We now feel this design was a step in the right direction because we have backends for all the native APIs (Vulkan is being completed now) that are all surprisingly thin, each being 2-4 kloc. That said, we still think of NXT as just a prototype of what a WebGPU design could be and do not want it to become WebGPU as is. It's just an API that's efficient-ish, usable-ish, and close-ish to the barycenter of the native APIs. So a good first step of gradient descent, but that's it.
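To make the loadOp/storeOp point concrete, here is a minimal Vulkan sketch (C++, error handling omitted). The load and store operations are baked into the render pass object at creation time; switching from CLEAR to LOAD requires creating a whole new VkRenderPass, since VkRenderPassBeginInfo only carries the render area and clear values:

```cpp
#include <vulkan/vulkan.h>

// loadOp/storeOp live in the attachment description consumed at render
// pass *creation* time, not in VkRenderPassBeginInfo at *begin* time.
VkRenderPass MakeClearingRenderPass(VkDevice device, VkFormat format) {
    VkAttachmentDescription color = {};
    color.format = format;
    color.samples = VK_SAMPLE_COUNT_1_BIT;
    color.loadOp = VK_ATTACHMENT_LOAD_OP_CLEAR;   // baked into the pass object
    color.storeOp = VK_ATTACHMENT_STORE_OP_STORE; // likewise
    color.stencilLoadOp = VK_ATTACHMENT_LOAD_OP_DONT_CARE;
    color.stencilStoreOp = VK_ATTACHMENT_STORE_OP_DONT_CARE;
    color.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
    color.finalLayout = VK_IMAGE_LAYOUT_PRESENT_SRC_KHR;

    VkAttachmentReference colorRef = {};
    colorRef.attachment = 0;
    colorRef.layout = VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL;

    VkSubpassDescription subpass = {};
    subpass.pipelineBindPoint = VK_PIPELINE_BIND_POINT_GRAPHICS;
    subpass.colorAttachmentCount = 1;
    subpass.pColorAttachments = &colorRef;

    VkRenderPassCreateInfo info = {};
    info.sType = VK_STRUCTURE_TYPE_RENDER_PASS_CREATE_INFO;
    info.attachmentCount = 1;
    info.pAttachments = &color;
    info.subpassCount = 1;
    info.pSubpasses = &subpass;

    VkRenderPass renderPass = VK_NULL_HANDLE;
    vkCreateRenderPass(device, &info, nullptr, &renderPass);
    return renderPass;
}
// By contrast, VkRenderPassBeginInfo only supplies renderPass,
// framebuffer, renderArea, and clear values: to load the previous
// contents instead of clearing, you need another VkRenderPass object.
```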
We'll upload IDL for our "sketch API" shortly.

*Performance measurement of memory barriers*

One of the concerns we have with implicit memory barriers is that a WebGPU library on top of D3D12/Vulkan doesn't have the same knowledge of the hardware that a Metal driver does. The exact layout of hardware caches, required metadata, and other things are hidden behind the D3D12/Vulkan interface. This is an issue if, for example, the UBO and texture caches are the same: Metal would know that after discarding one it doesn't need to discard the other, but WebGPU on D3D12/Vulkan wouldn't, and would discard twice (a rough sketch of this follows at the end of the message). Hardware layout differences like this can even happen between GPUs from the same vendor: Nvidia Kepler has split caches like in the example, but Maxwell unified them.

So I think the tests should be what you described, but on Windows the application should be written against a Metal-like API that gets translated to D3D12/Vulkan, and on OSX it should be an application using a Vulkan-like interface that gets translated to Metal.

> *G*: let’s only share resources with immutable access flags then

I'm not sure what that means, but you can "transition the usage" of a resource after moving it to a queue (see the second sketch below).

*Conclusion*

Yes, we are suggesting the group make something different from the existing APIs, which would make it a 4th API. Hopefully you now agree with our reasoning as described above, or at least understand it. We are not discarding existing research and designs, but using them to make something that fits the needs of the Web.
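Here is the promised rough sketch of the double-flush problem, assuming a hypothetical WebGPU-on-Vulkan translation layer (the helper name and the surrounding usage tracking are illustrative, not any real implementation). Because the layer only sees API-level usages, it has to emit the most conservative barrier covering every cache a subsequent read might go through:

```cpp
#include <vulkan/vulkan.h>

// Hypothetical helper inside a WebGPU-on-Vulkan layer. The layer knows
// only the API-level usages a buffer transitions between, not the
// hardware cache topology hidden behind the Vulkan driver.
void TransitionForShaderRead(VkCommandBuffer cmd, VkBuffer buffer,
                             VkDeviceSize size) {
    VkBufferMemoryBarrier barrier = {};
    barrier.sType = VK_STRUCTURE_TYPE_BUFFER_MEMORY_BARRIER;
    barrier.srcAccessMask = VK_ACCESS_TRANSFER_WRITE_BIT;
    // Conservative: make the data visible to both the UBO read path and
    // the general shader read path. On Kepler-style split caches this
    // matches the hardware; on Maxwell-style unified caches a driver
    // with full knowledge (like a Metal driver) could get away with a
    // single invalidation, but the layer cannot tell the difference.
    barrier.dstAccessMask =
        VK_ACCESS_UNIFORM_READ_BIT | VK_ACCESS_SHADER_READ_BIT;
    barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
    barrier.buffer = buffer;
    barrier.offset = 0;
    barrier.size = size;

    vkCmdPipelineBarrier(cmd,
                         VK_PIPELINE_STAGE_TRANSFER_BIT,
                         VK_PIPELINE_STAGE_VERTEX_SHADER_BIT |
                             VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT,
                         0,             // dependencyFlags
                         0, nullptr,    // global memory barriers
                         1, &barrier,   // buffer memory barriers
                         0, nullptr);   // image memory barriers
}
```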
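And for the usage-transition reply, a second sketch of what explicit usage transitions look like in an NXT-style API. The names here are illustrative of the design, not the exact NXT interface, so treat this as pseudocode:

```cpp
// Hypothetical NXT-style fragment (illustrative names). A shared
// resource is not frozen as immutable: after moving it to another
// queue, its usage can be transitioned there before the new access.
void WriteOnComputeQueue(nxt::CommandBufferBuilder& builder,
                         nxt::Buffer& buffer) {
    // The buffer was last used as a vertex buffer on the render queue;
    // transition it so compute shaders on this queue may write it.
    builder.TransitionBufferUsage(buffer, nxt::BufferUsageBit::Storage);
    // ... record a dispatch that writes the buffer ...
}
```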