- From: Corentin Wallez <cwallez@google.com>
- Date: Thu, 13 Jul 2017 14:03:59 -0400
- To: public-gpu@w3.org
- Message-ID: <CAGdfWNP3dVMZLyNexEsr=Bun1nmhJkw5X7BNdv4zPjLBTCt13A@mail.gmail.com>
GPU Web 2017-07-12

Chair: Corentin and Dean
Scribe: Dean (with some help)
Location: Google Hangout

Minutes from last meeting <https://docs.google.com/document/d/1iqrWz9-Oo7mCfZCamzDhZE27p6wFTeLI29WirDLcEHs/edit#heading=h.hp3f2zbslxr9>

Tentative agenda
- Administrative stuff (if any)
- Individual design and prototype status
- Ongoing investigations
  - Queues
- Things we didn’t get to yet
  - Pipeline state
    - https://github.com/jdashg/vulkan-portability/blob/master/pipeline-state.md
  - Render passes
- Agenda for next meeting

Attendance
- Chris Marrin (Apple)
- Dean Jackson (Apple)
- Jason Aftosmis (Apple)
- Julien Chaintron (Apple)
- Adrian Lindberg (Apple)
- Myles C. Maxfield (Apple)
- Warren Moore (Apple)
- Austin Eng (Google)
- Corentin Wallez (Google)
- Kai Ninomiya (Google)
- Ken Russell (Google)
- Daniel Johnston (Intel)
- Ben Constable (Microsoft)
- Chas Boyd (Microsoft)
- Rafael Cintron (Microsoft)
- Dzmitry Malyshau (Mozilla)
- Jeff Gilbert (Mozilla)
- Kirill Dmitrenko (Yandex)
- Doug Twilleager (ZSpace)
- Elviss Strazdiņš
- Joshua Groves
- Tyler Larson

Administrative items
- We’ll do our F2F on Friday the 22nd of September, in Chicago at the Google offices.
- Corentin will ask on the mailing list for a list of attendees.
- DJ: Checking with the W3C again, about TPAC F2F.
- DJ: Talked with the WASM chair, who is keen on having a chat; it could be just a few of us going.

Individual design and prototype status
- Google - more backend stuff for our backend. E.g. Render Targets, etc. Most things are working. We ran into a D3D12 buffer to texture copy issue regarding alignment. We’ll record this issue and talk about it when we get to copies.
- JG: I looked at Pipeline states, in particular comparing D3D and Vulkan.
- DM: Good progress on the graphics abstraction layer with Vulkan up to par with D3D and Metal. We have a textured quad rendering on screen. Ran into alignment constraints.
Have reached out <https://gitlab.khronos.org/vulkan/vulkan/issues/920> to the Khronos group for specifying the texture to buffer constraint of alignment, allowing our implementation to say it doesn’t support this feature.
- BC: So Vulkan has a value for the optimal alignment?
- DM: Yes, it is “optimal alignment” for slices and rows of the images. But it also supports packed buffers.
- BC: My suspicion is that on a GPU that requires an alignment, … My conjecture is that there is a restriction on certain hardware that requires this alignment, and you might have to do another copy if you can’t support it directly. I suspect we’ll run into cases like this a lot. Previous APIs abstracted away the hardware operations to be consistent, but newer APIs have stripped that away, and will expose situations like this. I’m trying to nail down the limitations with the D3D team. There is an error mode I’d like to avoid: if we ignore alignment constraints, then the backend for D3D12 will require intermediate copies, which will slow it down. Meanwhile if Vulkan has a preferred alignment, it will probably do extra work if you don’t give it the right alignment. These might introduce performance penalties. My open question is whether or not WebASM will require aligned allocations in order for us to get good performance.
- BC: Avoiding the copies here is one of the reasons why the new APIs are more efficient.
- CW: I agree with the analysis. It can be addressed in two places: 1. The WebGPU API, or 2. We emulate stuff. I see some open source drivers using compute shaders for the buffer copies, or emulation. So I think we should just add the D3D constraints and guarantee that no expensive copies take place.
- JG: +1 from me.
- DM: I’m not sure Vulkan does drop back to compute shaders.
- CW: Check out blorp <https://github.com/mesa3d/mesa/tree/master/src/intel/blorp> from the Intel Mesa driver, which does have a lot of shader code.
- DM: So no DMA transfer. They have to use a graphics path for the copy.
- CW: I think we should embed the constraints of D3D into WebGPU. Let’s create an issue and discuss it more.
- BC: D3D12 was designed to transparently expose the hardware via the API. We don’t think of it as an API limitation, but a hardware issue. Doing extra copies in some cases, or requiring the API to be aligned. 3rd party libraries seem to understand this and use intermediates themselves if necessary.
- CW: We agree with this. So does Mozilla. What about Metal?
- DM: Metal doesn’t have this limitation.
- MM: Metal accepts anything and translates if necessary.
- CW: OK. Let’s discuss on github. We generally agree about exposing the hardware constraints.
- MM: Wait. If 2 of the 3 APIs don’t have the constraints….
- CW: Either the driver or the implementation will have to do extra copies… so we should expose the limitations so the application can be smart.
- CW: Vulkan doesn’t have the limitation, but it does suggest an optimal alignment.
- MM: So ArrayBuffer will have a constructor to handle alignment.
- CW: It’s unclear. Maybe ArrayBuffers won’t change, but the implementation will have to do that under the hood.
- KR: This is about GPU to GPU copies?
- DM: It’s about GPU visible memory.
- KR: We don’t need new array buffer constructors, because the memory they will show if they map buffer memory directly will be aligned by the driver itself.
- DM: To clarify, we are talking about the row pitch alignment and the image slice alignment. The application can create its buffer to match this constraint.
- KR: The API would expose some kind of query for this alignment.
- CW: The API will guarantee at most some limits. Something queryable.
- Kirill: This might also apply to HTMLImage objects.
- CW: Either the image has already been decoded in GPU memory, which will be a texture to texture copy, and so it isn’t an issue. If it hasn’t been uploaded yet, then the implementation will have to do the correct thing. So it isn’t a problem.
- Kirill: What about ImageBitmap, which might be a raw buffer that’s already decoded? The browser doesn’t know if it is designed to go to WebGPU.
- Kirill: ImageBitmap at least exists in Firefox and Chrome.
- CW: Discussion of DOM facing features can come later.

Queues
- CW: Last meeting we had consensus to expose async compute if available, and that queues should be requested at device creation time.
- DM: We might want to know how many queues are available before setup.
- CW: We should raise this in the github issue. #22
- CW: We have consensus on most things there.

Pipeline state
- https://github.com/jdashg/vulkan-portability/blob/master/pipeline-state.md
- JG: Approach: inlined all the pipeline state of all three APIs to remove all nesting of structures, so that things are easier to see. Vulkan and D3D12 are pretty similar in what they contain and how they contain it. Metal has smaller descriptors that are less verbose, and with less “stuff” in it. No detail in what’s missing / in a different place in Metal.
- JG: Some things are always dynamic and set in command lists in D3D12 (viewport, scissor), and in Vulkan they are either pipeline state or set as “dynamic” then specified in the command buffer.
- CW: These dynamic state things in Vulkan are for very old mobile GPUs; propose we set everything as dynamic.
- JG: That is what I was going to propose too. Then the doc goes “state by state”; things are very similar, with Metal being less constrained in general.
- JG: Viewport section: D3D12 is always using dynamic states, propose to make it always dynamic. Think about making things static in pipeline state after MVP and see with Vulkan WG.
- JG: Render target formats: In D3D12 and Metal they are straightforward: array of color format, and a depth-stencil format. In Vulkan the pipeline gets a VkRenderpass and can only be used with “compatible” renderpasses. Renderpass compatibility includes the attachment formats.
Basically the renderpass in the pipeline creation info is to pass the format of the attachments.
- DM: So saying that we can set the format directly with a “dummy” pass to give the render target formats.
- CW: I believe that renderpass compat also includes input attachments.
- JG: If you want to add a pull request to update stuff etc., please do! This is to collaborate.
- RC: Where is the doc?
- JG: Linked in the doc.
- And here: https://github.com/jdashg/vulkan-portability/blob/master/pipeline-state.md
- CW: On our side, we’ve added depth stencil state and found an incompatibility with D3D, which doesn’t have per-face stencil state for some state, etc. Our WiP talks about this - we’ll either add it to your document or raise an issue.
- CW: Should we raise an issue per state? Or add to the document?
- JG: I think a github issue. But a PR also works.
- JG: Depends if you want to add to the investigation or actually start a discussion.
- CW: We’ll probably do a PR on your doc then.
- CW: Our proposal is to take the intersection, and describe that.

Render passes
- CW: Quick summary from github. All three APIs do something slightly different. D3D just allows the render targets to be set to image queues. In Metal you specify them when you create a render command encoder. For each attachment you also specify a load or store flag. Vulkan is similar to Metal (render sub passes) - you can ask tiled GPUs to keep tile data in memory between passes. It’s the only API with this concept.
- CW: My feeling is that render passes are amazing and should be in WebGPU. What do people think?
- DM: I agree, as noted in issue #23.
- JG: My theory is that D3D doesn’t have Render Passes because it doesn’t always run on tiled GPUs. Is this incorrect?
- BC: You are correct. No hardware need for RenderPasses.
- MM: What about Windows 10 on ARM?
- BC: I’m not sure what I can say without checking what is public.
Understand that passes have advantages on tilers; for the MVP do we want to add complexity for this?
- JG: I anticipate that if we put renderpasses in WebGPU, then on backends without them you’d almost ignore them, but it is an architecture you have to fit into to work everywhere. There’d be no overhead in the translation, a bit of overhead for the application.
- CW: Vulkan render passes also encode some information on memory barriers. It does seem that D3D will target GPUs that will have some form of tiling. So I suspect D3D will go in this direction in some form.
- CW: So, RenderPasses are useful for most systems, are future proof, and easy enough to simulate.
- RC: So how do you expect to break up command lists on D3D12? Just one giant command list?
- MM: We can’t inherit state between render passes/encoders.
- CW: Believe setting or clearing state at the render pass boundary will be cheap compared to the GPU cost of flushing the output merger etc.
- CW: In our current prototype, we do one giant command list for the command buffer. Multiple passes are in the same graphics command list.
- RC: Makes sense for subpasses. I guess it is an implementation detail.
- RC: Inheritance can be left out of V1.
- CW: Inheritance is orthogonal. Switching render targets is a different question.
- RC: Just means you have to keep some state around. It’s a hidden cost.
- CW: This was discussed in one of the github issues. I don’t have a strong opinion.
- DM: I argued in favor of inheritance.
- MM: The issue is symmetrical: on some backends, if there is inheritance, we’ll have to keep state around and apply it if necessary. Symmetrically, on D3D12 we’d have to clear state in place.
- BC: Is it faster to clear or set? Is there a way to measure?
- CW: I think the inheritance in Vulkan is a footnote in the API. We should defer until we have a prototype.
- CW: Are we going to do immediate RenderPasses, like Metal, or sub passes like Vulkan?
- MM: We haven’t had anyone argue against?
- CW: We still need to choose between the Metal and Vulkan styles. The Vulkan approach is more complex.
- DJ: Think we should stick with the simple approach for the prototype.
- MM: Describe “simple”.
- DJ: The Metal-style things.
- MM: I think Dean means that the dependency data-structure has memory barriers for synchronization.
- CW: And to keep data in tile-memory.
- MM: How do we feel about undefined behaviour? What happens if the programmer gets this data structure wrong?
- CW: We don’t have time to discuss that now. Let’s defer it to next week.
- CW: Please check out my slides on Vulkan render passes. They are in the github issue on the topic. They describe the benefit for tiled GPUs. We’ll talk about it next week.

Agenda for next meeting
- Keep talking about renderpasses / rendertarget
- Get back on discussions related to pipeline state.
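
Note on the row pitch alignment discussed under the prototype status: the following is a minimal sketch, not anything agreed in the meeting, of how an application might repack tightly packed image rows so that each row starts at an aligned offset before a buffer to texture copy. The names `ROW_PITCH_ALIGNMENT`, `align_up` and `pad_rows` are hypothetical. The value 256 is the D3D12 texture data pitch alignment; whether and how WebGPU would expose such a limit was left open in the discussion.

```python
from typing import Tuple

# Assumption for illustration: 256 bytes, the D3D12 texture data pitch
# alignment. A real API would presumably expose this through some query,
# which the minutes leave undecided.
ROW_PITCH_ALIGNMENT = 256


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return (value + alignment - 1) // alignment * alignment


def pad_rows(pixels: bytes, width: int, height: int, bytes_per_pixel: int,
             alignment: int = ROW_PITCH_ALIGNMENT) -> Tuple[bytes, int]:
    """Repack tightly packed rows so each row starts at an aligned offset.

    Returns the padded data and the resulting row pitch in bytes.
    """
    packed_pitch = width * bytes_per_pixel
    if len(pixels) != packed_pitch * height:
        raise ValueError("pixel data does not match the given dimensions")
    row_pitch = align_up(packed_pitch, alignment)
    padded = bytearray(row_pitch * height)  # padding bytes stay zero
    for row in range(height):
        src = row * packed_pitch
        dst = row * row_pitch
        padded[dst:dst + packed_pitch] = pixels[src:src + packed_pitch]
    return bytes(padded), row_pitch
```

For example, a 100-pixel-wide image with 4 bytes per pixel has 400-byte packed rows, which round up to a 512-byte row pitch under a 256-byte alignment. This repacking is the kind of intermediate copy that the discussion suggests applications could avoid by laying out their buffers with the aligned pitch from the start.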
Received on Thursday, 13 July 2017 18:04:51 UTC