- From: Corentin Wallez <cwallez@google.com>
- Date: Mon, 23 Oct 2017 14:03:03 -0400
- To: public-gpu <public-gpu@w3.org>
- Message-ID: <CAGdfWNMqaq7RT_AySgs1U3tC9i26ZRqxY7dUpDMtj5HUkA4_ng@mail.gmail.com>
GPU Web 2017-10-18

Chair: Corentin Wallez
Scribe: Kirill, Kai
Location: Google Hangout

Minutes from last meeting
<https://docs.google.com/document/d/1-ciiWbGletoOXOGrBBZpOhhWEWJcIqK0Bb-dtHJ8ffE>

TL;DR

- TPAC slot to chat with the WASM CG is Nov 7th, 2PM-3:30PM
- Status updates
  - Apple working on WSL <-> SPIR-V translation, making WSL a superset of HLSL
  - Google searching for UB in SPIR-V, updating their Chromium prototype for TPAC
  - Mozilla has all gfx-rs backends ingesting SPIR-V, working on some testing.
- Use cases for synchronization
  - Discussion on Vulkan render passes
    - Vulkan-style render passes require giving (render pass, subpass) at pipeline creation time. Concern that it isn’t the right API and suggestion to have a single subpass for MVP.
    - Counter-point is that Metal is more flexible because it only cares about one GPU vs. WebGPU. Tiling control is important for mobile, and Vulkan render passes provide an overall dependency graph to WebGPU.
    - More homework needed to know whether tiling control is a structural issue to figure out for the MVP or not.
  - Discussion around a new example synchronization use case on the email thread
    - Discussion that Metal knows more about the hardware than WebGPU will, which makes implicit barriers easier to implement.
    - Should have the discussion in writing on email threads.

Tentative agenda

- Administrative stuff (if any)
- Individual design and prototype status
- Use cases for synchronization
- Agenda for next meeting

Attendance

- Apple
  - Dean Jackson
  - Myles C. Maxfield
  - Theresa O'Connor
- Google
  - Corentin Wallez
  - John Kessenich
  - Kai Ninomiya
- Microsoft
  - Rafael Cintron
- Mozilla
  - Dzmitry Malyshau
  - Jeff Gilbert
- Yandex
  - Kirill Dmitrenko
- ZSpace
  - Doug Twilleager
- Elviss Strazdiņš
- Joshua Groves
- Markus Siglreithmaier

Administrative items

- CW: Slot scheduled for meeting with WASM CG at TPAC - Tuesday Nov 7th.
  2PM-3:30PM
  - Let Corentin know if you’re coming
- DJ: No update on waiving registration fee to attend a 2 hour TPAC meeting.
- DJ: Software license agreement
  - Basically just waiting on last signoff

Individual design and prototype status

- Apple:
  - MM: Starting to implement an API that could look like what WebGPU would look like. Implementing a Vulkan backend first because it is the hardest backing API.
  - MM: Got us to understand some of the other participants’ concerns.
  - MM: Regarding our WSL/HLSL implementation. Like last week we have pieces of a codegen phase from WSL / HLSL to SPIR-V. Have an idea of what to do to make our JS implementation accept HLSL.
- Google:
  - CW: Started to read the SPIR-V spec thoroughly to get the whole picture. Going to look at all undefined behaviours and try to classify them. Follows up on last meeting’s concerns about OpPhi
  - JK: There’ll be validation for that
  - CW: Demo of something that looks like WebGPU from NXT to show at TPAC
- Microsoft:
  - RC: Got answer from D3D team about synchronization email.
  - RC: Also talked to lawyers about SPIR-V licensing
- Mozilla:
  - DM: to CW: is the latest version of NXT public?
  - CW: Links will be on the mailing list. NXT is in the Google GH organization. Once legal stuff is sorted out, all NXT code will be in the WebGPU GH org.
  - DM: Got SPIR-V to MSL working, so SPIR-V to all modern backends working. Basic testing for regressions.
  - CW: Manually?
  - DM: Automation is planned, right now it’s manual.

Use cases for synchronization

- RC: I don’t have the latest update. D3D team said that currently listed examples are fine, besides cases for tiled architectures (such as input attachments)
- CW: Would be great if D3D team could look at our API designs and make sure they’ll work with future D3D tiler support
- RC: Ok
- CW: @MM: feedback on synchronization use cases?
- MM: Was talking about how Vulkan pipelines require you to specify a render pass
- MM: Making a pipeline state object in Vulkan requires a compatible render pass (i.e.
  compatible attachment formats). We were looking at this for the case where you don’t know the format of the attachments before rendering has started. (?)
- MM: Missing piece is subpass state
- CW: Creating a pipeline before you have an encoder?
- MM: Vulkan requires knowing all the subpasses and Metal doesn’t. You make a pipeline object and it kind of represents a subpass. In Vulkan everything needs to be created explicitly.
- CW: Vulkan compiles pipelines using knowledge about subpasses (important on tiled GPUs)
- MM: In our ideal world where you don’t have to specify the render subpass up front at pipeline creation, every render pass has exactly one subpass in it, so when you create pipeline state you know how to fill in those fields in the Vulkan backend
- CW: Vulkan style is ugly but if we don’t use it then we give up control over tile locality. Metal 2 adds more capabilities related to tile control. D3D12 might add these too.
- JG: Why should we give up performance on Vulkan and Metal 2 just because Metal doesn’t expose it?
- MM: I’m arguing this isn’t the right API design to achieve that performance. We shouldn’t settle on Vulkan’s over Future-D3D or Metal 2.
- JG: Understood. Concerns about using Vulkan style over Metal/2?
- MM: Yes, developer experience.
- MM: Proposal is one subpass per render pass.
- MM: Only for MVP
- DM: Gain on mobile from render subpasses: Vulkan Game Development on Mobile <https://www.youtube.com/watch?v=y-EBiswp3qU>
- CW: If we want to exclude this from the MVP, we need to make sure it won’t affect the structure of the rest of the API.
- JG: If D3D is going to get tile control then maybe we need to defer designing this until we can design with D3D in mind. But not putting it in the MVP seems hazardous.
- DJ: Arguing for cleaner API for MVP. Later decide how to go forward with tile control, including multiple render subpasses and explicit synchronization.
- CW: Render target scope instead of multiple subpasses maybe okay?
  Your argument was that specifying subpasses is complicated. Apple only has to deal with one tiled architecture in Metal right now. But Vulkan, and maybe Future-D3D, may need more info (e.g. pipelines’ subpasses)
- JG: Making this explicit forces the developer to think about and provide as much info as might be useful on tiled, even when developing on desktop at first.
- DJ: Developer on a desktop GPU might never care about this at all.
- JG - DJ: Basically we just disagree on whether we should make it simpler or more explicit.
- CW: AMD, which is a desktop GPU vendor, says render passes are useful there too https://gpuopen.com/vulkan-renderpasses/
- DM: Makes sense to have explicit dependencies, e.g. on D3D12 we can generate barriers by analyzing the subpass dependencies. On Metal (2?) something (?)
- RC: On Metal 2 and Vulkan, how different are the tile control stories? How tractable is it to intersect them?
- MM: Two answers: In a model with one subpass per render pass in Vulkan, similar. If concerned with good tile control, difficult and should wait for info on D3D.
- CW: Ok, makes sense. Everyone should do homework and determine how much it affects structure so we can determine whether it’s okay to exclude it from MVP.
- CW: The use cases came from Vulkan. Good to think about, but they don’t inform the implicit vs explicit memory barrier debate we have. They seem to all work well on both explicit and implicit.
- CW: But came up with a new use case that doesn’t work well on implicit.
- CW: Anyone have topics wrt the original use cases list?
  - [No]
- CW: Was trying to think of why Metal can be more successful than us.
- CW: Metal knows whether the hardware implements the barrier for a single resource or globally.
- CW: For WebGPU we don’t know, so we don’t know whether it’s better to have individual barriers per resource (allows driver to do better scheduling) or aggregate barriers into one big barrier (prevents excess global stalls)
- MM: Seems the same as your old argument from a few months ago.
- MM: Vulkan has two places with sync: inside the render subpass dependency graph and inserting barriers. If a pass has one subpass and a command buffer has one render pass, that allows you to have optimal implicit barriers.
- CW: If we’re looking to have tile control, there’ll be some complex mapping between passes.
- CW: Even with your constraints we can’t do optimal barrier placement in D3D12/Vulkan.
- MM: On the API level, if you look at commands, you know the state of resources (previous and needed) and therefore you know the exact set of barriers.
- CW: That algorithm is nontrivial and is hardware dependent. API can’t see across separate queue submissions.
- MM: On first point: the UA knows more about the hardware than the app developer.
- CW: App knows what it’s going to do.
- MM: For every pair of commands that interact with the resource, you know the old and new state, so you can schedule the barriers as needed.
- CW: I argue that the algorithm is possible but nontrivial.
- RC: Can you give an example of the application doing better than the API?
- CW: You can’t look across queue submits (and that makes it complicated), so you might need to do pessimistic barrier placement. In the application you can (given domain knowledge) group barriers (e.g. different buffers all become UBOs)
- CW: Difficult spoken conversation.
- MM: Agree. Question: Why would anybody submit more than one queue per frame? Each frame already has a full flush anyway so one queue per frame is okay.
- CW: Cache flush?
- MM: Yes.
- CW: Don’t necessarily agree - hardware dependent. Reason for multiple flushes: e.g. multiple queues, e.g. VR is latency sensitive.
- CW: Let’s continue in writing so we can think about it more in depth.
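
Illustration (not part of the meeting): the exchange above turns on whether barriers can be derived from the command stream. The Python sketch below shows the kind of state tracking MM describes, under loud assumptions: `Usage`, `Barrier`, `Command`, and `Tracker` are hypothetical names invented for this example, the usage states are a tiny made-up subset, and no real WebGPU, Vulkan, D3D12, or Metal API is modelled. It emits a barrier whenever a resource's needed usage differs from its last known usage, and batches all barriers required by one command together.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Usage(Enum):
    """Hypothetical resource usages; real APIs define many more."""
    TRANSFER_DST = auto()
    UNIFORM = auto()
    STORAGE_WRITE = auto()


@dataclass(frozen=True)
class Barrier:
    resource: str
    old: Usage
    new: Usage


@dataclass
class Command:
    name: str
    uses: dict  # resource name -> Usage this command needs


@dataclass
class Tracker:
    """Remembers the last usage of each resource, including across submissions."""
    state: dict = field(default_factory=dict)

    def submit(self, commands):
        """Return one (barriers, command name) pair per command."""
        out = []
        for cmd in commands:
            batch = []
            for res, needed in cmd.uses.items():
                prev = self.state.get(res)
                # Simplification: only a change of usage triggers a barrier.
                # Real APIs also need barriers for some same-usage hazards
                # (e.g. write after write), which this sketch ignores.
                if prev is not None and prev != needed:
                    batch.append(Barrier(res, prev, needed))
                self.state[res] = needed
            out.append((batch, cmd.name))
        return out


tracker = Tracker()
first = tracker.submit([
    Command("upload", {"a": Usage.TRANSFER_DST, "b": Usage.TRANSFER_DST}),
    Command("draw", {"a": Usage.UNIFORM, "b": Usage.UNIFORM}),
])
# A later submission only gets the right barrier because the tracker kept state.
second = tracker.submit([Command("compute", {"a": Usage.STORAGE_WRITE})])
```

Here both buffers becoming uniform buffers before `draw` end up in one batch, which is the grouping CW mentions. The sketch says nothing about where a barrier is best placed or whether per-resource or aggregated barriers are cheaper on given hardware, which is the part of the debate left open above.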

Agenda for next meeting

- Shading languages!
  - CW: Undef behaviours in SPIR-V
  - CW: Others?
  - MM: Could talk briefly about our “trap” idea.
  - MM: Working on emitting SPIR-V to prove equivalence, working on moving to HLSL.
  - MM: But we have nothing that’s blocking the shading language discussion.
  - DJ: high- vs low-level debate?
  - CW: SL vs IR, text vs binary, etc.
  - CW: Schedule at the end, will take forever.

Received on Monday, 23 October 2017 18:04:26 UTC