Minutes for the 2017-07-12 meeting

GPU Web 2017-07-12

Chair: Corentin and Dean

Scribe: Dean (with some help)

Location: Google Hangout

Minutes from last meeting
<https://docs.google.com/document/d/1iqrWz9-Oo7mCfZCamzDhZE27p6wFTeLI29WirDLcEHs/edit#heading=h.hp3f2zbslxr9>

Tentative agenda

   -

   Administrative stuff (if any)
   -

   Individual design and prototype status
   -

   Ongoing investigations
      -

      Queues
   -

   Things we didn’t get to yet
      -

      Pipeline state
         -

         https://github.com/jdashg/vulkan-portability/blob/master/pipeline-state.md
      -

      Render passes
   -

   Agenda for next meeting

Attendance

   -

   Chris Marrin (Apple)


   -

   Dean Jackson (Apple)
   -

   Jason Aftosmis (Apple)
   -

   Julien Chaintron (Apple)
   -

   Adrian Lindberg (Apple)
   -

   Myles C. Maxfield (Apple)
   -

   Warren Moore (Apple)
   -

   Austin Eng (Google)
   -

   Corentin Wallez (Google)
   -

   Kai Ninomiya (Google)
   -

   Ken Russell (Google)
   -

   Daniel Johnston (Intel)
   -

   Ben Constable (Microsoft)
   -

   Chas Boyd (Microsoft)
   -

   Rafael Cintron (Microsoft)
   -

   Dzmitry Malyshau (Mozilla)
   -

   Jeff Gilbert (Mozilla)
   -

   Kirill Dmitrenko (Yandex)
   -

   Doug Twilleager (ZSpace)
   -

   Elviss Strazdiņš
   -

   Joshua Groves
   -

   Tyler Larson

Administrative items

   -

   We’ll do our F2F on Friday the 22nd of September, in Chicago at the
   Google offices.
   -

   Corentin will ask on the mailing list for a list of attendees.
   -

   DJ: Checking with the W3C again, about TPAC F2F
   -

   DJ: Talked with the WASM chair, who is keen on having a chat. It could be
   just a few of us going.

Individual design and prototype status

   -

   Google: more work on our backends, e.g. render targets. Most things are
   working. We ran into an alignment issue with D3D12 buffer-to-texture
   copies. We’ll record this issue and talk about it when we get to copies.
   -

   JG: I looked at Pipeline states, in particular comparing D3D and Vulkan.
   -

   DM: Good progress on the graphics abstraction layer, with Vulkan up to
   par with D3D and Metal. We have a textured quad rendering on screen. We ran
   into alignment constraints and have reached out
   <https://gitlab.khronos.org/vulkan/vulkan/issues/920> to the Khronos group
   about specifying the texture-to-buffer alignment constraint, allowing our
   implementation to say it doesn’t support this feature.
   -

   BC: So Vulkan has a value for the optimal alignment?
   -

   DM: Yes, it is an “optimal alignment” for the slices and rows of images.
   But it also supports packed buffers.
   -

   BC: My suspicion is that on a GPU that requires an alignment, … My
   conjecture is that there is a restriction on certain hardware that requires
   this alignment, and you might have to do another copy if you can’t support
   it directly. I suspect we’ll run into cases like this a lot. Previous APIs
   abstracted away the hardware operations to be consistent, but newer APIs
   have stripped that away, and will expose situations like this. I’m trying
   to nail down the limitations with the D3D team. There is an error mode I’d
   like to avoid: if we ignore alignment constraints, then the backend for
   D3D12 will require intermediate copies, which will slow it down. Meanwhile
   if Vulkan has a preferred alignment, it will probably do extra work if you
   don’t give it the right alignment. These might introduce performance
   penalties. My open question is whether or not WebAssembly will require aligned
   allocations in order for us to get good performance.
   -

   BC: Avoiding the copies here is one of the reasons why the new APIs are
   more efficient.
   -

   CW: I agree with the analysis. Can be addressed in two places: 1. The
   WebGPU API, or 2. We emulate stuff. I see some open source drivers using
   compute shaders for the buffer copies, or emulation. So I think we should
   just add the D3D constraints and guarantee that no expensive copies take
   place.
   -

   JG: +1 from me.
   -

   DM: I’m not sure Vulkan does drop back to compute shaders.
   -

   CW: Check out blorp
   <https://github.com/mesa3d/mesa/tree/master/src/intel/blorp> from the Intel
   Mesa driver, which does have a lot of shader code.
   -

   DM: So no DMA transfer. They have to use a graphics path for the copy.
   -

   CW: I think we should embed the constraints of D3D into WebGPU. Let’s
   create an issue and discuss it more.
   -

   BC: D3D12 was designed to transparently expose the hardware via the API.
   We don’t think of it as an API limitation, but a hardware issue. Doing extra
   copies in some cases, or requiring the API to be aligned. 3rd party
   libraries seem to understand this and use intermediates themselves if
   necessary.
   -

   CW: We agree with this. So does Mozilla. What about Metal?
   -

   DM: Metal doesn’t have this limitation.
   -

   MM: Metal accepts anything and translates if necessary.
   -

   CW: OK. Let’s discuss on github. We generally agree about exposing the
   hardware constraints.
   -

   MM: Wait. If 2 of the 3 APIs don’t have the constraints….
   -

   CW: Either the driver or the implementation will have to do extra
   copies… so we should expose the limitations so the application can be smart.
   -

   CW: Vulkan doesn’t have the limitation, but it does suggest an optimal
   alignment.
   -

   MM: So ArrayBuffer will have a constructor to handle alignment.
   -

   CW: It’s unclear. Maybe ArrayBuffers won’t change, but the
   implementation will have to do that under the hood.
   -

   KR: This is about GPU to GPU copies?
   -

   DM: It’s about GPU visible memory.
   -

   KR: We don’t need new array buffer constructors, because the memory they
   will show if they map buffer memory directly will be aligned by the driver
   itself.
   -

   DM: To clarify we are talking about the row pitch alignment, and image
   slice alignment. The application can create its buffer to match this
   constraint.
   -

   KR: The API would expose some kind of query for this alignment.
   -

   CW: The API will guarantee at most some limits. Something queryable.
   -

   Kirill: This might also apply to HTMLImage objects.
   -

   CW: Either the image has already been decoded in GPU memory, which will
   be a texture to texture copy, and so it isn’t an issue. If it hasn’t been
   uploaded yet, then the implementation will have to do the correct thing. So
   it isn’t a problem.
   -

   Kirill: What about ImageBitmap, which might be a raw buffer that’s
   already decoded. The browser doesn’t know if it is designed to go to WebGPU.
   -

   Kirill: ImageBitmap at least exists in Firefox and Chrome.
   -

   CW: Discussion of DOM facing features can come later.
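
The row pitch and image slice alignment that DM describes, together with the queryable limit that KR and CW mention, could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `CopyAlignmentLimits` shape, the function names, and the example values (256 bytes is D3D12's documented row-pitch alignment for texture copies; 512 is simply a plausible slice value). The minutes do not settle on any of these.

```typescript
// Hypothetical limits that an implementation might report through a query.
// Field names and values are assumptions for illustration only.
interface CopyAlignmentLimits {
  rowPitchAlignment: number; // byte distance between rows must be a multiple of this
  sliceAlignment: number;    // byte distance between image slices must be a multiple of this
}

interface StagingLayout {
  rowPitch: number;   // padded bytes per row
  slicePitch: number; // padded bytes per image slice
  byteLength: number; // total bytes needed for all slices
}

// Round value up to the next multiple of alignment.
function alignUp(value: number, alignment: number): number {
  return Math.ceil(value / alignment) * alignment;
}

// Compute a buffer layout that satisfies the reported alignment limits.
function computeStagingLayout(
  width: number,
  height: number,
  depth: number,
  bytesPerTexel: number,
  limits: CopyAlignmentLimits
): StagingLayout {
  const rowPitch = alignUp(width * bytesPerTexel, limits.rowPitchAlignment);
  const slicePitch = alignUp(rowPitch * height, limits.sliceAlignment);
  return { rowPitch, slicePitch, byteLength: slicePitch * depth };
}

// Copy one tightly packed 2D image into a padded array, row by row.
// Padding bytes are left as zero.
function padRows(
  packed: Uint8Array,
  width: number,
  height: number,
  bytesPerTexel: number,
  layout: StagingLayout
): Uint8Array {
  const packedRow = width * bytesPerTexel;
  const padded = new Uint8Array(layout.slicePitch);
  for (let y = 0; y < height; y++) {
    const row = packed.subarray(y * packedRow, (y + 1) * packedRow);
    padded.set(row, y * layout.rowPitch);
  }
  return padded;
}

// Example: a 100x4 image with 4 bytes per texel has 400-byte packed rows.
const exampleLimits: CopyAlignmentLimits = { rowPitchAlignment: 256, sliceAlignment: 512 };
const exampleLayout = computeStagingLayout(100, 4, 1, 4, exampleLimits);
```

With these example limits each 400-byte row is padded to 512 bytes, so the application pays for the padding once when it fills the buffer, and neither the driver nor the implementation needs an intermediate copy.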

Queues

   -

   CW: Last meeting we had consensus to expose async compute if available,
   and that queues should be requested at device creation time.
   -

   DM: We might want to know how many queues are available before setup.
   -

   CW: We should raise this in the github issue. #22
   -

   CW: We have consensus on most things there.
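
As a rough illustration of the two points above (queues requested at device creation, and DM's wish to know the available counts beforehand), here is a minimal sketch. All type and function names are hypothetical and do not come from any agreed design; the policy of granting fewer queues than requested is also an assumption.

```typescript
// All names here are hypothetical; none come from an agreed WebGPU design.
type SketchQueueKind = "graphics" | "compute";

// What an adapter might report before the device is set up.
interface SketchAdapterInfo {
  available: Record<SketchQueueKind, number>;
}

interface SketchQueueRequest {
  kind: SketchQueueKind;
  count: number;
}

interface SketchQueue {
  kind: SketchQueueKind;
  index: number;
}

interface SketchDevice {
  queues: SketchQueue[];
}

// Queues are fixed at device creation. Each request is granted up to what
// the adapter reports (assumed policy: grant fewer rather than fail).
function createSketchDevice(
  adapter: SketchAdapterInfo,
  requests: SketchQueueRequest[]
): SketchDevice {
  const queues: SketchQueue[] = [];
  for (const request of requests) {
    const granted = Math.min(request.count, adapter.available[request.kind]);
    for (let index = 0; index < granted; index++) {
      queues.push({ kind: request.kind, index });
    }
  }
  return { queues };
}

// Prefer a dedicated compute queue; fall back to the first queue of any
// kind when async compute is not available.
function queueForCompute(device: SketchDevice): SketchQueue | undefined {
  let fallback: SketchQueue | undefined;
  for (const queue of device.queues) {
    if (queue.kind === "compute") return queue;
    if (fallback === undefined) fallback = queue;
  }
  return fallback;
}
```

An application written this way works unchanged on adapters without async compute, which is one way to read "expose async compute if available".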

Pipeline state

   -


   https://github.com/jdashg/vulkan-portability/blob/master/pipeline-state.md


   -

   JG: Approach: inlined all the pipeline state of all three APIs to remove
   all nesting of structures, so that things are easier to see. Vulkan and
   D3D12 are pretty similar in what they contain and how they contain it.
   Metal has smaller descriptors that are less verbose, and with less “stuff”
   in it. No detail in what’s missing / in a different place in Metal.
   -

   JG: Some things are always dynamic and set in command lists in D3D12
   (viewport, scissor) and in Vulkan they are either pipeline state or set as
   “dynamic” then specified in the command buffer.
   -

   CW: These dynamic state things in Vulkan are for very old mobile GPUs,
   propose we set everything as dynamic.
   -

   JG: What I was going to propose too. Then the doc goes “state by state”,
   things are very similar, with Metal being less constrained in general.
   -

   JG: Viewport section: D3D12 is always using dynamic states, propose to
   make it always dynamic. Think about making things static in pipeline state
   after MVP and see with Vulkan WG.
   -

   JG: Render target formats: In D3D12 and Metal they are straightforward:
   array of color format, and a depth-stencil format. In Vulkan the pipeline
   gets a VkRenderpass and can only be used with “compatible” renderpasses.
   Renderpass compatibility includes the attachment formats. Basically the
   renderpass in the pipeline creation info is to pass the format of the
   attachments.
   -

   DM: So saying that we can set the format directly with a “dummy” pass to
   give the render target formats.
   -

   CW: I believe that renderpass compat also includes input attachments
   -

   JG: If you want to add a pull request to update things, please do! This
   is meant to be collaborative.
   -

   RC: Where is the doc?
   -

   JG: Linked in the doc.
   -

   And here:
   https://github.com/jdashg/vulkan-portability/blob/master/pipeline-state.md
   -

   CW: On our side, we’ve added depth stencil state and found an
   incompatibility with D3D, which doesn’t have per-face stencil state for
   some state, etc. Our WiP talks about this; we’ll either add it to your
   document or raise an issue.
   -

   CW: Should we raise an issue per state? Or add to the document?
   -

   JG: I think a github issue. But a PR also works.
   -

   JG: Depends if you want to add to the investigation or actually start a
   discussion.
   -

   CW: We’ll probably do a PR on your doc then.
   -

   CW: Our proposal is to take the intersection and describe that.
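
Two ideas from this section can be sketched together: the pipeline carries the render target formats directly (as JG describes for D3D12 and Metal), and viewport and scissor are always dynamic state recorded in the command stream. The types, format names, and recorder class below are hypothetical and only illustrate the shape of the proposal, not an agreed API.

```typescript
// Hypothetical names throughout; a sketch of the idea, not an agreed API.
type SketchFormat = "rgba8" | "bgra8" | "depth24-stencil8";

interface AttachmentFormats {
  colorFormats: SketchFormat[];
  depthStencilFormat: SketchFormat | null;
}

// The pipeline stores formats only. A Vulkan backend could build a "dummy"
// render pass from them when creating the underlying pipeline object.
interface SketchPipeline {
  attachments: AttachmentFormats;
}

interface SketchRect {
  x: number;
  y: number;
  width: number;
  height: number;
}

// A pipeline is usable only with render targets of matching formats.
function formatsMatch(a: AttachmentFormats, b: AttachmentFormats): boolean {
  if (a.depthStencilFormat !== b.depthStencilFormat) return false;
  if (a.colorFormats.length !== b.colorFormats.length) return false;
  for (let i = 0; i < a.colorFormats.length; i++) {
    if (a.colorFormats[i] !== b.colorFormats[i]) return false;
  }
  return true;
}

class SketchRecorder {
  readonly commands: string[] = [];

  constructor(private readonly target: AttachmentFormats) {}

  // Viewport and scissor are recorded as commands, never baked into the pipeline.
  setViewport(r: SketchRect): void {
    this.commands.push(`viewport ${r.x},${r.y} ${r.width}x${r.height}`);
  }

  setScissor(r: SketchRect): void {
    this.commands.push(`scissor ${r.x},${r.y} ${r.width}x${r.height}`);
  }

  draw(pipeline: SketchPipeline, vertexCount: number): void {
    if (!formatsMatch(pipeline.attachments, this.target)) {
      throw new Error("pipeline formats do not match the render targets");
    }
    this.commands.push(`draw ${vertexCount}`);
  }
}
```

Checking formats at draw time is only one possible place for the validation; the sketch does not cover CW's point that Vulkan render pass compatibility also includes input attachments.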

Render passes

   -

   CW: Quick summary from github. All three APIs do something slightly
   different. D3D just allows the render targets to be set to image queues. In
   Metal you specify them when you create a render command encoder. For each
   attachment you also specify a load or store flag. Vulkan is similar to
   Metal (render sub passes) - you can ask tiled GPUs to keep tile data in
   memory between passes. It’s the only API with this concept.
   -

   CW: My feeling is that render passes are amazing and should be in
   WebGPU. What do people think?
   -

   DM: I agree, as noted in issue #23.
   -

   JG: My theory is that D3D doesn’t have Render Passes because it doesn’t
   always run on tiled GPUs. Is this incorrect?
   -

   BC: You are correct. No hardware need for RenderPasses.
   -

   MM: What about Windows 10 on ARM?
   -

   BC: I’m not sure what I can say without checking what is public. I
   understand that passes have advantages on tilers, but for the MVP do we
   want to add complexity for this?
   -

   JG: I anticipate that if we put render passes in WebGPU, then on backends
   without them you’d almost ignore them, but it is an architecture you have
   to fit into to work everywhere. There’d be no overhead in the translation,
   and a bit of overhead for the application.
   -

   CW: Vulkan render passes also encode some information on memory
   barriers. It does seem that D3D will target GPUs that will have some form
   of tiling. So I suspect D3D will go in this direction in some form.
   -

   CW: So, RenderPasses are useful for most systems, are future proof, and
   easy enough to simulate.
   -

   RC: So how do you expect to break up command lists on D3D12? Just one
   giant command list?
   -

   MM: We can’t inherit state between render passes/encoders.
   -

   CW: I believe setting or clearing state at the render pass boundary will
   be cheap compared to the GPU cost of flushing the output merger etc.
   -

   CW: In our current prototype, we do one giant command list for the
   command buffer. Multiple passes are in the same graphics command list.
   -

   RC: Makes sense for subpasses. I guess it is an implementation detail.
   -

   RC: Inheritance can be left out of V1.
   -

   CW: Inheritance is orthogonal. Switching render targets is a different
   question.
   -

   RC: Just means you have to keep some state around. It’s a hidden cost.
   -

   CW: This was discussed in one of the github issues. I don’t have a
   strong opinion.
   -

   DM: I argued in favor of inheritance.
   -

   MM: The issue is symmetrical where on some backends, if there is
   inheritance, we’ll have to keep state around and apply it if necessary.
   Symmetrically, on D3D12 we’d have to clear state in place.
   -

   BC: Is it faster to clear or set? Is there a way to measure?
   -

   CW: I think the inheritance in Vulkan is a footnote in the API. We
   should defer until we have a prototype.
   -

   CW: Are we going to do immediate RenderPasses, like Metal, or sub passes
   like Vulkan?
   -

   MM: We haven’t had anyone argue against?
   -

   CW: We still need to choose between the Metal and Vulkan styles. The
   Vulkan approach is more complex.
   -

   DJ: I think we should stick with the simple approach for the prototype.
   -

   MM: Describe “simple”
   -

   DJ: The Metal-style things.
   -

   MM: I think Dean means that the dependency data-structure has memory
   barriers for synchronization.
   -

   CW: And to keep data in tile-memory.
   -

   MM: How do we feel about undefined behaviour? What happens if the
   programmer gets this data structure wrong?
   -

   CW: We don’t have time to discuss that now. Let’s defer it to next week.
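
To make the "Metal-style" option concrete, here is a minimal sketch of a render pass descriptor with a load and a store flag per attachment, plus one way a backend with no render pass concept might lower the start of a pass, in the spirit of CW's remark that passes are easy enough to simulate. The type names, flag values, and emitted command strings are all hypothetical.

```typescript
// Hypothetical names; a sketch of the Metal-style option, not an agreed API.
type SketchLoadOp = "load" | "clear" | "dont-care";
type SketchStoreOp = "store" | "dont-care";

interface PassAttachment {
  target: string;         // name of the render target, for illustration
  loadOp: SketchLoadOp;   // what to do with existing contents when the pass begins
  storeOp: SketchStoreOp; // whether results must be written back when the pass ends
}

interface SketchRenderPass {
  colorAttachments: PassAttachment[];
}

// On a backend without render passes, beginning a pass becomes
// "bind the render targets, then clear the ones that asked for it".
// The store flag produces no command here, since such a backend
// always keeps the results in memory.
function lowerBeginPass(pass: SketchRenderPass): string[] {
  const names: string[] = [];
  for (const attachment of pass.colorAttachments) {
    names.push(attachment.target);
  }
  const commands: string[] = ["setRenderTargets " + names.join(",")];
  for (const attachment of pass.colorAttachments) {
    if (attachment.loadOp === "clear") {
      commands.push("clear " + attachment.target);
    }
  }
  return commands;
}
```

A tiled GPU backend could instead use the same flags to avoid loading or storing tile memory, which is where the load and store information pays off. Vulkan-style subpasses and their dependency data structure are not modelled here.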


CW: Please check out my slides on Vulkan render passes. They are in the GitHub
issue on the topic and describe the benefit for tiled GPUs. We’ll talk
about it next week.

Agenda for next meeting

   -

   Keep talking about renderpasses / rendertarget
   -

   Get back on discussions related to pipeline state.

Received on Thursday, 13 July 2017 18:04:51 UTC