Minutes for the 2017-06-07 meeting from Corentin Wallez on 2017-06-08 (public-gpu@w3.org from June 2017)

From: Corentin Wallez <cwallez@google.com>
Date: Thu, 8 Jun 2017 09:43:55 -0400
To: public-gpu@w3.org
Message-ID: <CAGdfWNMD3Kjz-7qk2ZG02-Nn=nG2f8gpJz0PZ+CTdppH-8odWg@mail.gmail.com>

GPU Web 2017-06-07

Chair: Corentin Wallez & Dean Jackson

Scribe: Dean (with help)

Location: Google Hangout
Minutes from last meeting
<https://docs.google.com/document/d/1F5ysx9tuY6NPzbj_Z55rTdYW_mjFKkFabQQqc22NlG8/edit#heading=h.hp3f2zbslxr9>Tentative
agenda

Administrative stuff (if any)
-

Individual design and prototype status
-

Base explicit API concepts (pipelines, command buffers, queues)
-

Other topics we didn’t get to last meeting

Render-targets / render passes
-

Stuff that doesn’t go in (XFB?)
-

How to reach consensus on WebVulkan vs the alternative

Agenda for next meeting

Attendance

Chris Marrin (Apple)
-

Dean Jackson (Apple)
-

Jason Aftosmis (Apple)
-

Julien Chaintron (Apple)
-

Myles C. Maxfield (Apple)
-

Theresa O'Connor (Apple)
-

Gareth Morgan (Axum Graphics)
-

Austin Eng (Google)
-

Corentin Wallez (Google)
-

Kai Ninomiya (Google)
-

Ken Russell (Google)
-

Ricardo Cabello (Google)
-

Zhenyao Mo (Google)
-

Daniel Johnston (Intel)
-

Ben Constable (Microsoft)
-

Rafael Cintron (Microsoft)
-

Dzmitry Malyshau (Mozilla)
-

Jeff Gilbert (Mozilla)
-

Kirill Dmitrenko (Yandex)
-

Doug Twilleager (ZSpace)
-

Elviss Strazdiņš
-

Joshua Groves
-

Tyler Larson

Administrative items

None

Individual design and prototype status

CW: What are members up to with their prototype?
-

CW: We’ve kept prototyping NXT. We have tests running on precommit and
the validation code, but nothing actually on the GPU. We’ve started on
RenderPasses, starting on DepthStencil and mapping buffer for reading.
-

CW: Austin has started on the D3D12 backend got HelloTriangle.
-

DM: No update on Mozilla’s prototype. Given the recent discussion on
binding models, I’m hoping we can investigate some of the decisions made on
Issue 19 to see if they are valid.
-

JG: I started writing a skeleton for Vulkan to D3D12.
-

DJ: Nothing much on Apple’s WebGPU. Will look at changing the API to
look a bit more like Vulkan with descriptor sets.
-

JG: Will you change anything because of the introduction of Metal 2?
-

DJ: Metal2 available on High Sierra, so it would only work for people on
the Beta. While Metal2 would have advantages, we might not use it for a
while.
-

BC: any resources on Metal 2?
-

DJ: look at WWDC videos, maybe 5-10 min segment on indirect argument
buffers
-

DM: The Metal2 slides
<https://devstreaming-cdn.apple.com/videos/wwdc/2017/601nzg4idodih/601/601_introducing_metal_2.pdf>
-

CW: if you want to look at indirect argument buffers, there’s a link
<https://developer.apple.com/documentation/metal/fundamental_components/gpu_resources/understanding_indirect_argument_buffers>
in the binding model issue #19
<https://github.com/gpuweb/gpuweb/issues/19> on Github that point to the
Metal docs.

Pipeline state

CW: Let’s talk about the basic stuff that isn’t the binding model.
-

CW: All APIs have pipeline states that are separated between graphics
and compute. How much do we want to put in there?
-

JG: What are the notable differences?
-

CW: D3D12 can do things like binding the depth stencil state outside the
pipeline, but Vulkan is inside. Overall, the Vulkan pipeline state is
bigger than the others. This means we’ll have to decide if we’re ok with
having to recompile on Vulkan.
-

DM: The cache objects in Vulkan should make that less of an impact.
-

CW: Do people have an issue with having a fat pipeline state, with all
the pipeline aspects that any of the native APIs has in pipeline state?
-

DJ: I’d like to know what real world impact is.
-

KR: Thinner states allow the application to toggle back and forth.
-

JG: We don’t have a lot of data on this, because we don’t have content.
If any of the ISV/IHVs have information, we’d like to hear it.
-

DM: I think there is some info in the Vulkan specification saying that
two similar states will efficiently compile and swap.
-

CW: It should be cheap to do a check on setPipeline to see if it is
different.
-

DM: I’m not sure I agree that the Vulkan pipeline is bigger than D3D12,
because the render target is embedded in the pipeline.
-

CW: Is the render target view included in the D3D12 pipeline state?
-

DM: Yes, it is.
-

BC: At Microsoft, we had a lot of discussions as to what is in the
pipeline state. We should do a side-by-side analysis here for all APIs. I
was hearing that people want WebGPU to be the aggregate of all things.
-

Elviss: on the Web 2D games are popular and they change blend state a
lot so pipeline state so it would be a lot of pipeline state.
-

Rafael: In D3D12 you specify the format of the render targets not the
RTV itself
-

JG: That’s included in the renderpass that you give on the Vulkan
pipeline state.
-

CM: I want to comment on “if two states are similar, it should be
efficient to swap between them”. I don’t think we can know this for sure.
It might be rendering library and OS dependent.
-

JG: I think we should do a direct comparison between all three APIs, and
note the differences.
-

CM: There are also some pipeline state attributes that are optional in
the Vulkan pipeline state for GPUs e.g. scissoring.

ACTION: JG to write up a comparison of pipeline states between Vulkan,
Metal and D3D12
Command Buffers

CW: Metal has different kinds of command encoders
-

CW: Secondary command buffers in Vulkan
-

JG: Need to compare/contrast secondary command buffers with D3D command
bundles
-

Rafael: Bundle restrictions
<https://msdn.microsoft.com/en-us/library/windows/desktop/dn899205(v=vs.85).aspx#bundle_restrictions>
-

DM: The problem is that there isn’t a single solution that’s better than
others. Vulkan secondary command buffers are powerful but d3d12 bundles are
more efficient. Even if we came up with a model that includes both, we’d
sacrifice some performance.
-

JG: It would be nice to get some examples of what is and isn’t fast
between the two. I expect this might be a micro-optimization that doesn’t
matter too much.
-

MM: So far we haven’t discussed primary command buffers, so I’m not sure
we need to decide on secondary buffer yet. Not necessary for an MVP.
-

BC: I agree with that sentiment. It doesn’t need to be there for
bootstrapping. There are other crucial parts.
-

CW: So we should defer discussion on secondary command buffers?
-

BC: We don’t have the feature-set of the MVP, and it would be more
focused than wondering about secondary command buffer / bundles
-

KR: Secondary Command Buffers were added to Vulkan specifically for
mobile GPU performance. <Mistake: was thinking about render passes>
-

BC: It’s still an interesting point - do passes need to be part of an
MVP for performance?
-

CW: I think that is a big topic that we should discuss later.
-

DM: I think Ken’s point is important. Secondary Command Buffers were
added to Vulkan because of Render Passes. They are related issues.
-

CW: Let’s talk about it when we discuss Render Passes.
-

CW: The biggest API difference is that Metal has multiple encoders. How
expensive is it to swap?
-

MM: If you’re trying to emulate one big pipeline, it’s setting up the
state for all the previous passes. (bad minuting)
-

CW: Vulkan prevents you from …
-

MM: If the semantic model of Metal is the same as the semantic model of
Vulkan, but described in a different way...
-

CW: the biggest difference between Metal and Vulkan seems to be whether
compute and blit can be interleaved
-

Do we need to reset state between compute operations?
-

BC: in D3D12, there’s a queue for graphics, blit, compute
(clarification: direct, bundle, queue, copy)
-

CW: Can’t you have a queue with everything?
-

JG: In Vulkan these are capabilities bitfield on a queue
-

BC: In D3D12 you have a choice of direct, bundle, copy, or compute queue.
-

MM: So since D3D12 and Metal both have this concept of describing the
type of queue, and Vulkan can handle that, I think it should be the model.
-

JG: That is the way I was approaching it in my prototype
-

BC: I will investigate about supporting queues that have different
operations.
-

CW: Vulkan does have the concept of a queue that supports all types of
operations.
-

CW: Should we figure out what exactly happens on all three APIs, and go
for the minimum supported semantic. That way we don’t have to switch too
much on the lowest common denominator.
-

BC: I agree. Having to do lots of work underneath will be too complex.
-

DM: Two different questions - what the encoder provides and what the
queue provides. Given Apple’s approach is more explicit and can provide
more type safety, it sounds like the right approach.
-

BC: Is an “encoder” a “command list” in D3D12?
-

DM: No. It is a context in which you populate your command list.
-

JG: A command factory.
-

MM: you create encoders from command lists to call the operations on.
The encoder records the calls in the command list.
-

BC: In D3D12 you have the command list that does the recording, and then
is closed before it can be sent to the GPU. There is a CommandAllocator
that is basically is a memory allocator.
-

JG: Metal encoders look like an RAII for BeginCommandBuffer and
EndCommandBuffer.
-

CW: What happens when you switch encoders? If it is just a CPU API
convenience then is it cheap to switch them?
-

DJ: Sounds about right.
-

DM: If you do render / blit / render (with the same render target) what
happens?
-

CW: <something about rendercommandencoder being like renderpass for
restrictions>
-

DJ: someone should do an investigation on this, and look at which
commands are allowed where.

ACTION: CW will start an issue about this, and people from different APIs
can fill in the blanks for their system.
Command Queues

CW: How important are async compute and async blits? AMD has reported
+15% perf increases via async compute.
-

JG: keep queues; doing a polyfill on top of Metal is not worse than not
having them.
-

BC: background on why there aren’t multiple queues in Metal? Usually HW
has multiple HW queues. What’s the reasoning behind the decision in Metal
(or am I interpreting it wrong?)
-

DJ: will have to ask
-

BC: Want to make sure we aren’t running against the flow of how Metal is
going to evolve in the future.
-

JG: Is Metal internally or externally synchronized? Is calling
queueSubmit on multiple threads at the same time ok?
-

DJ: Yes
-

JG: In Vk you need to synchronize queues yourself, allows to use them on
multiple threads (and reduce overhead)
-

CM: I’d like to understand the concept of queues between all three APIs.
Are they used differently?
-

BC: In D3D12 you can have a copy only queue that doesn’t use the 3D
cores for the copy. You can use the queues concurrently and you have fences
to synchronize them. Everything is done async so fence to synchronization
at a distance. HW might be able to process stuff in parallel.
-

DJ: my guess is that a blit encoder can run stuff in parallel with
compute encoders.
-

BC: In D3D12, fences are done on queues not command lists
-

CW: same in Vulkan and Metal.
-

JG: In Metal you get a callback.
-

CW/DC: Can’t synchronize multiple queues in Metal.

ACTION: JG to write up a discussion on the queues between the three APIs.
Agenda for next meeting

DJ: Agenda for next meeting should include talking about the
investigation that were made.
-

DJ: W3C has a technical meetup (F2F, next to SFO), should we meet?
WebAssembly is there would be interesting to meet.
-

DM: We are already in the agenda
-

DJ: Will send details to the ML.
-

CW: also for next meeting’s agenda, if we have more time after the
retrospectives, we could talk about renderpasses.

Received on Thursday, 8 June 2017 13:44:52 UTC