Minutes for the 2017-06-21 meeting from Corentin Wallez on 2017-06-22 (public-gpu@w3.org from June 2017)

From: Corentin Wallez <cwallez@google.com>
Date: Thu, 22 Jun 2017 16:06:53 -0400
To: public-gpu@w3.org
Message-ID: <CAGdfWNOo5LGHyd2OWJhZRYnoz9Su1FbdJVd84qjHdmoYTDG6qw@mail.gmail.com>
GPU Web 2017-06-21



Chair: Corentin and Dean

Scribe: Dean with some help

Location: Google Hangout
Minutes from last meeting
<https://drive.google.com/open?id=16YkGd78Ds0GGcEN3R9-sdNZG3C957FukIsMLVBTO2FM>

Tentative agenda

   -

   Administrative stuff (if any)
   -

      F2F following discussion on the ML.


   -

   Individual design and prototype status
   -

   Look at investigations that were made


   -

      Pipeline state
      -

      Queues
      -

   Things we didn’t talk about last meeting
   -

      Renderpasses / Render targets
      -

   Agenda for next meeting

Attendance

   -

   Mikael Sevenier (AMD)
   -

   Chris Marrin (Apple)
   -

   Dean Jackson (Apple)
   -

   Jason Aftosmis (Apple)
   -

   Julien Chaintron (Apple)
   -

   Myles C. Maxfield (Apple)
   -

   Theresa O'Connor (Apple)
   -

   Corentin Wallez (Google)
   -

   Kai Ninomiya (Google)
   -

   Ken Russell (Google)
   -

   Zhenyao Mo (Google)
   -

   Ben Constable (Microsoft)
   -

   Chas Boyd (Microsoft)
   -

   Rafael Cintron (Microsoft)
   -

   Dzmitry Malyshau (Mozilla)
   -

   Jeff Gilbert (Mozilla)
   -

   Doug Twilleager (ZSpace)
   -

   Elviss Strazdiņš
   -

   Joshua Groves

Administrative items

   -

   CW: Administrative stuff for F2F. Two options: next to Chicago or near
   Bay Area.
   -

   People in the room seem mostly ok with both Chicago and Bay area, slight
   preference for Bay Area. Please respond on the mailing list with your
   preference.
   -

   CW: If we meet in Chicago, it has to be mid-September. Otherwise we can
   do it anytime in the Bay Area.
   -

   CW: Does anyone have alternative suggestions?
   -

   CW: How do we contact the W3C to get on the TPAC schedule?
   -

   DJ: I will take that action.



AI: Corentin will clarify the F2F mail

AI: Dean will contact the W3C about the TPAC schedule

Individual design and prototype status

   -

   CW: Austin has made progress on the D3D12 prototype. We have GPU
   readback working.
   -

   DM: I’m also working on Mozilla’s D3D12 prototype.
   -

   DM: Mozilla’s prototype will be based on GFX-Rust’s low level interface
   -

   DJ: No update from Apple
   -

   RC: No update from Microsoft

Queues (and philosophy)

   -

   CW: Jeff provided an investigation, with lots of details on
   synchronization.
   -

   JG: Queues in Metal can accept all three command encoder types.
   -

   JG: We can use the D3D/Vulkan style of Queue, even on Metal. I had to
   dive into the synchronisation and command buffer details.
   -

   JG: Metal encoders push into Queues, that can accept any type of command
   buffer.
   -

   JG: I propose to do something similar to D3D12 / Vulkan for the queue
   families. Synchronization is where things differ: D3D12 has 64bit fences.
   -

   CW: Metal uses callbacks for synchronization, so you can use 64bit.
   We’ve found 64bit fences quite useful. In Vulkan, it looks straightforward
   to be able to support 64bit, by using the boolean flag and ordering.
   -

   JG: Could you explain that in the github issue? It sounds easier than I
   think it really is.
   -

   DM: It sounds like you’d have multiple VkFence objects.
   -

   JG: D3D12 has a single fence, that can be signaled with different
   values. You’d need multiple Vulkan semaphores to get this behaviour.
   -

   CW: That’s true. The driver in D3D is creating separate objects in order
   to do this.
   -

   JG: My concern is that creating a fence is a sync operation. Creating an
   arbitrary number of objects on the fly doesn’t sound trivial.
   -

   CW: It’s creating a Vulkan semaphore per operation. I’ll update the
   investigation with my findings.
   -

   MM: Do any shipping Vulkan apps create more than a single boolean
   semaphore? If not, maybe a boolean is enough. If Vulkan doesn’t provide it,
   and apps don’t use it, I’m not sure we need a 64bit flag.
   -

   CW: Not sure.
   -

   MM: I’d prefer to get real-world feedback before deciding on an
   engineering solution.
   -

   CW: The Chrome team doesn’t have a close relationship with Vulkan
   developers. We can try.
   -

   JG: Mozilla can reach out to some engine developers.
   -

   RC: Can ask on Microsoft’s side why D3D12 has 64bit fences.
   -

   DJ: We’ll also follow up via Apple.
   -

   JG: The other thing worth mentioning about Queues, is that Vulkan
   imposes a fixed number of queues at device creation time. D3D12 and Metal
   create queues at will. Vulkan is the most constrained.
   -

   CW: I would expect creating one queue of each type is enough.
   -

   DM: I think if you have a scheduler you might want multiple compute
   queues.
   -

   DM: It isn’t clear if we can get D3D12-like guarantees on which queues
   support what.
   -

   CW: Vulkan queues that support graphics also do compute and blit.
   -

   CW: We just need to look at all the hardware and see what VKInfo we get
   for them.
   -

   DM: What if hardware comes out in a year that only supports graphics?
   -

   CW: Then the implementation would only expose the queue that supports
   all.
   -

   DM: I don’t see why we wouldn’t use the Vulkan model with flags for
   queue functionality.
   -

   CW: It’s an added burden on the translation to non-Vulkan APIs.
   -

   JG: I don’t think it adds much overhead.
   -

   DM: There are no complications that I can see.
   -

   JG: Metal has a single queue type that can handle anything. D3D12 queues
   have different types.
   -

   MM: If hardware comes out that only supports graphics, then it can’t run
   Vulkan.
   -

   DM: I’m more saying that there might be one of the queue families that
   only supports graphics.
   -

   DM: There must be a reason the Vulkan specification was written this
   way. We should ask.
   -

   CW: From our point of view, the application should be able to use a set
   of WebGPU that can run in all places. Having to deal with the variability
   of queue families is going to cause problems. People will not code to
   handle the variability.
   -

   JG: It sounds like your main concern is that content might write to
   something that has a graphics only queue, and then break when moving to a
   backend with support for more.
   -

   RC: When would it ever make sense to have a graphics-only queue? D3D12
   won’t work.
   -

   MM: Metal is the same.
   -

   JG: Metal distinguishes by encoders, not queues. It is agnostic.
   -

   MM: If a device only supports graphics, then it can’t run Metal. Or
   anything.
   -

   JG: Vulkan must expose a graphics/blit/compute family. But it could
   possibly expose something that was graphics only.
   -

   MM: Talking about hardware that doesn’t exist and might never exist.
   -

   KR: All of the APIs have a superset queue. Why not just keep it simple?
   -

   JG: I want it to be correct, not simple.
   -

   CW: But you’ll force the implementations to write code to force queues
   to only handle a subset. There doesn’t seem to be any benefit to this.
   -

   DM: Where do you get the idea there is no hardware existing that
   supports graphics-only?
   -

   DM: There is a lot of mobile hardware that might have this setup.
   -

   JG: The Vulkan semantics are inclusive.
   -

   CM: What’s the downside of not exposing multiple queues?
   -

   DJ: Could it be added later?
   -

   JG: It would change how queues are created.
   -

   DJ: <sorry missed most of it> Seems like a minor topic and change in
   application logic.
   -

   BC: Isn’t Vulkan supposed to expose one queue that does everything?
   -

   BC: Talking about hypothetical that a graphics-only queue family exists
   in addition to the “can do everything” family.
   -

   JG: My concern is that there is almost no change between exposing a
   graphics-only queue and an “everything” queue that is limited to graphics.
   -

   BC: The combinatorial cost of verifying your code on all these setups is
   too expensive. This was a design decision we made in D3D12. There is a huge
   cost to supporting this variation. There is no evidence for a performance
   benefit.
   -

   JG: Why do D3D12 and Vulkan support non-universal queues?
   -

   CW: Because async compute is valuable.
   -

   JG: So queues that don’t support graphics are valuable?
   -

   CW: Options: universal queue, the ability to detect and create queues,
   or one queue for each type of operation (graphics is universal)
   -

   CW: We should not try to expose the complexity of native APIs if there
   is a limited benefit, and it makes content and implementations more complex.
   -

   CM: The first one is the most simple thing we could do. Why not start
   with that?
   -

   JG: How much change are we willing to accept between MVP and the final
   product?
   -

   BC: We can tolerate a large amount of change between MVP and version
   1.0. Much less between 1 and 1.1(?).
   -

   DJ: Propose we stick with the simplest solution for now and Mozilla
   investigates the usefulness of non-universal queues.
   -

   CW: When the MVP is out, developers might complain and we can make the
   change.
   -

   JG: I would prefer to do it right from the start.
   -

   CM: Then we’ll have this discussion on every issue.
   -

   BC: Having a working thing allows prototyping changes and measuring the
   impact.
   -

   JG: I feel that prototyping and figuring it out from there is hard.
   -

   TO: We want the union, not the superset.
   -

   JG: It’s free in this case. I don’t think ease of use should be a
   driving factor in this decision.
   -

   CW: <something about not being free for consistency and spec complexity>
   -

   BC: We’re talking about consistency. That’s different from ease of use.
   These APIs will not be easy to use for most people. We’re not talking about
   cognitive load but how much testing you need to do on multiple platforms.
   -

   CM: In order to release an API we’ll need a test suite; the smaller the
   API, the easier it is to make a test suite.
   -

   CW: I’m not sure how to resolve. Jeff, if you have strong arguments
   where your approach gives better performance, then please bring them up.
   -

   JG: I’m philosophically concerned about our discarding performance details
   for the sake of simplicity.


   -

   BC: What would compel me would be pointing to a lot of phones that
   expose the queue type that you are talking about, and that type provides a
   significant performance benefit. If that could be shown, I’d agree with you.
   -

   JG: Apps targeting Vulkan have to do that already.
   -

   BC: We’re not making a Vulkan layer for everyone, we’re exposing a
   graphics API for everyone.
   -

   JG: That’s fair. I’ll go ask. I’m worried that we are second guessing
   the Vulkan group.
   -

   BC: What about 64bit fence vs. boolean fence? D3D12 also has smart
   people working on the API.
   -

   JG: Point taken.
   -

   CBoyd: In terms of adoption it will be much easier to start using a
   simple 1.0 and then adopt more complexity as it gets introduced in follow
   up versions.
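
For readers following the queue discussion above, here is a minimal Python sketch of the difference between "expose one universal queue" and "let content look at per-family capability flags". It is only an illustration: QueueCaps, QueueFamily, pick_universal, pick_async_compute and the device in the example are all invented for this note, and none of it is a real Vulkan, D3D12, Metal or WebGPU call.

```python
from dataclasses import dataclass
from enum import IntFlag


class QueueCaps(IntFlag):
    """Toy capability flags (hypothetical names, not a real API)."""
    GRAPHICS = 1
    COMPUTE = 2
    BLIT = 4


UNIVERSAL = QueueCaps.GRAPHICS | QueueCaps.COMPUTE | QueueCaps.BLIT


@dataclass(frozen=True)
class QueueFamily:
    index: int
    caps: QueueCaps


def pick_universal(families):
    """Simplest option: the first family that supports every operation, or None."""
    for family in families:
        if (family.caps & UNIVERSAL) == UNIVERSAL:
            return family
    return None


def pick_async_compute(families):
    """A compute-capable family without graphics, else fall back to the universal one."""
    for family in families:
        if QueueCaps.COMPUTE in family.caps and QueueCaps.GRAPHICS not in family.caps:
            return family
    return pick_universal(families)


# An invented device: the disputed graphics-only family, one universal
# family, and one compute+blit family.
families = [
    QueueFamily(0, QueueCaps.GRAPHICS),
    QueueFamily(1, UNIVERSAL),
    QueueFamily(2, QueueCaps.COMPUTE | QueueCaps.BLIT),
]
print(pick_universal(families).index)      # 1
print(pick_async_compute(families).index)  # 2
```

Content written against pick_universal alone behaves the same on every backend that has a universal queue, which is the consistency argument made above. The second helper shows the kind of extra branch content would carry once non-universal families are visible.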



AI: Corentin will update the investigation with how to build 64bit fences
from boolean fences

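
As a rough illustration of the "boolean flag and ordering" idea CW mentions in the fence discussion, the Python sketch below simulates a 64bit counter fence built from one-shot boolean fences. It assumes work on a queue completes in submission order, allocates one boolean fence per signal (a real backend would presumably recycle them, which is the cost JG raises), and calls no GPU API at all. BoolFence and CounterFence are hypothetical names for this note, not the design in the investigation.

```python
from collections import deque


class BoolFence:
    """Stand-in for a one-shot boolean fence (hypothetical, no GPU involved)."""

    def __init__(self):
        self.signaled = False


class CounterFence:
    """A monotonically increasing 64bit fence emulated with boolean fences."""

    def __init__(self):
        self.completed = 0
        self._pending = deque()  # (value, BoolFence) pairs in submission order

    def signal(self, value):
        """Queue a signal to `value`; returns the boolean fence for that submission."""
        last = self._pending[-1][0] if self._pending else self.completed
        if not last < value < 2**64:
            raise ValueError("fence values must increase and fit in 64 bits")
        fence = BoolFence()
        self._pending.append((value, fence))
        return fence

    def completed_value(self):
        """Retire boolean fences in submission order; return the highest value reached."""
        while self._pending and self._pending[0][1].signaled:
            self.completed, _ = self._pending.popleft()
        return self.completed


fence = CounterFence()
first = fence.signal(1)
second = fence.signal(5)
first.signaled = True           # pretend the first submission finished
print(fence.completed_value())  # 1
second.signaled = True
print(fence.completed_value())  # 5
```

Because fences are retired strictly from the front of the queue, a later boolean fence that happens to be observed first does not advance the counter past an earlier, still-pending value.
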
Encoder types

   -

   CW: Last time we talked about this we decided a particular way, but have
   since got some feedback from Apple…
   -

   DJ: Yes, our Metal team said there is possibly a significant cost to
   swapping encoder types. We recommend against this.
   -

   CW: Let’s talk more in the github issue.

Agenda for next meeting

   -

   CW: what to talk about next meeting.
   -

   JG: pipeline state.
   -

   CW: More of the pending subjects like F2F and queues (and encoders).
   -

   CW: will start investigation on Render Passes. It is a core part of the
   API, different in all three native APIs.
   -

   JG: Shader language eventually.
   -

   CW: We’ll advertise this in advance, since there are people at Google
   that want to advocate for SPIR-V.
   -

   MM: I’ve started looking at synchronization primitives.
Received on Thursday, 22 June 2017 20:07:47 UTC