Minutes for the 2017-10-04 meeting from Corentin Wallez on 2017-10-06 (public-gpu@w3.org from October 2017)

From: Corentin Wallez <cwallez@google.com>
Date: Fri, 6 Oct 2017 14:32:11 -0400
To: public-gpu <public-gpu@w3.org>
Message-ID: <CAGdfWNOG4Z6uEdTOLWFmG8c-owVxSSsd-HpJ3P_U6JN-7wounw@mail.gmail.com>
GPU Web 2017-10-04

Chair: Corentin

Scribe: Ken and Dean

Location: Google Hangout
Minutes from last meeting
<https://docs.google.com/document/d/1VridLAmC05h80_d-FGmwyI7On0_AY5y8pVGI-TT4ysQ/edit>
TL;DR

   -

   Status updates


   -

      Microsoft will be able to contribute HLSL format and dxc repo to W3C.
      WSL would be a “secure” version of HLSL with generics and pointers.
      -

      Apple: start of WSL -> SPIR-V translator work.
      -

      Google: Investigating SPIR-V robustness, will send doc shortly.
      -

      Mozilla: In gfx-rs found how to emulate D3D12 resource heap
      limitations on Vulkan. Progress all around.
      -

   Indirect draw / dispatch https://github.com/gpuweb/gpuweb/issues/31
   -

      Consensus that “indirect commands” covers only indirect draw/dispatch
      -

      Only concern is about security. Discussion of ways to make these
      commands “robust” to going out of index and vertex buffers bounds.
      -

      There are multiple ways to implement them securely when robust buffer
      access is unavailable, and none raises structural issues for the API.
      -

      Consensus: Leave out of the MVP
      -

      AI: Apple to investigate robustness guarantees in Metal.
      -

      Concern about GPU-modified index buffers and robustness.
      -

   Explicit depth tests vs. derived from depthFunc = always
   -

      Consensus to have at least “depthFunc” and “depthWrite”.
      -

      Rest of the discussion to happen on a Github Issue.
      -

   Need to explicitly describe our security constraints.
   -

      The WebAssembly Security doc
      <https://github.com/WebAssembly/design/blob/master/Security.md> could
      be a good reference.

Tentative agenda

   -

   Administrative stuff (if any)


   -

   Individual design and prototype status


   -

   Indirect commands
   -

   Explicit depth test or implicit if depthFunc always (and similar cases)
   -

   Use-cases for synchronization
   -

   Agenda for next meeting

Attendance

   -

   Apple
   -

      Dean Jackson
      -

      JF Bastien
      -

      Myles C. Maxfield
      -

   Google
   -

      Corentin Wallez
      -

      John Kessenich
      -

      Kai Ninomiya
      -

      Ken Russell
      -

   Microsoft
   -

      Chas Boyd
      -

      Rafael Cintron
      -

   Mozilla
   -

      Dzmitry Malyshau
      -

      Jeff Gilbert
      -

   Yandex
   -

      Kirill Dmitrenko
      -

   ZSpace
   -

      Doug Twilleager
      -

   Elviss Strazdiņš

Administrative items

   -

   Brad Nelson (WebAssembly CG chair): extended invitation to talk with them
   -

      Present what we’re doing, figure out how to make things go fast
      -

   CW: Google’s lawyers say we’re close to agreeing on things with other
   companies for the CG’s CLA
   -

   DJ: Want a way to link against an external library

Individual design and prototype status

   -

   Microsoft
   -

      CB: clarified with management that Microsoft can officially
      contribute HLSL to W3C!
      -

         JG: HLSL as the format?
         -

         CB: Yes, and the repo itself
         -

      DJ: got a hint this was going to happen. Looking to change WSL’s
      syntax to be compatible with HLSL. Idea: most HLSL shaders would compile
      through WSL runtime. Would also have extra features that WSL provides.
      -

      CB: could be a compile flag on HLSL that would secure it for the
      web-based model. Existing HLSL, minus anything not useful for web, plus
      secure pointers / generics that Apple’s been proposing.
      -

      MM: would be “secure HLSL”, slightly modified HLSL.
      -

      CB: Shader model including security. Assures that on Windows there’s
      one shader compiler that’s compliant.
      -

   Apple
   -

      MM: updating JavaScript language implementation to look more like HLSL
      -

      Also starting on SPIR-V codegen phase
      -

      So far have written an assembler and are starting on the codegen phase
      -

   Google
   -

      CW: some investigations on SPIR-V robustness
      -

      dneto@ has added a pass which adds run-time clamping to buffer
      accesses. CW wrote tests in NXT. Need to try turning on the new
      robustness pass and ensure the tests pass.
      -

      Almost done writing a document describing what’s needed to make
      SPIR-V robust. Only one place that needs fat pointers. The rest
      are simple. Will be nice.
      -

      Asked Khronos what it would take to use SPIR-V in a non-Khronos spec.
      Will talk about it later. Trying to figure it out.
      -

   Mozilla
   -

      DM: started integrating SPIRV-Cross; looked into memory allocation
      portability (it is related to resource heap types); figured out a way to
      emulate D3D12 resource heap limitations on the Vulkan API, and are doing
      this in gfx-rs
      -

         Looked into subpass dependencies on D3D12
         -

         Have scheme figured out which correctly orders subpasses and
         writes D3D12 transition barriers based on what’s needed at what subpass
         -

         Mostly take care of resource states now
         -

         Looking at ways to handle other kinds of global barriers
         -

         Made a lot of progress on Metal backend. On par with other backends.
         All three next gen APIs running on gfx-rs.
         -

      DJ: q: regarding the analysis of which transition barriers to insert:
      what information are you going on?
      -

      DM: frontend is basically Vulkan renderpass description. Know for
      each subpass which are the input attachments and which of those
      need to be preserved for subpass. Use dependencies to build the
      order of subpasses; D3D needs this. Use layout attachment
      descriptions (Attachment A needs to be in this layout).
      -

         Split barrier that ends here and starts somewhere else.
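
Illustration (not part of the discussion): the run-time clamping of buffer accesses mentioned in the Google update can be pictured roughly as below. This is only a minimal sketch in Python of the general idea, not the actual SPIR-V pass; the names `clamp_index` and `robust_load` are hypothetical.

```python
from typing import Sequence


def clamp_index(index: int, length: int) -> int:
    """Clamp an element index into the valid range [0, length - 1]."""
    if length <= 0:
        raise ValueError("buffer must contain at least one element")
    return max(0, min(index, length - 1))


def robust_load(buffer: Sequence[int], index: int) -> int:
    """Load buffer[index]; out-of-bounds indices are clamped, never faulting."""
    return buffer[clamp_index(index, len(buffer))]


data = [10, 20, 30, 40]
in_bounds = robust_load(data, 2)   # ordinary access
past_end = robust_load(data, 99)   # clamped to the last element
negative = robust_load(data, -5)   # clamped to the first element
```

Clamping is only one possible policy. Returning zero-filled data for out-of-bounds reads would be another.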

Indirect draw/dispatch commands

Investigation: https://github.com/gpuweb/gpuweb/issues/31


   -

   CW: MM pointed out this is DrawIndirect, DrawIndexedIndirect,
   DispatchIndirect
   -

      Sizes of dispatch taken from GPU memory
      -

      Analysis: format of the buffers is exactly the same (!) in D3D,
      Vulkan and Metal
      -

      Capabilities of Metal and Vulkan are about the same
      -

      D3D is a lot more flexible
      -

      MM pointed out that having robust behavior with indirect draws is
      kind of tricky
      -

      In light of all of this, what do we want to do with indirect
      commands? Add to MVP? V1?
      -

   MM: when we say indirect what do we mean?
   -

      At least 3 possible meanings
      -

      1. Being able to change the set of visible resources from GPU code
      (Vulkan doesn’t let you do this. Metal V1 doesn’t either. Apple wants
      to target Metal V1)
      -

      2. Allowing resources to hold other resources; like a buffer holding
      a reference to a texture. CPU is the thing constructing these
      linked lists of objects. Something Vulkan doesn’t let you do.
      (Think only Metal2 lets you do this; D3D doesn’t have pointers to
      descriptors) Should get consensus that we’re not including this
      functionality in WebGPU.
      -

      3. Arguments to draw commands (starting vertex, length, …) come from
      buffers rather than CPU side.
      -

      Let’s agree that we don’t do the first two. Then discuss the third.
      -

      CW: agreed.
      -

      DM: the (2) one fits perfectly as an implementation detail for the
      implementation of descriptor sets on Metal 2
      -

      DM: given constraints on verifying vertices we’ve already uncovered,
      we shouldn’t discuss (1) either.
      -

      So in agreement.
      -

   MM: there’s some motivation for (3) like particle systems where
   particles live and die on the GPU.
   -

      Let’s not require a robustness extension
      -

      One option: don’t have the feature
      -

      If we do have the feature, it should be done in a way that a
      robustness extension isn’t required
      -

      KR: points out that particle systems can be written in a way to
      dynamically spawn on the GPU without needing feature (1). WebGL 2.0
      transform feedback particle system example already does this.
      -

      MM: So I think if we all agree that the security issue is too
      complicated at the moment, then we can leave 3 out of the MVP
      -

      KD: could be useful in the future for complex occlusion queries, …
      -

      JG: super easy on robust implementations
      -

      CW: if you don’t have a robust implementation then for every single
      shader you have to do vertex attribute clamping in some way
      -

      JG: no, rather have to do index buffer validation
      -

      CW: D3D12 has a robust vertex input pusher. Most Vulkan hardware has
      robust vertex input too. It’s a hardware feature and not
      something compiled in to the shader. Would be surprised if Metal
      doesn’t have this.
      -

      DJ: not sure. Metal’s debug/validation mode might catch this, but
      production mode probably wouldn’t.
      -

      JG: and that’s safety not validation? Running it in production mode
      would crash?
      -

      DJ: all Vulkan hardware has this?
      -

      CW: except Adreno..
      -

      JG: Vulkan robust buffer access extension is required in Vulkan 1.0
      -

      DJ: how can an extension be required?
      -

      JG: support for it is required. Activating it is not required.
      -

      CW: would be surprising if Metal hardware doesn’t have it.
      -

      DJ: would be interesting to know what the cost is of enabling the
      extension.
      -

      MM: questions about aligning allocations on page boundaries
      -

      CW: for this, need robust access for vertex and index data
      -

      KR: this is part of the fixed function pipeline. We don’t have
      control over it and can’t clamp those vertex indices.
      -

      DJ: so should we exclude this from the MVP?
      -

      CW: if Metal supports this sort of robust vertex fetch then it would
      be nice to include in the MVP
      -

      DJ / MM: sounds like concerns are over security
      -

      CW: in the issue there were 3 solutions toward handling robustness:
      -

         1) round-trip through CPU - undesirable
         -

         2) compute shader to clamp arguments outside of render pass (can’t
         transition stuff inside subpasses)
         -

         3) do some sort of clamping inside the vertex shader
         -

         These solutions work for DrawArrays but not DrawElements
         -

         Can’t know what the indices are in the buffer on the GPU
         -

         DrawElementsIndirect: won’t know how big the buffers are which are
         supposed to clamp the indices
         -

      MM: WebGL has this validation (DrawElements)
      -

         KR: but we can’t generate indices on the GPU
         -

         MM: The compute shader can walk the index buffer and perform the
         same validation that would be done on the CPU.
         -

         DJ: When you hit a drawIndirect, we know which point in the
         indirect buffer we’ll use, and which index buffer. So you can inject a
         compute shader before the subpass that does the validation.
         -

         MM: And inject this compute shader *before* the rendering pass.
         -

         CW: Sounds heavy, but it might work.
         -

         MM: It is heavy, and the point of this feature is for performance.
         So it might be slower than roundtripping via the CPU. I’m saying
         that the indirect draw isn’t providing new graphics features, it
         is just a performance gain. Some hardware might not need the heavy
         solution. If we can agree that all the platform APIs have some
         facility for fixed-function security, then we can probably go
         ahead. On the other hand, if we all need to implement the heavy
         solution, then it might not be worth including.
         -

         MM: We also need to design in a way that can be implemented either
         with supporting hardware or the heavy solution
         -

         CW: That’s fair. And all the solutions we’ve had for
         robust-buffer-like things have not required changes to the API.
         -

         MM: It’s about what the consistent behaviour is. Can the
         implementations produce the same result on all hardware, and
         against the heavy solution?
         -

         DJ: Or maybe if you could detect that it went wrong…
         -

         JG: We should aim for repeatability but sometimes it might be
         onerous. In which case it would be nice to have a separate
         validation mode that can tell you if the operation would have
         succeeded.
         -

         CW: Similar to robust buffer access. Sometimes they can clamp,
         sometimes they can return 0-filled data. So it is hard to guarantee
         repeatable results.
         -

         DM: We should concentrate on the good cases being repeatable, and
         not necessarily the bad cases.
         -

         JG: So it sounds like we need to wait to see if Metal has robust
         vertex fetch. And if it does, we might be able to go ahead.
         Otherwise, we can discuss.
         -

         JG: I suggest kicking indirect draws out of the MVP for now.
         -

         MM: At the F2F we left indirect in the MVP because we were
         concerned it would shape the rest of the API. But now that we’ve
         looked at the feature, it wouldn’t change the API. So, not
         including it is probably ok because we know what it would look
         like.
         -

         CW: Proposal: we leave it out of the MVP for now. Apple still
         investigates if robust vertex fetch is present.
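
Illustration (not part of the discussion): solution 2 above, clamping the indirect arguments in a pass that runs before the render pass, can be modelled on the CPU roughly as below. This is a hedged sketch in Python rather than a compute shader. It assumes the non-indexed draw arguments are four little-endian 32-bit unsigned integers in the order vertex count, instance count, first vertex, first instance. `clamp_draw_args` and `vertex_capacity` are hypothetical names, and per-instance attributes are ignored.

```python
import struct

# Assumed layout of non-indexed indirect draw arguments: four uint32 values
# (vertex count, instance count, first vertex, first instance).
DRAW_ARGS = struct.Struct("<4I")


def clamp_draw_args(indirect: bytearray, offset: int, vertex_capacity: int) -> None:
    """Rewrite the draw arguments at `offset` in place so that the draw reads
    at most `vertex_capacity` vertices. Instance arguments are left untouched."""
    vertex_count, instance_count, first_vertex, first_instance = (
        DRAW_ARGS.unpack_from(indirect, offset)
    )
    first_vertex = min(first_vertex, vertex_capacity)
    vertex_count = min(vertex_count, vertex_capacity - first_vertex)
    DRAW_ARGS.pack_into(
        indirect, offset, vertex_count, instance_count, first_vertex, first_instance
    )


# A draw asking for 1000 vertices starting at vertex 90, with only 100 bound.
indirect_buffer = bytearray(DRAW_ARGS.size)
DRAW_ARGS.pack_into(indirect_buffer, 0, 1000, 1, 90, 0)
clamp_draw_args(indirect_buffer, 0, vertex_capacity=100)
clamped = DRAW_ARGS.unpack_from(indirect_buffer, 0)
```

As noted in the discussion, this kind of argument clamping covers DrawArrays-style draws only; indexed draws would also need the contents of the index buffer checked.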


GPU Modified index buffers

   -

   KD: What about GPU-modified index buffers? Would we validate those
   similarly to the proposal for robust indirect draws?
   -

   CW: Yeah, we’ll need to figure that out, but we might wait for input
   from Metal with respect to robust vertex fetch.
   -

   KD: questions about debug-only validation
   -

   CW: for MVP / V1: could be difficult to tackle because native APIs don’t
   have support for it. Need to think about it later.
   -

   MM: what happens in SPIR-V when dereferencing null pointers?
   -

   JG: no such thing as a null pointer
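
Illustration (not part of the discussion): the index buffer validation discussed above, walking the drawn range of the index buffer and checking every index against the number of available vertices, might look roughly like this. It is a sketch in Python standing in for a validation pass, assuming 16-bit little-endian indices and ignoring primitive restart; `validate_indices` is a hypothetical name.

```python
import struct


def validate_indices(index_buffer: bytes, first_index: int,
                     index_count: int, vertex_count: int) -> bool:
    """Return True only if the drawn index range lies inside the index buffer
    and every 16-bit index in it refers to an existing vertex."""
    total_indices = len(index_buffer) // 2
    if first_index < 0 or index_count < 0:
        return False
    if first_index + index_count > total_indices:
        return False
    indices = struct.unpack_from(f"<{index_count}H", index_buffer, first_index * 2)
    return all(index < vertex_count for index in indices)


# Six indices; the last one (7) is out of range for a 4-vertex buffer.
indices_blob = struct.pack("<6H", 0, 1, 2, 2, 1, 7)
whole_draw_ok = validate_indices(indices_blob, 0, 6, vertex_count=4)
first_five_ok = validate_indices(indices_blob, 0, 5, vertex_count=4)
range_overrun_ok = validate_indices(indices_blob, 4, 4, vertex_count=8)
```

The cost is linear in the number of indices drawn, which is the “heavy” aspect raised in the indirect draw discussion.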

Explicit depth test or implicit if depthFunc always (and similar cases)

From the roadmap:

   -

   Open question: Should the API make a distinction between having a depth
   test which always passes and disabling the depth test?
   -

   Open question: Should there be an extra bool to disable independent
   blending, or should it be implicit from the blend attachments?


   -

   CW: depth compare function ALWAYS == not having a depth test?
   -

      Should we have an explicit boolean saying depth test or not?


   -

      JG: if we can uniquely infer it then we should do so
      -

      CW: Ben Constable was concerned that an extension might add a
      difference between ALWAYS and depth test disabled
      -

      MM: then that extension would add a new enum
      -

      JG: think we can cross that bridge if/when we change the semantics
      -

      KD: doesn’t disabling the depth test disable depth writes?
      -

      CW: in explicit APIs there’s a separate “depth write enabled /
      disabled”
      -

   MM: so we do have a consensus to have a separate boolean for depth
   writes?
   -

      CW: yes
      -

   RC: so how do we infer this on the various APIs?
   -

      CW: there’s a depth write boolean and a depth compare function.
      Optionally a depth test boolean. Metal doesn’t have the depth
      test boolean.
      -

      consensus: depth write boolean. depth test function. *no* depth test
      boolean; inferred.
      -

      JG: writes are disabled if the depth test is off.
      -

      CW: there’s redundancy in these APIs.
      -

   CW: think we can get away with having one boolean and one function.
   -

   KD: for transparency we need depth writes disabled, depth test on, depth
   function LESS (?)
   -

   AI: Let’s put this on the github issue
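
Illustration (not part of the discussion): one way the depth test boolean could be inferred from the two agreed pieces of state is sketched below in Python. The rule used here is an assumption, not something settled in the meeting: the test is treated as disabled only when the compare function is ALWAYS and depth writes are off, since some native APIs also disable writes when the test is off. `CompareFunc`, `DepthState` and `depth_test_enabled` are hypothetical names.

```python
from dataclasses import dataclass
from enum import Enum, auto


class CompareFunc(Enum):
    NEVER = auto()
    LESS = auto()
    EQUAL = auto()
    LESS_EQUAL = auto()
    GREATER = auto()
    NOT_EQUAL = auto()
    GREATER_EQUAL = auto()
    ALWAYS = auto()


@dataclass(frozen=True)
class DepthState:
    depth_func: CompareFunc  # the "depthFunc" from the consensus
    depth_write: bool        # the "depthWrite" from the consensus


def depth_test_enabled(state: DepthState) -> bool:
    """Infer the native 'depth test enabled' flag (assumed rule): the test can
    be switched off only if it always passes and nothing is written."""
    return state.depth_func is not CompareFunc.ALWAYS or state.depth_write


no_depth = depth_test_enabled(DepthState(CompareFunc.ALWAYS, False))
write_only = depth_test_enabled(DepthState(CompareFunc.ALWAYS, True))
transparency = depth_test_enabled(DepthState(CompareFunc.LESS, False))
```

The last case corresponds to the transparency setup KD mentions: depth writes disabled, depth test on, function LESS.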

Use-cases for synchronization

Thread: https://lists.w3.org/Archives/Public/public-gpu/2017Oct/0000.html


   -

   CW: don’t have time to go through the whole discussion now
   -

   AI: look at this and plan to discuss it next time


   -

   MM: still in the process of trying to make Vulkan code that does
   something useful. But trying to work on automatic generation of
   synchronization
   -

   AI: please weigh in on the use cases on that thread

Roadmap

   -

   MM: updated the roadmap with consensus and open questions from the F2F
   -

      Please feel free to update with any corrections

Agenda for next meeting

   -

   Shading languages
   -

      DJ: document about securing SPIR-V will be ready for review by then?
      -

      CW: hopefully yes. Just want to revise one more time.
      -

   MM: one thing that will come out of this discussion is what it means to
   be secure. Should write that down concretely.
   -

      JFB: WebAssembly has a definition of this:
      https://github.com/WebAssembly/design/blob/master/Security.md
      -

      JG: would be good to have as a reference
      -

      CW: would be good to take a look. But since WebAssembly is on the CPU
      and WebGPU is on the GPU the security/robustness primitives will be
      different
      -

      JFB: concerns by users about security so we have to speak their
      language
Received on Friday, 6 October 2017 18:32:58 UTC