- From: Dzmitry Malyshau <dmalyshau@mozilla.com>
- Date: Tue, 6 Aug 2019 13:13:06 -0400
- To: Kevin Rogovin <kevinrogovin@invisionapp.com>
- Cc: public-gpu@w3.org
- Message-ID: <b751ecf2-b6c6-5451-a162-b9b9bffa533b@mozilla.com>
Hi Kevin,

Thanks for correcting me on the clip distance! Apparently, it's not so much that clip distance is outdated as that my knowledge about it is :)

The supporting links you have provided are good material for a future investigation issue.

Regards,
Dzmitry

On 8/6/19 12:48 PM, Kevin Rogovin wrote:
> Hi,
>
> Thank you for the fast response. I will file issues separately, but I will share my thoughts on the reply.
>
> Firstly, Vulkan does support HW clip planes: VkPhysicalDeviceFeatures has fields for both clipping and culling (shaderClipDistance and shaderCullDistance), and VkPhysicalDeviceLimits reports how many are available through the fields maxClipDistances and maxCullDistances. In addition, Metal also supports clip distance in its shading language; see 5.2.3.3 Vertex Function Output Attributes of the Metal 2.2 shading spec, https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf. D3D12 also supports clip distance; see the constant D3D12_CLIP_OR_CULL_DISTANCE_COUNT at https://docs.microsoft.com/en-us/windows/win32/direct3d12/constants . Also, saying that clip planes are a thing of the past is quite unrealistic, as a significant number of the rendering algorithms that I wish to employ use them. I do, however, advocate for it to be OK to report that no clip-distance values are supported, since it is acceptable in Vulkan to not support clip distance. Many GPUs dedicate a non-trivial amount of silicon to implementing these user-defined clip planes, and not making them available when present seems far from ideal. Emulating HW clip planes through compute is quite icky (and typically involves atomic ops in the compute shader), and emulating them through discard is the worst possible option.
>
> Secondly, on the subject of advanced blend equations, I would rather that the feature was part of the spec from day 0 with the ability to query if it was supported.
> For UI rendering these blend modes avoid a large number of terrible, poorly performing alternatives. These blend equations are already available as an extension in Vulkan, see https://www.khronos.org/registry/vulkan/specs/1.1-extensions/html/vkspec.html#VK_EXT_blend_operation_advanced. Leaving it for a later extension essentially pushes it further away, which brings a higher chance that it never sees the light of day. The only part that is really affected is adding those blend modes to the current list, along with a query to report whether they are supported.
>
> Next, issues (6) and (4) are VERY different. Issue (6) gives an application the ability to read the current value from the framebuffer at the fragment location. Not all hardware supports this, but most tile-based architectures can or do (indeed, a number of them do not have dedicated blending units and perform blending by adding an epilogue to do the blending). This feature can be done, with effort, for Metal on iOS (but not macOS), but I have not seen an extension for Vulkan yet. In contrast, (4) is about declaring that a value never needs to be sent back to memory (mostly an optimization for tile-based renderers), but one cannot read back any values in the shader; instead it is for after a "sort-of-render-target-change". Feature (4) is just an optimization for tile-based renderers.
>
> Item (7), fragment shader interlock, is available on some hardware on Vulkan with VK_EXT_fragment_shader_interlock.
>
> Lastly, a good GPU application needs to have some understanding of the GPU:
> - is it a tile-based renderer or not?
> - what optional features are possible?
>
> For example, on a non-tile-based renderer, reading from the current render target is nowhere near as heavy an operation as it is for a tile-based renderer. I advocate that exposing these elements will allow applications to get more performance from the GPU, which is much of the reason for WebGPU.
> I am all for making code portable, but GPU-performance-intensive applications (the purpose of WebGPU) need this to get that performance; otherwise the gap between native and web will be quite large.
>
> At any rate, I will file each of these as separate issues, but I would like to have a discussion on these on the mailing list (or in the issues) out in the open. Ideally, we would hear input not just from the implementors' point of view, but also from the developers' point of view.
>
> Best Regards,
> -Kevin Rogovin
>
> On Tue, Aug 6, 2019 at 6:39 PM Dzmitry Malyshau <dmalyshau@mozilla.com> wrote:
>
>     Hi Kevin,
>
>     Thank you for writing down your (employer's) use cases!
>
>     Ideally, these would need to be filed as issues on https://github.com/gpuweb/gpuweb/issues .
>
>     1. Needs an investigation to be done (see others - https://github.com/gpuweb/gpuweb/labels/investigation). Roughly speaking, this is very useful and IIRC widely supported, so we should have it in the API.
>
>     2. User clip planes are a thing of the past, found in none of our target APIs (Vulkan, D3D12, Metal). Therefore, I don't think this feature should influence the WebGPU spec.
>
>     3. This appears to be supported only in Vulkan (of the 3 APIs we target) and provides only a minor benefit (unless you have numbers to show otherwise?). Perhaps this would work as a small extension, but it doesn't seem necessary for the MVP or V1 of the API.
>
>     4 and 6. These are similar (in the sense that both are addressed by Vulkan sub-passes). Finding a good model of the API that would be portable is difficult. There needs to be a solid investigation followed by one or more proposals before we can have this.
>
>     5. I don't think any of our target APIs support this, so this feature can't influence the WebGPU spec.
>
>     7. Haven't looked into it. Needs an investigation done.
>
>     You are welcome to file issues and help us with investigations/proposals!
>
>     Note that in general we are trying not to have a lot of variation in the exposed device "geometry". These extra flags and capabilities make the application take different code paths on different platforms, which hurts the portability property of the API and makes fingerprinting easier.
>
>     Thank you,
>     Dzmitry
>
>     On 8/6/19 3:55 AM, Kevin Rogovin wrote:
>     > Hi,
>     >
>     > I have a number of feature requests which are quite important for my employer's use cases.
>     >
>     > First the easiest ones:
>     >
>     > 1. Dual source blending, i.e. add the blend modes: "src1-color", "one-minus-src1-color", "src1-alpha", "one-minus-src1-alpha", "src1-alpha-saturated". Each of these has a direct analogue in Vulkan, Metal and Direct3D12.
>     >
>     > 2. Add HW clip planes where a query states how many hardware clip planes are supported. It is OK if the return value is 0. In particular, if the GPU does not support HW clip planes from its API, it should return 0. I have quite a few cases where knowing if HW clip planes are available can change my rendering strategy and improve GPU efficiency significantly. Lastly, using discard to emulate HW clip planes can have a large, negative performance impact and is something I (and others) should avoid.
>     >
>     > 3. Derived pipeline state objects. Not all of the targeted APIs have this feature, but for those that do, like Vulkan, it can help. The main use case is again that if two PSOs are quite similar, then a driver can upload only the parts that are different and compute in advance what those different parts are.
>     >
>     > Now the tricky ones which require significant thought to do properly:
>     >
>     > 4. Render passes with local storage. This was something that was non-trivial in Vulkan, I admit, but the potential usefulness is significant.
>     > The basic idea is the ability to declare a value in the frag-shader as intermediate, to be read from the exact same pixel location in a later rendering pass. The big use case is for tile-based renderers, so that temporary data is never sent out to memory. This gives a large performance and power-saving boost for deferred rendering strategies.
>     >
>     > And lastly, features that not all GPUs can do, but that are game changers:
>     >
>     > 5. To *optionally* support the blend modes of KHR_blend_equation_advanced. I just want the API to have a query to ask if it is there, so that as extensions roll out for Vulkan, or as the ability to emulate it with Metal (as found on iOS) becomes available, the feature can be used if the GPU supports it. On the desktop two of the three major GPU providers have hardware support for this feature. Of the mobile GPUs, I think most have this in their GLES implementations.
>     >
>     > 6. For tile-based renderers, the ability to read the "last" value of the framebuffer at the fragment, something akin to GL_EXT_shader_framebuffer_fetch. Again, not to require this feature, but the ability to query it. Most tile-based renderers can support this on some level, and on the desktop two of the three can either do or emulate this feature. For a variety of situations, this can be a game changer to improve performance as well. On mobile, I know that at least 3 of the GPU lines out there support or can support this feature.
>     >
>     > 7. Another useful feature is an analogue of GL_ARB_fragment_shader_interlock; again two of the three desktop GPUs have HW support for this feature. For a variety of situations, this can be a game changer to improve performance as well.
>     >
>     > I would like to participate in the discussions, not just drop the above wish list. I.e. I want to help make any, or all, of the above land in WebGPU.
>     >
>     > Best Regards,
>     > -Kevin Rogovin
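
The capability-query pattern requested in item 2 of the quoted thread (report a count of hardware clip planes, with 0 meaning unsupported, and let the application pick its rendering strategy) could be sketched roughly as below. This is only an illustration under stated assumptions: `DeviceLimits`, `max_clip_distances`, `ClipStrategy`, and `choose_clip_strategy` are hypothetical names, not part of WebGPU, Vulkan, Metal, or D3D12.

```python
from dataclasses import dataclass
from enum import Enum


class ClipStrategy(Enum):
    NONE = "no clipping required"
    HARDWARE = "hardware clip distances"
    DISCARD = "discard in the fragment shader"


@dataclass(frozen=True)
class DeviceLimits:
    # Hypothetical limit: how many hardware clip planes the device reports.
    # A value of 0 means the feature is not available.
    max_clip_distances: int = 0


def choose_clip_strategy(limits: DeviceLimits, planes_needed: int) -> ClipStrategy:
    """Use hardware clipping only when the device reports enough planes."""
    if planes_needed < 0:
        raise ValueError("planes_needed must be non-negative")
    if planes_needed == 0:
        return ClipStrategy.NONE
    if planes_needed <= limits.max_clip_distances:
        return ClipStrategy.HARDWARE
    # Fallback path; the thread describes discard as the slowest option,
    # so an application may prefer a different algorithm here instead.
    return ClipStrategy.DISCARD
```

A single integer limit keeps the branching in one place: a device that reports 0 simply takes the fallback path, which matches the request that reporting no support remain acceptable.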
Received on Tuesday, 6 August 2019 17:13:33 UTC