Re: Pipeline objects open questions from Myles C. Maxfield on 2017-09-12 (public-gpu@w3.org from September 2017)

From: Myles C. Maxfield <mmaxfield@apple.com>
Date: Tue, 12 Sep 2017 13:43:19 -0700
To: Corentin Wallez <cwallez@google.com>
Cc: public-gpu <public-gpu@w3.org>
Message-id: <613BDBB0-33D9-4477-BFAC-556EA65C4C9C@apple.com>
> On Sep 12, 2017, at 1:09 PM, Corentin Wallez <cwallez@google.com> wrote:
> 
> On Mon, Sep 11, 2017 at 11:44 PM, Myles C. Maxfield <mmaxfield@apple.com <mailto:mmaxfield@apple.com>> wrote:
> 
> 
> On Sep 8, 2017, at 12:24 PM, Corentin Wallez <cwallez@google.com <mailto:cwallez@google.com>> wrote:
> 
>> Hey all,
>> 
>> While what goes into pipeline objects is mostly clear (see this doc <https://github.com/gpuweb/gpuweb/blob/master/design/Pipelines.md>), there is still a bunch of open questions:
>> How do we take advantage of the pipeline caching present in D3D12 and Vulkan? Do we expose it to the application or is it done magically in the WebGPU implementation?
> 
> The desire here is valid. Metal doesn’t have any concept of a pipeline cache, but it does have an intermediate form that shaders can be in (i.e. if you create an Xcode project and add a .metal file to it, Xcode will compile to this at build-time). Caching these would speed up the initialization of a WebGPU webapp.
> 
> Would there be a way for browsers to use these tools at runtime? The Metal API doesn't have a way to get a binary representation of the MTLLibraries.

The Xcode SDK includes /usr/bin/metal and /usr/bin/metallib.

>  
> However, both D3D12 and Vulkan expose this by letting the application access the raw bytes of the cache (presumably so the application can serialize them to disk). If these objects are going to hold machine code for compiled shaders, letting them round-trip through arbitrary JavaScript would be unacceptable from a security perspective. Instead, the only way this could work is if the webapp was delivered an opaque handle (or cookie) which had no intrinsic meaning, but was used as a key in an internal map the browser maintains (and the browser prunes at implementation-defined times with implementation-dependent pruning criteria).
> 
> I can't find a proper reference to this, but my understanding it that IndexedDB allows storing opaque objects and retrieving them. Vulkan and D3D12's data (and Metal's if possible) could be stored in it securely. The opaque object could also contain data necessary to invalidate the cache, like a driver version, WebGPU library version etc.

See Fil’s answer above.

>  
> But if the browser is going to hold on to these objects internally anyway, there isn’t really any value in going through the webapp at all. Instead of making the webapp author write new code to get performance, we should just make fast performance the default. These pipeline caches would work best if the browser internally always used them for all WebGPU apps.
> 
> I agree with this: D3D12, Metal and Vulkan have different caching mechanism and the WebGPU implementation would be able to make the best use of it, compared to web apps. I'm thinking for example of a Vulkan backend that could have a pipeline derivative per (vs, fs) combination, while in D3D12 it would re-use D3D12_SHADER_BYTECODE and only use the cache if the pipelines match exactly.
> 
> With regard to your test with strip cut index, I think it is dangerous to rely on a non-specced behavior like that.
> 
> The problem is the Metal always enables primitive restart, so to have consistent behavior we need it enabled all the time on D3D12 and Vulkan too. This way we won't have application using the 0xFFFF index in 16bit and have it produce different results on different platforms. If there was a masking mechanism like Dzmitry mentioned, it would be best as we wouldn't have to encode the index type in the pipeline. Not addressing primitive restart for the MVP sounds fine, but it will be important for a 1.0.

Or, for the MVP, we could all just iterate through index buffers and make sure the sentinel value never appears. (And add the functionality back in later.)

> 
> > Should the vertex attributes somehow be included in the PipelineLayout so vertex buffers are treated as other resources and changed in bulk with them?
>  
> I don't think we should try to innovate here as opposed to just providing what D3D12/Vulkan/Metal have (and they don't have vertex attributes in the layout/signature).
> 
> Understood. I found this while developing NXT I was thinking it would help reduce the number of code paths and slightly reduce the CPU overhead of WebGPU.
> 
> There are multiple kinds of sample counts. There is a sample count of an image, defining the actual storage properties. The framebuffer (containing images) then implicitly carries that property (sample count of the storage).
> There is a sample count of the rasterizer, defining the fixed function state, and like all the other fixed function state it should be in the pipeline descriptor/state. So my answer would be "no".
> 
> What happens when the sample count of the rasterizer is different from the sample count of the attachments? If some cases are valid I agree we should have it specified separately in the pipeline state.

Metal disallows this.

> 
> > It sounds like you’re asking for us to choose between two cases:
> 
> I see the question caused some confusion. Depth bounds test != depth test.
> I voted for both to be explicit, potentially by using WebIDL optional dictionary entries/default values semantics.
> 
> Yeah some bullet contained two unrelated questions. I think the first one was about Metal not having "depth bounds" test as far as I understand.

Metal not having a configurable depth bounds test should make our choice for WebGPU quite clear.

> 
> The second part is about how disabling depth test being the same as using the "always" test function. Do we want to have redundant information that would produce the same thing with (depthTestEnable = true, depthCompare = always) and (depthTestEnable = false, depthCompare = whatever). Or do we just want to have depthCompare?

depthCompare seems simpler, but I don’t think it matters very much.

> 
> If I understand correctly everyone agrees we should have an explicit "independentBlend". If it is set to false, then blend[0] should be used for all attachments?

We shouldn't require web app author to specify redundant information.
Received on Tuesday, 12 September 2017 20:43:43 UTC