Re: Pipeline objects open questions from Corentin Wallez on 2017-09-12 (public-gpu@w3.org from September 2017)

From: Corentin Wallez <cwallez@google.com>
Date: Tue, 12 Sep 2017 16:09:49 -0400
To: "Myles C. Maxfield" <mmaxfield@apple.com>
Cc: public-gpu <public-gpu@w3.org>
Message-ID: <CAGdfWNP7WPpeQkZAadEBgkHGtwi0MOReK5tjUUPGP7c7bJ--Yg@mail.gmail.com>
On Mon, Sep 11, 2017 at 11:44 PM, Myles C. Maxfield <mmaxfield@apple.com>
wrote:

>
>
> On Sep 8, 2017, at 12:24 PM, Corentin Wallez <cwallez@google.com> wrote:
>
> Hey all,
>
> While what goes into pipeline objects is mostly clear (see this doc
> <https://github.com/gpuweb/gpuweb/blob/master/design/Pipelines.md>),
> there is still a bunch of open questions:
>
>    - How do we take advantage of the pipeline caching present in D3D12
>    and Vulkan? Do we expose it to the application or is it done magically in
>    the WebGPU implementation?
>
> The desire here is valid. Metal doesn’t have any concept of a pipeline
> cache, but it does have an intermediate form that shaders can be in (i.e.
> if you create an Xcode project and add a .metal file to it, Xcode will
> compile to this at build-time). Caching these would speed up the
> initialization of a WebGPU webapp.
>
> Would there be a way for browsers to use these tools at runtime? The Metal
API doesn't have a way to get a binary representation of the MTLLibraries.


> However, both D3D12 and Vulkan expose this by letting the application
> access the raw bytes of the cache (presumably so the application can
> serialize them to disk). If these objects are going to hold machine code
> for compiled shaders, letting them round-trip through arbitrary JavaScript
> would be unacceptable from a security perspective. Instead, the only way
> this could work is if the webapp was delivered an opaque handle (or cookie)
> which had no intrinsic meaning, but was used as a key in an internal map
> the browser maintains (and the browser prunes at implementation-defined
> times with implementation-dependent pruning criteria).
>
> I can't find a proper reference to this, but my understanding it that
IndexedDB allows storing opaque objects and retrieving them. Vulkan and
D3D12's data (and Metal's if possible) could be stored in it securely. The
opaque object could also contain data necessary to invalidate the cache,
like a driver version, WebGPU library version etc.


> But if the browser is going to hold on to these objects internally anyway,
> there isn’t really any value in going through the webapp at all. Instead of
> making the webapp author write new code to get performance, we should just
> make fast performance the default. These pipeline caches would work best if
> the browser internally always used them for all WebGPU apps.
>

I agree with this: D3D12, Metal and Vulkan have different caching mechanism
and the WebGPU implementation would be able to make the best use of it,
compared to web apps. I'm thinking for example of a Vulkan backend that
could have a pipeline derivative per (vs, fs) combination, while in D3D12
it would re-use D3D12_SHADER_BYTECODE and only use the cache if the
pipelines match exactly.

With regard to your test with strip cut index, I think it is dangerous to
> rely on a non-specced behavior like that.


The problem is the Metal always enables primitive restart, so to have
consistent behavior we need it enabled all the time on D3D12 and Vulkan
too. This way we won't have application using the 0xFFFF index in 16bit and
have it produce different results on different platforms. If there was a
masking mechanism like Dzmitry mentioned, it would be best as we wouldn't
have to encode the index type in the pipeline. Not addressing primitive
restart for the MVP sounds fine, but it will be important for a 1.0.

> Should the vertex attributes somehow be included in the PipelineLayout so
> vertex buffers are treated as other resources and changed in bulk with them?
>


I don't think we should try to innovate here as opposed to just providing
> what D3D12/Vulkan/Metal have (and they don't have vertex attributes in the
> layout/signature).


Understood. I found this while developing NXT I was thinking it would help
reduce the number of code paths and slightly reduce the CPU overhead of
WebGPU.

There are multiple kinds of sample counts. There is a sample count of an
> image, defining the actual storage properties. The framebuffer (containing
> images) then implicitly carries that property (sample count of the storage).
> There is a sample count of the rasterizer, defining the fixed function
> state, and like all the other fixed function state it should be in the
> pipeline descriptor/state. So my answer would be "no".
>

What happens when the sample count of the rasterizer is different from the
sample count of the attachments? If some cases are valid I agree we should
have it specified separately in the pipeline state.

> It sounds like you’re asking for us to choose between two cases:
>
> I see the question caused some confusion. Depth bounds test != depth test.
> I voted for both to be explicit, potentially by using WebIDL optional
> dictionary entries/default values semantics.


Yeah some bullet contained two unrelated questions. I think the first one
was about Metal not having "depth bounds" test as far as I understand.

The second part is about how disabling depth test being the same as using
the "always" test function. Do we want to have redundant information that
would produce the same thing with (depthTestEnable = true, depthCompare =
always) and (depthTestEnable = false, depthCompare = whatever). Or do we
just want to have depthCompare?

If I understand correctly everyone agrees we should have an explicit
"independentBlend". If it is set to false, then blend[0] should be used for
all attachments?
Received on Tuesday, 12 September 2017 20:10:34 UTC