Re: Pipeline objects open questions from Ben Constable on 2017-09-13 (public-gpu@w3.org from September 2017)

From: Ben Constable <bencon@microsoft.com>
Date: Wed, 13 Sep 2017 02:26:02 +0000
To: Dzmitry Malyshau <dmalyshau@mozilla.com>
CC: "Myles C. Maxfield" <mmaxfield@apple.com>, Corentin Wallez <cwallez@google.com>, public-gpu <public-gpu@w3.org>
Message-ID: <CY4PR21MB01527C5BB7F65991680515BACA6E0@CY4PR21MB0152.namprd21.prod.outlook.com>

I think that people porting from D3D12 or Vulkan will be specifying the cut value, because they have to. D3D11 had no equivalent of "none" so I think most low-level pipeline code is used to dealing with the strip cut value. And anybody doing that will know what format their buffers are.


From a "size of struct" level of things, we are replacing one enum with another in D3D12 / Vulkan. And by doing so, we can have a single set of code that works on all three APIs.


You are right that people won't always know this ahead of time, but if you look at the pipeline state, it already asks you to fill in the input layout for your vertex buffers and the primitive type. I figure that for any given mesh you are creating, the additional bit of data about the index buffer is probably known at that time. At the time you are saying "My vertex buffer has 3 sets of vector3s" you can also probably say "my index buffer is 16-bit UINTs".
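
As a rough illustration of the point above, here is a minimal Python sketch in which a pipeline descriptor carries the index format alongside the vertex layout and primitive type, and the strip cut value is derived from it. Every name in it (`IndexFormat`, `strip_cut_value`, `PipelineDescriptorSketch` and its fields) is hypothetical and not part of any proposed API.

```python
from dataclasses import dataclass
from enum import Enum


class IndexFormat(Enum):
    """Hypothetical index formats; the value is the index width in bits."""
    UINT16 = 16
    UINT32 = 32


def strip_cut_value(index_format: IndexFormat) -> int:
    """Return the all-ones index for the given width (0xFFFF or 0xFFFFFFFF)."""
    return (1 << index_format.value) - 1


@dataclass(frozen=True)
class PipelineDescriptorSketch:
    """Hypothetical descriptor: the index format sits next to the other
    per-mesh facts (primitive type, vertex layout) given at creation time."""
    primitive_type: str
    vertex_attribute_formats: tuple
    index_format: IndexFormat


desc = PipelineDescriptorSketch(
    primitive_type="triangle-strip",
    vertex_attribute_formats=("float3", "float3", "float3"),
    index_format=IndexFormat.UINT16,
)
cut = strip_cut_value(desc.index_format)
```

With the format stored in the descriptor, a backend could pick the matching cut value itself instead of tracking which index buffer is bound at draw time.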


________________________________
From: Dzmitry Malyshau <dmalyshau@mozilla.com>
Sent: Tuesday, September 12, 2017 7:01:59 PM
To: Ben Constable
Cc: Myles C. Maxfield; Corentin Wallez; public-gpu
Subject: Re: Pipeline objects open questions

Ben,

>The test you mention tests for the case you are asking about (using 0xFFFFFFFF with 16 bit indices) and it says that it should not work. Do you mind telling me what your HW / Driver / OS combo is? I am curious how you are seeing this case work.

Thanks for the info!
I'll double check my test and provide a RenderDoc capture with system info.

> Your feedback about the level of documentation combined with the availability of a detailed spec and the HLK source is taken. I will investigate how we can open up the appropriate things here to make it easier for people to inspect things.

Please excuse my sarcastic comment. This is not the best venue to complain about DX12 documentation.

> On this particular point, the design is that the cut value has to match the format to work correctly. I believe that having the index buffer format be part of the pipeline state can solve this problem without having to do a lot of state tracking and will work on all APIs.

This is a little bit unfortunate, given that neither API requires the format of the index buffer, and we just happen to depend on it because of the strip cut value. For instance, it may introduce more headaches for future developers trying to port their applications to GPUWeb, since they may not know ahead of time what sort of index buffer is used, or may even use different index formats with the same PSO.

Regards,
Dzmitry


On Tue, Sep 12, 2017 at 7:11 PM, Ben Constable <bencon@microsoft.com> wrote:
The test you mention tests for the case you are asking about (using 0xFFFFFFFF with 16 bit indices) and it says that it should not work. Do you mind telling me what your HW / Driver / OS combo is? I am curious how you are seeing this case work.

Your feedback about the level of documentation combined with the availability of a detailed spec and the HLK source is taken. I will investigate how we can open up the appropriate things here to make it easier for people to inspect things.

On this particular point, the design is that the cut value has to match the format to work correctly. I believe that having the index buffer format be part of the pipeline state can solve this problem without having to do a lot of state tracking and will work on all APIs.

From: Dzmitry Malyshau [mailto:dmalyshau@mozilla.com]
Sent: Monday, September 11, 2017 6:31 PM
To: Myles C. Maxfield <mmaxfield@apple.com>
Cc: Corentin Wallez <cwallez@google.com>; public-gpu <public-gpu@w3.org>
Subject: Re: Pipeline objects open questions

Ben,

> With regard to your test with strip cut index, I think it is dangerous to rely on a non-specced behavior like that.

If only, you know, we had a "specced" behavior defined for D3D12. The MSDN pages are rather scarce on details (https://msdn.microsoft.com/en-us/library/windows/desktop/dn986732(v=vs.85).aspx). If I read exactly what it says, then my test case should not have worked.
My understanding is that the best specification of D3D12 is the HLK test suite. Could you check the source of the corresponding test (https://msdn.microsoft.com/en-us/library/windows/hardware/dn942192(v=vs.85).aspx) to see if the case using 0xFFFFFFFF value with 16-bit buffer is covered?
I'm hoping to see this behavior analogous to stencil tests. One can have a reference value of 0x34 matching the stencil buffer value of 0x04 if the corresponding stencil read mask is 0x0F, for example.
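
To make the analogy concrete, here is a small Python sketch of a masked comparison. The stencil numbers are the ones from the paragraph above; applying the same masking to the strip cut value is only the hoped-for behavior, not something the D3D12 documentation is known to promise. `masked_equal` and `INDEX16_MASK` are names made up for this sketch.

```python
def masked_equal(reference: int, stored: int, read_mask: int) -> bool:
    """Compare two values after applying the same read mask to both."""
    return (reference & read_mask) == (stored & read_mask)


# Stencil case: 0x34 & 0x0F == 0x04 and 0x04 & 0x0F == 0x04, so they match.
stencil_match = masked_equal(0x34, 0x04, 0x0F)

# Hypothetical strip cut analogue: mask the 32-bit cut value down to the
# width of a 16-bit index before comparing it with the stored index.
INDEX16_MASK = 0xFFFF
cut_match = masked_equal(0xFFFFFFFF, 0xFFFF, INDEX16_MASK)
```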

> Are we wanting to allow multiple index types in the MVP (both 16 and 32)? I realize that there are bandwidth limitations etc driving people’s decisions here, but I am skeptical that this design point is something that gives us lock-in if we just choose 32 bit for the time being.

It's a bit important since it affects the pipeline creation API. Surely we can (and will) break the API after MVP, but we'd better have strong reasons for it.

Myles,

> Also, I’m not sure how Dzmitry fit 0xFFFFFFFF into a u16? Maybe I’m misunderstanding.

My index buffer is 16-bit and has a value of 0xFFFF. I used `D3D12_INDEX_BUFFER_STRIP_CUT_VALUE_0xFFFFFFFF` for PSO creation and got it respecting the 16-bit value.
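
For reference, the following Python sketch models what a strictly literal comparison would do with such a buffer. It is a toy model of primitive restart, not a description of any driver; `split_strips` is a hypothetical helper.

```python
def split_strips(indices, cut_value):
    """Split an index list into strips wherever an index equals cut_value."""
    strips, current = [], []
    for index in indices:
        if index == cut_value:
            # The cut index ends the current strip and is not drawn itself.
            if current:
                strips.append(current)
            current = []
        else:
            current.append(index)
    if current:
        strips.append(current)
    return strips


# A 16-bit index buffer containing one 0xFFFF entry.
indices16 = [0, 1, 2, 3, 0xFFFF, 4, 5, 6]

matched = split_strips(indices16, 0xFFFF)         # cut value matches the format
mismatched = split_strips(indices16, 0xFFFFFFFF)  # literal 32-bit cut value
```

Under this model the matching cut value yields two strips, while the 32-bit value never equals any 16-bit index and leaves a single strip that still contains 0xFFFF. The result I observed corresponds to the first case even though the PSO was created with the second value.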

> Also similar to above, I’m not sure if alpha-to-coverage is necessary for the MVP.

Do you find it controversial? If not, let's have it in the MVP.

> It sounds like you’re asking for us to choose between two cases:

I see the question caused some confusion. Depth bounds test != depth test.
I voted for both to be explicit, potentially by using WebIDL optional dictionary entries/default values semantics.

Regards,
Dzmitry

On Mon, Sep 11, 2017 at 8:30 PM, Myles C. Maxfield <mmaxfield@apple.com> wrote:



On Sep 8, 2017, at 12:24 PM, Corentin Wallez <cwallez@google.com> wrote:

Hey all,

While what goes into pipeline objects is mostly clear (see this doc <https://github.com/gpuweb/gpuweb/blob/master/design/Pipelines.md>), there are still a bunch of open questions:

  *   How do we take advantage of the pipeline caching present in D3D12 and Vulkan? Do we expose it to the application or is it done magically in the WebGPU implementation?
No comment from us right now. We need more time to discuss internally.

  *   Should the type of the indices be set in RenderPipelineDescriptor? If not, how is the D3D12 IBStripCutValue chosen?
I’m not sure that triangle strips are necessary in the first place for the MVP.

Also, I’m not sure how Dzmitry fit 0xFFFFFFFF into a u16? Maybe I’m misunderstanding.

  *   Should the vertex attributes somehow be included in the PipelineLayout so vertex buffers are treated as other resources and changed in bulk with them?
I agree with the comments previously made here.


  *   Does the sample count of the pipeline state come from the RenderPass too?
I’m not sure what you mean by “come from.” Are you asking whether or not the render pass must include the sample count in addition to the pipeline state including the same sample count? If you are asking that, I don’t see why we should require redundant information. The sample count must at least be present in the pipeline state because that’s where Metal looks when it needs to configure the rasterizer.


  *   Should enablement of independent attachment blend state be explicit like in D3D12 or implicit? Should alpha to coverage be part of the multisample state or the blend state?
Similar to above, there’s no reason to require redundant information.

Also similar to above, I’m not sure if alpha-to-coverage is necessary for the MVP.


  *   What about Vulkan’s VkPipelineDepthStencilStateCreateInfo::depthBoundsTestEnable and D3D12's D3D12_DEPTH_STENCIL_DESC1::DepthBoundsTestEnable? Should “depth test enable” be implicit or explicit?
I want to be clear about what you’re asking. It sounds like you’re asking for us to choose between two cases:

  1.  A boolean variable (depthBoundsTestEnable), which determines whether or not the implementation cares about another n-state variable (less-than, less-than-or-equal-to, equal, etc.)
  2.  An (n+1)-state variable (never, less-than, less-than-or-equal-to, equal, etc.)

I don’t think it matters very much, but option 2. seems cleaner.
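
The two encodings, as this reply reads the question, can be sketched in Python as follows. All type and member names are hypothetical, only three comparison functions are listed for brevity, and `OFF` is just a placeholder name for the extra state in option 2.

```python
from dataclasses import dataclass
from enum import Enum


class CompareFunction(Enum):
    """The n-state variable from option 1 (abbreviated to three states)."""
    LESS = "less"
    LESS_EQUAL = "less-equal"
    EQUAL = "equal"


@dataclass(frozen=True)
class DepthTestOption1:
    """Option 1: a boolean plus an n-state compare function."""
    enabled: bool
    compare: CompareFunction


class DepthTestOption2(Enum):
    """Option 2: a single (n+1)-state variable."""
    OFF = "off"
    LESS = "less"
    LESS_EQUAL = "less-equal"
    EQUAL = "equal"


def to_option2(state: DepthTestOption1) -> DepthTestOption2:
    """Collapse option 1 into option 2; compare is ignored when disabled."""
    if not state.enabled:
        return DepthTestOption2.OFF
    return DepthTestOption2[state.compare.name]


a = to_option2(DepthTestOption1(False, CompareFunction.LESS))
b = to_option2(DepthTestOption1(False, CompareFunction.EQUAL))
c = to_option2(DepthTestOption1(True, CompareFunction.EQUAL))
```

Option 1 has 2n representable combinations, n of which mean the same thing, whereas option 2 has exactly one value per distinct behavior.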


What do you all think about these?

Corentin
Received on Wednesday, 13 September 2017 02:26:28 UTC