Re: NXT design for memory barriers and buffer mapping.

From: Dzmitry Malyshau <dmalyshau@mozilla.com>
Date: Fri, 17 Nov 2017 12:29:14 -0500
Message-ID: <CAHnMvn+YT-isPt2Ly=A=vcY+BZS_6p-4gkVVwzM=1fubvmC+RA@mail.gmail.com>
To: Corentin Wallez <cwallez@google.com>
Cc: public-gpu <public-gpu@w3.org>

Thank you for the clarification! It does make more sense now :)

The point still stands though: is there a strong reason for having the
"immediate transition" concept as part of the API at all if the user can
trivially do the same with a simple function?
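
To illustrate the kind of helper meant here, below is a minimal sketch in
Python. Every name in it (Resource, CommandBuffer, Queue, transition_now) is
made up for illustration and is not NXT's actual API. It assumes the model
described in the quoted reply: a buffered transition is recorded in a command
buffer and only takes effect at queue submit, and an "immediate transition"
behaves like a command buffer with a single transition that is submitted
right away.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    usage: str  # current usage, e.g. "transfer_dst" or "vertex" (hypothetical values)


@dataclass
class CommandBuffer:
    # Recorded (resource, new_usage) pairs; nothing takes effect until submit.
    transitions: list = field(default_factory=list)

    def transition(self, resource, new_usage):
        """Record a buffered transition without applying it."""
        self.transitions.append((resource, new_usage))


class Queue:
    def submit(self, command_buffer):
        """Apply the recorded transitions; this is the only place state changes."""
        for resource, new_usage in command_buffer.transitions:
            resource.usage = new_usage


def transition_now(queue, resource, new_usage):
    """User-side stand-in for an "immediate transition" (hypothetical helper)."""
    cb = CommandBuffer()
    cb.transition(resource, new_usage)
    queue.submit(cb)
```

With such a helper in user code, calling transition_now(queue, buf, "vertex")
would have the same observable effect as an API-level immediate transition,
under the assumptions above.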


On Thu, Nov 16, 2017 at 4:56 PM, Corentin Wallez <cwallez@google.com> wrote:

> On Thu, Nov 16, 2017 at 4:24 PM, Dzmitry Malyshau <dmalyshau@mozilla.com>
> wrote:
>> Hi Corentin,
>> As promised on the call, here is some feedback. This is only about memory
>> barriers.
>> > Image swizzling and transitions between different types of swizzling.
>> I'm confused why you call them "swizzling". Swizzling is changing order
>> of elements in a vector.
>> Resource transitions can imply many things, but I don't think swizzling
>> is involved, unless you meant some broader sense of the word?
> This has a different name in different APIs. In Vulkan these are image
> layout transitions.
>> > Each command has an immediate version as well as a buffered version.
>> Having two different ways to transition resources feels redundant and
>> confusing to me.
>> As a user (or better, code reviewer), how do I reason about the code?
>> Say, I want to know what a resource state is at some point of the
>> execution. Not only do I need to be aware of all the associated
>> transitions in the command buffers that are executing or executed (this is
>> what Vulkan/D3D12 do, anyway), but I also have to consider all the
>> "buffered" transitions. These are especially hard, given that they can
>> happen in one of the 3 mentioned spots (instantly, at the end of some
>> command buffer, at some queue submission).
>> The outcome would be users calling transitions "just in case, to be sure"
>> for everything they do, effectively questioning if they should be
>> responsible for transitions in the first place (basically, proving
>> Apple's position). On the other hand, if it *is* clear to the user
>> what the resource state is at some particular point, then they should have
>> no trouble specifying it in the API (encoding the assumption without
>> guessing), like what they do on Vulkan/D3D12 today.
> Maybe this part wasn't too clear:
>  - The "buffered transition" and "transition in command buffers" are the
> same concept. The transitions are "buffered" because they are put in a
> "command buffer" and their effect applies only when that command buffer is
> submitted. Exactly the same as Vulkan.
>  - The "immediate transition" is as if a command buffer containing a
> single "buffered transition" was created and submitted instantly.
> Hopefully this changes your concern about code review. The only time when
> these buffered transitions happen is at queue submit. The three spots you
> mention are the places where validation occurs but not when the effects
> happen.
>> > No usage transition can happen in a render pass
>> This restriction, combined with a single mutable state allowed for a
>> resource, would prevent a case like this:
>>   subpass A: write to UAV buffer
>>   subpass B: uses it as an index/vertex buffer for one of the calls
>>   dependency: A -> B
> Sorry, we meant subpasses. Interactions with render passes are in the "open
> questions" section.
>> > If WebGPU has Vulkan style tile control, usage transitions inside
>> render passes won’t be possible because they semantically apply instantly.
>> I don't understand this part. Could you elaborate?
> NXT's "usage transitions" semantically apply instantly, which doesn't work
> great when subpasses in a render pass can be reordered.
>> > Vulkan-style memory barriers without validation
>> > Would be expensive to add back validation.
>> I could argue that adding validation is easier than inferring the
>> transitions in the first place. I'll work on a concrete algorithmic
>> proposal to track resource transitions in order to support that hypothesis.
>> It would be extremely similar to what NXT is doing.
>> > Implicit memory barriers (like Metal)
>> > Arguably no simpler than what’s presented here
>> Hmm, I disagree. It is much simpler when the API exposes no concept of
>> transitions.
>> > Implementation overhead increases a lot on D3D12 and Vulkan
>> I wish we had any quantitative metric to support that.
>> Cheers,
>> Dzmitry
>> On Tue, Nov 14, 2017 at 11:51 PM, Corentin Wallez <cwallez@google.com>
>> wrote:
>>> Hey all,
>>> We wrote some documents to help everyone reason about NXT's proposals for
>>> memory barriers and resource upload/download. Unfortunately we still don't
>>> have a fleshed-out proposal that minimizes the number of copies on UMA.
>>> Instead the docs focus on explaining our current design for resource
>>> upload/download and for memory barriers, since the two are closely tied.
>>> Eventually we'll have these docs in Markdown in some repo, either WebGPU's
>>> or NXT's.
>>>    - NXT "memory barriers"
>>>    <https://docs.google.com/document/d/1k7lPmxP7M7MMQR4g210lNC5TPwmXCMLgKOQWNiuJxzA>
>>>    <- Please read this first as buffer mapping depends on it.
>>>    - NXT buffer mapping
>>>    <https://docs.google.com/document/d/1HFzMMvDGHFtTgjNT0j-0SQ1fNU9R7woZ4JuNJdAXBjg>
>>> Cheers,
>>> Corentin
