Re: NXT design for memory barriers and buffer mapping. from Corentin Wallez on 2017-11-17 (public-gpu@w3.org from November 2017)

From: Corentin Wallez <cwallez@google.com>
Date: Fri, 17 Nov 2017 11:12:46 -0800
To: Dzmitry Malyshau <dmalyshau@mozilla.com>
Cc: public-gpu <public-gpu@w3.org>
Message-ID: <CAGdfWNPqEdWsWGRCph6Lf=LK5WnZLk7ABz2=93LNz4mT+fMbyQ@mail.gmail.com>
No problem. The reason for having "immediate transitions" is that they can be
buffered and submitted together on the next relevant queue submit, which
produces more efficient backend calls. You're right that the app could handle
this itself, but it would be non-trivial. The other reason is that when we
add multi-queue support, immediate transitions will also allow moving
resources between queues, which cannot be expressed as buffered commands.
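
To make the batching argument concrete, here is a minimal sketch, not NXT's
actual code. All names (Resource, CommandBuffer, Queue,
transition_immediately, backend_calls) are hypothetical and chosen only for
illustration. It assumes an immediate transition updates the tracked usage
right away while the backend barrier is deferred until the next submit, and
that transitions recorded in a command buffer take effect only at submit.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    name: str
    usage: str  # usage currently tracked by the API (hypothetical model)


@dataclass
class CommandBuffer:
    # Buffered transitions, recorded as (resource, new_usage) pairs.
    transitions: list = field(default_factory=list)

    def transition(self, resource, new_usage):
        # Only recorded here; nothing changes until the buffer is submitted.
        self.transitions.append((resource, new_usage))


class Queue:
    def __init__(self):
        self.pending = []        # immediate transitions not yet sent to the backend
        self.backend_calls = []  # one entry per simulated backend call

    def transition_immediately(self, resource, new_usage):
        # Semantically the usage changes now...
        resource.usage = new_usage
        # ...but the backend barrier is deferred so it can be batched.
        self.pending.append((resource.name, new_usage))

    def submit(self, command_buffer):
        # Flush every deferred immediate transition as one batched barrier.
        if self.pending:
            self.backend_calls.append(("barrier", list(self.pending)))
            self.pending.clear()
        # Buffered transitions take effect only at submit time.
        for resource, new_usage in command_buffer.transitions:
            resource.usage = new_usage
        recorded = [(r.name, u) for r, u in command_buffer.transitions]
        self.backend_calls.append(("execute", recorded))


vertices = Resource("vertices", "transfer_dst")
indices = Resource("indices", "transfer_dst")
queue = Queue()

# Two immediate transitions: visible at once, no backend call yet.
queue.transition_immediately(vertices, "vertex")
queue.transition_immediately(indices, "index")

commands = CommandBuffer()
commands.transition(vertices, "storage")  # buffered, not applied yet
usage_before_submit = vertices.usage

queue.submit(commands)
usage_after_submit = vertices.usage
```

In this model the two immediate transitions reach the backend as a single
barrier at the next submit. An app doing the same by hand would need to keep
its own pending list and flush it before every submit.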

On Fri, Nov 17, 2017 at 9:29 AM, Dzmitry Malyshau <dmalyshau@mozilla.com>
wrote:

> Corentin,
>
> Thank you for clarification! It does make more sense now :)
>
> The point still stands though: is there a strong reason for having the
> "immediate transition" concept as part of the API at all, if the user can
> trivially do the same with a simple function?
>
> Thanks,
> -Dzmitry
>
>
>
> On Thu, Nov 16, 2017 at 4:56 PM, Corentin Wallez <cwallez@google.com>
> wrote:
>
>> On Thu, Nov 16, 2017 at 4:24 PM, Dzmitry Malyshau <dmalyshau@mozilla.com>
>> wrote:
>>
>>> Hi Corentin,
>>>
>>> As promised on the call, here is some feedback. This is only about
>>> memory barriers.
>>>
>>> > Image swizzling and transitions between different types of swizzling.
>>>
>>> I'm confused why you call them "swizzling". Swizzling is changing order
>>> of elements in a vector.
>>> Resource transitions can imply many things, but I don't think swizzling
>>> is involved, unless you meant some broader sense of the word?
>>>
>>
>> This has a different name in different APIs. In Vulkan these are image
>> layout transitions.
>>
>>
>>> > Each command has an immediate version as well as a buffered version.
>>>
>>> Having two different ways to transition resources feels redundant and
>>> confusing to me.
>>> As a user (or better, code reviewer), how do I reason about the code?
>>> Say, I want to know what a resource state is at some point of the
>>> execution. Somehow, not only do I need to be aware of all the associated
>>> transitions in the command buffers that are executing or executed (this is
>>> what Vulkan/D3D12 do, anyway), but I also have to consider all the
>>> "buffered" transitions. These are especially hard, given that they can
>>> happen in one of the 3 mentioned spots (instantly, at the end of some
>>> command buffer, at some queue submission).
>>>
>>> The outcome would be users calling transitions "just in case, to be
>>> sure" for everything they do, effectively questioning if they should be
>>> responsible for transitions in the first place (basically, proving
>>> Apple's position). And on the other hand, if it *is* clear to the user
>>> what the resource state is at some particular point, then they should have
>>> no trouble specifying it in the API (encoding the assumption without
>>> guessing), like what they do on Vulkan/D3D12 today.
>>>
>> Maybe this part wasn't too clear:
>>  - The "buffered transition" and "transition in command buffers" are the
>> same concept. The transitions are "buffered" because they are put in a
>> "command buffer" and their effect applies only when that command buffer is
>> submitted. Exactly the same as Vulkan.
>>  - The "immediate transition" is as if a command buffer containing a
>> single "buffered transition" was created and submitted instantly.
>>
>> Hopefully this addresses your concern about code review. The only time
>> these buffered transitions happen is at queue submit. The three spots you
>> mention are the places where validation occurs but not when the effects
>> happen.
>>
>>
>>> > No usage transition can happen in a render pass
>>>
>>> This restriction, combined with a single mutable state allowed for a
>>> resource, would prevent a case like this:
>>>   subpass A: write to UAV buffer
>>>   subpass B: uses it as an index/vertex buffer for one of the calls
>>>   dependency: A -> B
>>>
>>>
>> Sorry, we meant subpasses. Interactions with render passes are in the
>> "open questions" section.
>>
>>
>>> > If WebGPU has Vulkan style tile control, usage transitions inside
>>> render passes won’t be possible because they semantically apply instantly.
>>>
>>> I don't understand this part. Could you elaborate?
>>>
>> NXT's "usage transitions" semantically apply instantly, which doesn't
>> work well when subpasses in a render pass can be reordered.
>>
>>
>>> > Vulkan-style memory barriers without validation
>>> > Would be expensive to add back validation.
>>>
>>> I could argue that adding validation is easier than inferring the
>>> transitions in the first place. I'll work on a concrete algorithmic
>>> proposal to track resource transitions in order to support that hypothesis.
>>>
>> It would be extremely similar to what NXT is doing.
>>
>>
>>> > Implicit memory barriers (like Metal)
>>> > Arguably no simpler than what’s presented here
>>>
>>> Hmm, I disagree. It is much simpler when API exposes no concept of
>>> transitions.
>>>
>>
>>> > Implementation overhead increases a lot on D3D12 and Vulkan
>>>
>>> I wish we had any quantitative metric to support that.
>>>
>>> Cheers,
>>> Dzmitry
>>>
>>> On Tue, Nov 14, 2017 at 11:51 PM, Corentin Wallez <cwallez@google.com>
>>> wrote:
>>>
>>>> Hey all,
>>>>
>>>> We wrote some documents to help everyone reason about NXT's proposals
>>>> for memory barriers and resource upload/download. Unfortunately we still
>>>> don't have a fleshed out proposal that minimizes the number of copies on
>>>> UMA. Instead the docs focus on explaining our current design for resource
>>>> upload/download and for memory barriers since they are closely tied.
>>>> Eventually we'll have these docs in MarkDown in some repo, either WebGPU's
>>>> or NXT's.
>>>>
>>>>    - NXT "memory barriers"
>>>>    <https://docs.google.com/document/d/1k7lPmxP7M7MMQR4g210lNC5TPwmXCMLgKOQWNiuJxzA>
>>>>    <- Please read this first as buffer mapping depends on it.
>>>>    - NXT buffer mapping
>>>>    <https://docs.google.com/document/d/1HFzMMvDGHFtTgjNT0j-0SQ1fNU9R7woZ4JuNJdAXBjg>
>>>>
>>>> Cheers,
>>>>
>>>> Corentin
>>>>
>>>
>>>
>>
>
Received on Friday, 17 November 2017 19:13:31 UTC