Re: RFC: Re: shader IR vs. application IR from Maciej Stachowiak on 2017-08-16 (public-gpu@w3.org from August 2017)

From: Maciej Stachowiak <mjs@apple.com>
Date: Wed, 16 Aug 2017 08:48:44 -0700
To: Kenneth Russell <kbr@google.com>
Cc: David Neto <dneto@google.com>, "Myles C. Maxfield" <mmaxfield@apple.com>, public-gpu@w3.org, Dean Jackson <dino@apple.com>
Message-id: <8A5642D9-2C99-45A4-9A80-12113B38B1E6@apple.com>
I was prompted by off-list comments to read the SPIR-V spec and its explanation of Logical Addressing Mode: <https://www.khronos.org/registry/spir-v/specs/1.2/SPIRV.html>.

It seems like SPIR-V's Logical Addressing Mode removes many of the obvious ways to do out-of-bounds reads or writes. It stops you from storing pointers and doing arithmetic on them, and it disables some opcodes that let you read or write with a program-controlled size, or get pointers to arbitrary locations. Many of these require the "Addresses" capability, which is disabled in Logical Addressing Mode.

However, the OpAccessChain operation, which lets you get a pointer to an arbitrary non-bounds-checked offset into a composite object, does not require the Addresses capability. It does not require any capability at all, which means there is no standard SPIR-V mechanism to disable or limit it. This is despite the existence of OpInBoundsAccessChain, which seems to do the same thing, but bounds-checked.[1]

There may be other operations that allow reading or writing memory without bounds checking. For example, OpImageRead, OpImageWrite, and related OpImage ops are not specified to do bounds checks, at least not obviously. Another example: OpVectorExtractDynamic, OpVectorInsertDynamic and related ops are not defined to bounds check, and explicitly specify that what is read or written is undefined if the index is out of bounds.
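
For illustration only, here is a minimal sketch of the kind of injected check that a safe variant of these dynamic-index ops would need. It is modeled in Python rather than SPIR-V, and the names clamp_index and checked_extract are hypothetical; nothing here comes from the SPIR-V spec or from any existing tool. Clamping is just one possible policy (trapping or returning zero are others).

```python
def clamp_index(index: int, length: int) -> int:
    """Force index into [0, length - 1].

    Hypothetical helper standing in for code a translator would inject
    before a dynamically indexed access; not a SPIR-V operation.
    """
    if length <= 0:
        raise ValueError("composite must have at least one element")
    return max(0, min(index, length - 1))


def checked_extract(vector: list, index: int):
    """Model of a dynamic extract whose index is clamped before use."""
    return vector[clamp_index(index, len(vector))]
```

With this policy, checked_extract([10, 20, 30, 40], 99) yields the last element instead of an undefined value. The point is that the check has to be generated by whatever consumes the shader, since the ops themselves do not specify it.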


Maybe I'm missing a simple reason why these ops are actually safe, or an easy way to add bounds checking. But I tentatively conclude that SPIR-V does not provide the required level of memory safety out of the box, not even in Logical Addressing Mode. 

Therefore, as far as I can tell, my argument still stands. I believe we would need to significantly modify the SPIR-V format and SPIR-V implementations to achieve full memory safety. That's not to say it can't be done, but I don't think we should expect the resulting format to work with stock SPIR-V tools.

Regards,
Maciej


[1] I'm not absolutely sure that OpInBoundsAccessChain is bounds-checked because the definition is vague. It says: "Has the same semantics as OpAccessChain, with the addition that the resulting pointer is known to point within the base object." Does "known to" mean that the implementation bounds-checks and clamps the pointer to be within bounds (presumably with enough space left for the pointed-to type)? Or does it mean that the caller of this opcode promises that the pointer is within bounds? I'm assuming the former but it's not entirely clear.


> On Aug 15, 2017, at 2:41 PM, Maciej Stachowiak <mjs@apple.com> wrote:
> 
> 
> 
>> On Aug 15, 2017, at 2:29 PM, Kenneth Russell <kbr@google.com> wrote:
>> 
>> On Tue, Aug 15, 2017 at 1:57 PM, Maciej Stachowiak <mjs@apple.com> wrote:
>> 
>> Let's say we started with SPIR-V to make WebSPIR-V with at least the following changes:
>> 
>> - Some different APIs and interfaces are exposed than you would expect in a WebGL or Vulkan shader.
>> - At compile-time we generate code with runtime bounds checks and other safety features.
>> - We possibly subset SPIR-V to remove dangerous capabilities that can't be effectively guarded with bounds checks or other runtime checks.
>> 
>> If we did all these things, then it seems to me we wouldn't benefit from the existing SPIR-V ecosystem:
>> - Existing SPIR-V shaders could not be reused with WebGPU
>> - Front ends that compile to SPIR-V could not be used unmodified
>> - Existing SPIR-V back ends could not correctly process the modified language, since they wouldn't have the ability to do runtime checks.[1]
>> 
>> Is my analysis correct? If so, then what is the benefit of basing on SPIR-V?
>> 
>> This analysis is incorrect and oversimplifies the issues in order to cast SPIR-V in a negative light.
>> 
>> The problematic areas are mainly in bounds-checking accesses in the input and output buffers to compute shaders. Pipeline stages like vertex, geometry, and fragment shaders are well and securely handled by the capabilities SPIR-V exposes, since it was designed as a shading language for graphics APIs.
> 
> My point isn't to make SPIR-V look bad. I don't believe there's any shader language that meets all our requirements as-is.
> 
> My intent is only to argue that it shouldn't have a leg up based on ecosystem considerations, because we'd likely have to modify it enough that the existing SPIR-V ecosystem won't be compatible with the end result. Despite your disagreement, I still think that's probably true.
> 
> I'm not clear on how SPIR-V avoids having a bounds checking problem for shader types other than compute shaders. Are they unable to use the non-bounds-checked operations that compute shaders can use? Do they only get access to fixed-size arrays? Do they only get range-checked buffer operations? To be clear, I would consider anything that can be used to read or write arbitrary locations in video memory to be a security risk. Even if it's limited to the video memory of a specific process.
> 
> If I'm missing something that makes this a non-issue, please explain.
> 
>> Instead of attempting to dismiss one particular solution a priori, what should be done is to enumerate the needed capabilities of the various kinds of shaders in WebGPU, and see what would need to be done to the starting point (Metal Shading Language, WebAssembly, DXIL, SPIR-V, a higher-level language, something brand new...) in order to have it be secure and performant. There are multiple interesting analyses, designs and comparisons to be done.
> 
> I agree that this should be the approach. I did not mean to dismiss SPIR-V, merely to argue that it should not be considered to have any particular advantage over the other options. Except perhaps on intrinsic technical merit, which I think is the point of the overall analysis. It might be that a modified version of SPIR-V is a sound technical choice whether or not it interoperates with regular SPIR-V.
> 
> Regards,
> Maciej
Received on Wednesday, 16 August 2017 15:49:27 UTC