- From: David Neto <dneto@google.com>
- Date: Wed, 16 Aug 2017 19:24:46 +0000
- To: Kenneth Russell <kbr@google.com>, Corentin Wallez <cwallez@google.com>
- Cc: Maciej Stachowiak <mjs@apple.com>, "Myles C. Maxfield" <mmaxfield@apple.com>, public-gpu@w3.org, Dean Jackson <dino@apple.com>
- Message-ID: <CAPmVsJWz5L-YAgNkteqjgeDwQT=M_J7gbLFO5T79MXMLmYngpA@mail.gmail.com>
One more thing about Logical Addressing mode: - By design, you can always statically trace a memory access back to the original object allocation (an OpVariable) or to a function parameter. For example, that's why you can't get a "pointer" as the result of an OpPhi or an OpLoad, etc. - You may have to do interprocedural analysis to trace an access in a callee. To make static analysis easier you may have to do the moral equivalent of inlining. - Recursion is not allowed, and function pointers can't be expressed, so inlining always bottoms out. If it didn't come through before: I *agree* with Maciej in that *something* has to be specified at a higher level to ensure safety. This is true for any useful language. david On Wed, Aug 16, 2017 at 3:17 PM David Neto <dneto@google.com> wrote: > I see some good progress on this thread. I'll try to amplify and clarify > certain points. > > You can think of SPIR-V as a *family* of shading languages with common > fundamentals. Or even a *medium* for expressing shading languages. Much > like how most web pages are in UTF-8, you have to say more. > There are at least 3 varieties in the wild: SPIR-V for Vulkan, for OpenGL, > and OpenCL. We can get weird and count separately every single variation > where a particular capability or extension is enabled or not, but that's a > bit obtuse. > > More seriously: > There are lots of ways to generate undefined behaviour. This is true of > any minimally general language. I appeal to the halting problem, again. > Even division by zero is suspect, but I imagine we want to keep the > division operator. > It's up to the environment to say more about what happens in those cases, > if it wants to. We all agree that for the web we must provide robust > safety guarantees. > > It it's kind of a paradox of specs, but a specification says *nothing* > about what happens in the unspecified behaviour case. That's what makes it > *unspecified.* > > Substitute any other shading language XYZ for SPIR-V and the GPUWeb > environment still has to say more about what happens in the face of > ill-behaved programs. In a later step we can make a technical comparison > about the ease, robustness, and efficiency of applying those behavioural > constraints. > > I agree with Ken that we should make our requirements explicit. So far > I've seen at least: > - Access with program-determined indices into resources, in various stages > of the graphics pipeline. (E.g. general access to UBO, SSBO, images) > - Robust behaviour in the face of ill-behaved programs: E.g. in some > cases behave as if out of bounds accesses are clamped; in other cases > maybe not, including predictable termination of the application(?) without > further side effects or information leakage. (?) > - General arithmetic. I want my division instruction. :-) > > > On a more detailed level: > > - OpVectorInsertDynamic and OpVectorExtractDynamic operate on values, not > memory. So they're an even simpler case than memory and image accesses. > > - OpInBoundsAccessChain vs. OpAccessChain. Yes, the InBounds part is a > hint. More specifically the implementation does different things depending > on whether it trusts the shader: > - In all cases program behaviour is undefined if the resulting pointer > results in an out-of-object-bounds access > - If the implementation does not trust the shader, then the > implementation treats both instructions the same way: It always inserts > bounds checks (runtime or by modifying the shader) > - If the implementation trusts the shader, *and* safety is desired, > then the bounds checks can be elided in the case of OpInBoundAccessChain > - If the implementation trusts the shader and safety is not required > at this level of the implementation, then you elide the checks in both > cases. E.g. do this if the implemention delegates safety guarantees to a > more general level of protection, e.g. MMU hardware and its OS support > > - Effectively, there are no function pointers in SPIR-V, due to validation > rules > > > > On Wed, Aug 16, 2017 at 2:19 PM Kenneth Russell <kbr@google.com> wrote: > >> On Wed, Aug 16, 2017 at 10:06 AM, Corentin Wallez <cwallez@google.com> >> wrote: >> >>> On Wed, Aug 16, 2017 at 11:48 AM, Maciej Stachowiak <mjs@apple.com> >>> wrote: >>> >>>> >>>> I was prompted by off-list comments to read the SPIR-V spec and its >>>> explanation of Logical Addressing Mode in the spec here: < >>>> https://www.khronos.org/registry/spir-v/specs/1.2/SPIRV.html>. >>>> >>>> Thanks for taking an in-depth look at the logical addressing mode. >>> >>> >>>> It seems like SPIR-V's Logical Addressing Mode removes many of the >>>> obvious ways to do out-of-bounds reads or writes. It stops you from storing >>>> pointers and doing arithmetic on them; and it disables some opcodes that >>>> let you read or write with a program-controlled size, or get pointers to >>>> arbitrary locations. Many of these require the "Addresses" capability which >>>> is disabled in Logical Addressing Mode. >>>> >>>> However, the OpAccessChain operation, which lets you get a pointer to >>>> an arbitrary non-bounds-checled offset into a composite object, does >>>> *not* require the Addresses capability. It does not require any >>>> capability at all, which means there is no standard SPIR-V mechanism to >>>> disable or limit it. This is despite the existence of >>>> OpInBoundsAccessChain, which seems to do the same thing, but >>>> bounds-checked.[1] >>>> >>>> My understanding is that OpInBoundsAccessChain is a hint from the >>> application that no bounds check are necessary, but could be completely >>> ignored. >>> >>> >>>> There may be other operations that allow reading or writing memory >>>> without bounds checking. For example, OpImageRead, OpImageWrite, and >>>> related OpImage ops are not specified to do bounds checks, at least not >>>> obviously. Another example: OpVectorExtractDynamic, OpVectorInsertDynamic >>>> and related ops are not defined to bounds check, and explicitly specify >>>> that what is read or written is undefined if the index is out of bounds. >>>> >>>> >>>> Maybe I'm missing a simple reason why these ops are actually safe, or >>>> an easy way to add bounds checking. But I tentatively conclude that SPIR-V >>>> does *not* provide the required level of memory safety out of the box, >>>> not even in Logical Addressing Mode. >>>> >>>> Good catch for the OpImage* and OpVector*Dynamic. All of the ops >>> mentioned above are not safe because they access memory at a dynamic >>> offset. The capabilities they provide are critical for a GPU language: only >>> being able to access images and buffers at compile-time constant offsets >>> would be too restrictive. In that respect SPIR-V is no worse than any other >>> language we could choose, and has the nice property that in the logical >>> addressing mode, only a dozen ops might be unsafe. >>> >> >> The arguments to the image texel fetch operations would need to be >> clamped inside the shader, in the same way that buffer accesses might need >> to be, depending on the guarantees that the underlying APIs provide, and >> the decisions that are made about testability of WebGPU's memory safety. If >> for example WebGPU's semantics are defined as yielding either zero, or the >> value fetched at the clamped indices, for any out-of-range accesses against >> a buffer or image, and an underlying API like Direct3D 12 already provides >> those semantics, no additional bounds checks would need to be added to the >> compiled shader. If on the other hand the underlying API provides no >> guarantees, then the shader would need to be transformed during loading to >> fetch the size of the image using OpImageQuerySize and clamp the indices to >> that range. >> >> Note also that the typical way of sampling a texture is via >> OpImageSample*, which is guaranteed to be safe. >> >> >> >>> Therefore, as far as I can tell, my argument still stands. I believe we >>>> would need to significantly modify the SPIR-V format and SPIR-V >>>> implementations to achieve full memory safety. That's not to say it can't >>>> be done, but I don't think we should expect the resulting format to work >>>> with stock SPIR-V tools. >>>> >>> >> I disagree with this conclusion. It might be necessary to transform >> incoming shaders to insert clamping of some indices, or perhaps another >> approach would be needed. I disagree firmly with your conclusion that this >> would require significant modification of the SPIR-V format and toolchain >> to achieve. Apple contributed similar code to ANGLE's shader translator! >> See >> https://chromium.googlesource.com/angle/angle/+/master/src/third_party/compiler/ArrayBoundsClamper.cpp >> . This was a solid piece of work but didn't involve *any* modification of >> the GLES 1.00 spec which WebGL 1.0 shaders use essentially verbatim. It >> seems likely to me that the same properties would be achievable if SPIR-V >> were the shader format ingested by WebGPU implementations. >> >> Why don't we work on enumerating and documenting the security issues for >> shading languages/IRs that have been uncovered to date, like needing secure >> reads and writes for kernels' input and output buffers and texel fetches? >> Focusing on dismissing a potentially viable representation like SPIR-V at >> this point is like throwing the baby out with the bath water. That spec has >> had a ton of thought and experience put into it (as have DXIL and MSL) and >> aiming to dismiss it out of hand is counterproductive to the entire >> discussion. >> >> -Ken >> >> >> >>>> Regards, >>>> Maciej >>>> >>>> >>>> [1] I'm not absolutely sure that OpInBoundsAccessChain is >>>> bounds-checked because the definition is vague. It says: "Has the same >>>> semantics as OpAccessChain, with the addition that the resulting pointer is >>>> known to point within the base object." Does "known to" mean that the >>>> implementation bounds-checks and clamps the pointer to be within bounds >>>> (presumably with enough space left for the pointed-to type)? Or does it >>>> mean that the caller of this opcode promises that the pointer is within >>>> bounds? I'm assuming the former but it's not entirely clear. >>>> >>>> >>>> On Aug 15, 2017, at 2:41 PM, Maciej Stachowiak <mjs@apple.com> wrote: >>>> >>>> >>>> >>>> On Aug 15, 2017, at 2:29 PM, Kenneth Russell <kbr@google.com> wrote: >>>> >>>> On Tue, Aug 15, 2017 at 1:57 PM, Maciej Stachowiak <mjs@apple.com> >>>> wrote: >>>> >>>>> >>>>> Let's say we started with SPIR-V to make WebSPIR-V with at least the >>>>> following changes: >>>>> >>>>> - Some different APIs and interfaces are exposed than you would expect >>>>> in a WebGL or Vulkan shader. >>>>> - At compile-time we generate code with runtime bounds checks and >>>>> other safety features. >>>>> - We possibly subset SPIR-V to remove dangerous capabilities that >>>>> can't be effectively guarded with bounds checks or other runtime checks. >>>>> >>>>> If we did all these things, then it seems to me we wouldn't benefit >>>>> from the existing SPIR-V ecosystem: >>>>> - Existing SPIR-V shaders could not be reused with WebGPU >>>>> - Front ends that compile to SPIR-V could not be used unmodified >>>>> - Existing SPIR-V back ends could not correctly process the modified >>>>> language, since they wouldn't have the ability to do runtime checks.[1] >>>>> >>>>> Is my analysis correct? If so, then what is the benefit of basing on >>>>> SPIR-V? >>>>> >>>> >>>> This analysis is incorrect and oversimplifies the issues in order to >>>> cast SPIR-V in a negative light. >>>> >>>> The problematic areas are mainly in bounds-checking accesses in the >>>> input and output buffers to compute shaders. Pipeline stages like vertex, >>>> geometry, and fragment shaders are well and securely handled by the >>>> capabilities SPIR-V exposes, since it was designed as a shading language >>>> for graphics APIs. >>>> >>>> >>>> My point isn't to make SPIR-V look bad. I don't believe there's any >>>> shader language that meets all our requirements as-is. >>>> >>>> My intent is only to argue that it shouldn't have a leg up based on >>>> ecosystem considerations, because we'd likely have to modify it enough that >>>> the existing SPIR-V ecosystem won't be compatible with the end result. >>>> Despite your disagreement, I still think that's probably true. >>>> >>>> I'm not clear on how SPIR-V avoids having a bounds checking problem for >>>> shader types other than compute shaders. Are they unable to use the >>>> non-bounds-checked operations that compute shaders can use? Do they only >>>> get access to fixed-size arrays? Do they only get range-checked buffer >>>> operations? To be clear, I would consider anything that can be used to read >>>> or write arbitrary locations in video memory to be a security risk. Even if >>>> it's limited to the video memory of a specific process. >>>> >>>> If I'm missing something that makes this a non-issue, please explain. >>>> >>>> Instead of attempting to dismiss one particular solution a priori, what >>>> should be done is to enumerate the needed capabilities of the various kinds >>>> of shaders in WebGPU, and see what would need to be done to the starting >>>> point (Metal Shading Language, WebAssembly, DXIL, SPIR-V, a higher-level >>>> language, something brand new...) in order to have it be secure and >>>> performant. There are multiple interesting analyses, designs and >>>> comparisons to be done. >>>> >>>> >>>> I agree that this should be the approach. I did not mean to dismiss >>>> SPIR-V, merely to argue that it should not be considered to have any >>>> particular advantage over the other options. Except perhaps on intrinsic >>>> technical merit, which I think is the point of the overall analysis. It >>>> might be that a modified version of SPIR-V is sound technical choice >>>> whether or not it interoperates with regular SPIR-V. >>>> >>>> Regards, >>>> Maciej >>>> >>>> >>>> >>>
Received on Wednesday, 16 August 2017 19:25:24 UTC