Re: Notes about Secure HLSL from Maciej Stachowiak on 2017-12-12 (public-gpu@w3.org from December 2017)

From: Maciej Stachowiak <mjs@apple.com>
Date: Tue, 12 Dec 2017 14:47:34 -0800
To: David Neto <dneto@google.com>
Cc: Corentin Wallez <cwallez@google.com>, Gregg Tavares <w3c@greggman.com>, "Myles C. Maxfield" <mmaxfield@apple.com>, Dzmitry Malyshau <dmalyshau@mozilla.com>, public-gpu <public-gpu@w3.org>
Message-id: <8B242BAC-B393-4D71-849D-993D96BFCACA@apple.com>
> On Dec 12, 2017, at 2:34 PM, David Neto <dneto@google.com> wrote:
> 
> 
> 
> On Tue, Dec 12, 2017 at 5:05 PM Maciej Stachowiak <mjs@apple.com> wrote:
> 
>> On Dec 12, 2017, at 1:42 PM, David Neto <dneto@google.com> wrote:
>> 
>> Myles,
>> 
>> Thanks for posting something concrete we can talk about.
>> I've pasted my notes from a first quick pass over the doc.  I'm not looking for specific replies, but offer it for your consideration.
>> 
>> Overall I find it underspecified, especially among the many new features, including how new features interact with each other.  Separately, some things will incur a high cost.  Some requirements appear to be unimplementable on GLSL or Vulkan (even extended with variable-pointers).
>> 
> 
> Thanks for the review. We obviously need to ultimately produce a much more rigorous spec; this document is just an explanatory overview.
> 
> Understood.  And now you have my first level of feedback on that later spec.

Much appreciated.

> 
> Can you be specific about which requirements you think are unimplementable in Vulkan (even extended with variable-pointers)? The only concrete issue I see below is SPIR-V.
> 
> Even in the presence of the variable-pointers extension, I think the trap-like behaviour for dereferencing a null pointer is impossible to implement in Vulkan:
>  - You can't compare pointers.  So no comparison vs. null
>  - You can't dereference a null pointer:  Undefined behaviour results, so you can't trap.
>  - So there's no way to implement the "write zeros to output and exit now" behaviour.
> Basically, null pointers are placeholder values, but you can't detect them.  You can only forget about them by using a pointer to a real object instead.

Are issues surrounding null pointers the only thing you are concerned about? Or are there other problems you foresee?

> 
> We have built a compiler for Secure HLSL Logical Mode that targets SPIR-V Logical Mode, so we have good evidence that at least that subset can target Vulkan. I am guessing it must be about issues that only apply when the Logical Mode restrictions are not in effect. 
> 
> Verifying compilers is hard.  :-)
> 
> It's hard to say if your compiler is correct.  It's less hard to say if the output is valid SPIR-V for Vulkan (with or without extensions), and even then the current validator is incomplete.  So you could be generating code which is invalid by the target spec but not yet caught by any independent validator.  And if you are successfully executing the target code on multiple implementations, it could still be invalid.
> 
> This issue isn't specific to your proposal, it's a general point.  So I'm not asking for anything at this time.  :-)
> 
> cheers,
> david
> 
>  
> 
>> thanks
>> david
>> 
>> -----
>> 
>> Variables:
>>   Where can you define a variable?
>>   Can you reference a variable before its declaration:
>>      {   int y = x;  
>>          int x;  
>>          int z = y;  }  // Require that z = 0?
>> 
>> Initializers: 
>>  - Allow interdependence?
>>  - Initializers are constants?
>>    - Want to forbid calling a function in an initializer. (No constexpr)
>>  - Should say: re-initialized each time execution enters the scope
>> 
>> Share storage between different shader stages?
>>  - Haven't defined the lifetime of the program w.r.t. shader stages or the pipeline
>>  - Possible leakage of values across thread-groups
>>  - Possible leakage of values across shader stages
>> 
>> Output variables: are they in thread memory space? (I assume so.)
>> 
>> Pointers: What operations are permitted?  Better to have a whitelist
>>  - E.g. Dereference for load and store
>> 	- But not for null
>>  - Others: access-chain, "slide", compare?
>>  - Basically, need to define the algebra for pointers
>> 
>> Array reference:
>>  - You introduced with full generality, and then restricted later.
>>    - Not knowing which underlying array
>>      - Requires variable-pointers extension, and only works with storage-buffer
>>        or workgroup-shared (depending on the exact capability)
>> 
>>  - various conversion operations on array references are not permitted in GLSL
>>   without extension
>> 
>>  - @ operator on a "value".   ?? Did you mean variable?
>>     Otherwise you're forcing creation of a new storage location.
>> 
>> "Safe" pointer: 
>>  - Indexing into array reference can result in a pointer pointing at invalid
>>   storage. So you can't say it's always either null or pointing at a valid object
>> 
>> Out of bounds accesses:
>>  - Requirement that all previous writes must complete: Therefore any possibly
>>   out-of-bounds operations can't be combined or reordered.  That's a big
>>   hindrance to optimization, e.g. vectorization.
>> 
>> What use is a null pointer?
>>  - You can't access it, you can't compare it.
>>    - Sketch: Possible translation for "null" is a pointer to a phantom variable, per type.  Except even then you can't detect if it's that special pointer, so I don't know how to enforce the requested early termination.
>>  - But with null in the language at least you get known result, i.e. requested early termination.
>> 
>>  - What is the type of null?
>>  - I assume this language is statically typed.
>> 
>> "Semantic errors inside generic functions show up once regardless of the number of times the generic function is instantiated."
>> - That's a tooling concern, not a language concern.
>> 
>> Generics:
>>  Example   T identity<T>(T value)....
>>     What are the lexical / symbol constraints on T.  Can't be another keyword,
>>   already defined symbol, ..... ?
>> 
>> Constant expressions passed as type arguments:
>>  "Only literals and references to other constant parameters qualify."
>>   - Which others?  Earlier parameters?  Apparent circularity here.
>> 
>> What do protocols contain?
>>   - Just method declarations?
>>   - Member declarations?
>> 
>> Operator overloading:
>>   - Casting syntax given as "type(value)"
>>     What about arrays.  Array references.
>>   - Are protocols types?
>> 
>> How do templates mix with protocols?
>> 
>> Operator overloading:
>>   - If  prefix increment is same as postfix increment, just ban one of them.
>>   - The example for ++ for int is apparently wrong: does not modify the value.
>> 
>> Default values:
>>   - Do functions have types?
>> 
>> Compound overloads such as   :
>>   foo.doubleValue *= 2  ; must specify order of evaluation
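To make the order-of-evaluation point concrete, here is a small sketch in Python rather than Secure HLSL. The `Foo` class, its `double_value` property, and the `log` list are hypothetical names invented for illustration. It only shows the order Python happens to choose for a compound assignment through a getter/setter pair (read, compute, write); the Secure HLSL document would have to state its own rule.

```python
class Foo:
    """Hypothetical type with a computed member, mimicking foo.doubleValue."""

    def __init__(self):
        self._value = 3.0
        self.log = []  # records the order in which accessors run

    @property
    def double_value(self):
        self.log.append("get")
        return self._value * 2

    @double_value.setter
    def double_value(self, new_double):
        self.log.append("set")
        self._value = new_double / 2


foo = Foo()
# Python evaluates this as: tmp = getter(); tmp = tmp * 2; setter(tmp)
foo.double_value *= 2
```

After the compound assignment, `foo.log` holds the getter call followed by the setter call. A language that allowed a different order, or more than one getter call, could produce observably different results once the accessors have side effects.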
>> 
>> Function overload resolution:
>>  - Are the usual arithmetic conversions performed?  And how do they figure into
>>   ranks of specificity?
>>    By usual arithmetic conversions, foo(1u) resolves to both 
>>      foo<T>(T) and foo(float) (The latter by a single step in usual arith conversion lattice.)
>>    
>>  - What is the type of "1"
>>    If all you have is foo(uint), does foo(1) compile?  Why?
>> 
>>  - "specificity rank" is a partial order.  May as well say so.
>> 
>>  - With the single-most-specific overload requirement, it's possible to break
>>   code by introducing a new type or protocol.  Very non-local effect.  I think
>>   that could be ok.
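The partial-order remark and the non-local breakage can be sketched as follows. This is Python, not Secure HLSL, and the `MORE_SPECIFIC` relation is an assumption made up for illustration (a concrete parameter type beats the generic `T`, while distinct concrete types are incomparable); the actual ranking rules are whatever the eventual spec says.

```python
# Hypothetical "strictly more specific than" pairs between parameter types.
# Any pair not listed here (and not equal) is treated as incomparable,
# which is what makes the relation a partial order rather than a total one.
MORE_SPECIFIC = {("uint", "T"), ("int", "T"), ("float", "T")}


def at_least_as_specific(a, b):
    return a == b or (a, b) in MORE_SPECIFIC


def resolve(applicable):
    """Return the single most specific applicable overload, or None if ambiguous.

    `applicable` lists the parameter type of each overload that accepts the call.
    """
    best = [c for c in applicable
            if all(at_least_as_specific(c, other) for other in applicable)]
    return best[0] if len(best) == 1 else None


before = resolve(["T", "float"])          # foo<T>(T) and foo(float)
after = resolve(["T", "float", "uint"])   # someone later adds foo(uint)
```

With only `foo<T>(T)` and `foo(float)` applicable, `foo(float)` wins. Adding an applicable `foo(uint)` elsewhere leaves no single most specific candidate, so a call that used to compile becomes ambiguous.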
>> 
>> Concurrency:
>>  - Are there storage images in WebGPU? (Question for the community group.)
>>  - This section is thin.
>>  - "interleaved or concurrent"
>>    Interleaving = sequential consistency, it seems. That's super-strong.
>>    "Concurrent" is underspecified.  Defined behaviour in all cases for all writes?
>>       In real life, likely to have unspecified behaviour.
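To show how strong the "interleaved" reading is, here is a Python sketch that enumerates every interleaving of two tiny threads and collects the possible results. The two-thread program (the classic store-buffering shape) and all names are assumptions for illustration, not taken from the Secure HLSL document.

```python
def interleavings(a, b):
    """Yield every merge of a and b that preserves each thread's program order."""
    if not a or not b:
        yield list(a) + list(b)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Execute one interleaving against a single shared memory."""
    mem = {"x": 0, "y": 0}
    regs = {}
    for op, target, source in schedule:
        if op == "store":
            mem[target] = source        # source is a constant
        else:                           # "load"
            regs[target] = mem[source]  # source is a memory location
    return regs["r1"], regs["r2"]


thread_a = [("store", "x", 1), ("load", "r1", "y")]
thread_b = [("store", "y", 1), ("load", "r2", "x")]

outcomes = {run(s) for s in interleavings(thread_a, thread_b)}
```

Under pure interleaving, `(r1, r2) == (0, 0)` never appears in `outcomes`. Weaker memory models can allow that result, so promising interleaving semantics for all writes is a much stronger guarantee than implementations typically give for free.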
>> 
>> Logical mode:
>>   - Pointers may never be assigned to.  Is parameter passing different or the same?
>> 
>>   - Ternary expressions:  This is the first and only mention in the doc that this
>>    exists.
>> 
>> Denorm flushing:  That's expensive to guarantee after every op.
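For a sense of what "after every op" would mean, here is a Python sketch of flushing a single 32-bit float to zero by inspecting its bits. The function name `flush_denorm_f32` is hypothetical, and this is only a model of the semantics, not how a shader compiler would emit it. Guaranteeing the behaviour would mean applying the equivalent of this check to the result of each floating-point operation wherever the hardware does not already flush.

```python
import struct


def flush_denorm_f32(x):
    """Return x rounded to float32, with denormal values replaced by signed zero."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    exponent = (bits >> 23) & 0xFF
    mantissa = bits & 0x7FFFFF
    if exponent == 0 and mantissa != 0:
        bits &= 0x80000000  # keep only the sign bit
    return struct.unpack("<f", struct.pack("<I", bits))[0]


tiny = flush_denorm_f32(1e-40)   # below the smallest normal float32
normal = flush_denorm_f32(1.5)   # ordinary value, left untouched
```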
>> 
>> Divergent control flow:  Haven't defined it, nor when reconvergence can occur.
>> 
>> Nothing about textures or images: sampled images or storage images
>>  
>> 
>> On Mon, Dec 11, 2017 at 3:02 PM Corentin Wallez <cwallez@google.com> wrote:
>> Thanks Myles for the document, here are my notes too, both on the document and some of the discussions on this thread.
>> 
>> ## Preprocessor and multiple entry-points
>> 
>> Not having #includes seems ok because you can do it in JavaScript easily. I'd argue that for the same reasons the rest of the preprocessor isn't needed either, in particular since there isn't a good spec for how the C preprocessor works, which would be a source of incompatibilities.
>> 
>> Having multiple entry-points of different stages sounds ok, and is supported in SPIR-V.
>> 
>> My suggestion for code reuse is to expose linking at the API level. Some modules could be pure libraries and contain no entry-points, while others contain entry-points but leave some declared symbols unimplemented. Module linking could produce fully implemented modules with entry-points which can be used for pipeline creation. Such linking would allow factoring of the validation and translation cost too, and maybe some of the native shader compilation cost.
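The linking idea can be sketched roughly as below. This is Python, not a proposed WebGPU API; the `Module` class, `link`, and `usable_for_pipeline` are hypothetical names, and a module is reduced to three sets of symbol names to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """Hypothetical shader module reduced to its symbol tables."""
    defined: set = field(default_factory=set)       # symbols with a body
    undefined: set = field(default_factory=set)     # declared but unimplemented
    entry_points: set = field(default_factory=set)  # subset of defined


def link(*modules):
    """Combine modules, resolving declarations against definitions."""
    defined, needed, entries = set(), set(), set()
    for m in modules:
        duplicates = defined & m.defined
        if duplicates:
            raise ValueError(f"duplicate definitions: {sorted(duplicates)}")
        defined |= m.defined
        needed |= m.undefined
        entries |= m.entry_points
    return Module(defined, needed - defined, entries)


def usable_for_pipeline(m):
    """Only fully implemented modules with an entry-point qualify."""
    return bool(m.entry_points) and not m.undefined


library = Module(defined={"lighting"})
shader = Module(defined={"fs_main"}, undefined={"lighting"},
                entry_points={"fs_main"})
linked = link(library, shader)
```

Here neither `library` (no entry-point) nor `shader` (unresolved `lighting`) can create a pipeline alone, but `linked` can. Validation of `library` could in principle be done once and reused across every module it is linked into.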
>> 
>> ## Builtin types
>> 
>> Why remove the [RW]ByteAddressBuffer? It is important to allow heterogeneous data in buffers. It could be done with StructuredBuffer<uint32_t> and casts but that's not ideal imho.
>> 
>> ## Variables
>> 
>> Having things "as-if" local variables have a global lifetime sounds ok. It corresponds to how things work in native APIs where program-global register allocation happens, and allows taking pointers to the variables (which would then have their lifetime extended to that of the pointer).
>> 
>> ## Safe pointers and array references.
>> 
>> The SPIR-V logical addressing mode doesn't allow null pointers, how would this feature be translated? I suggest that pointers should always be initialized and "null" doesn't exist.
>> 
>> Array-references are syntactic sugar around global arrays so I don't think they are needed. Also there's a way to implement them for "thread" and "threadgroup" address spaces (the only spaces they can be used with in your proposal) such that they are assignable and translate correctly to SPIR-V.
>> 
>> It is unlikely that we can require KHR_variable_pointer anytime soon, so the content of the "Logical Mode" could be merged in the pointer and array-ref sections and treated as a hard constraint.
>> 
>> ## Out Of Bounds accesses
>> 
>> The trapping behavior described sounds extremely expensive to implement and doesn't even match with the "discard" fragment shader operation. Penalizing correct shaders to implement the trapping doesn't sound good and imo this is a place where having a "one of the following happens" statement would be ok. Same thing for NaN propagation and denorm floats.
>> 
>> ## Syntax features
>> 
>> No comment on the design of these. Their complexity reinforces my concerns about having incompatible implementations.
>> 
>> ## High-level takeaway
>> 
>> The document presents an HLSL++ that removes some of the ancient stuff of HLSL and adds both pure-syntax features and a safe-pointer feature. There's a small incompatibility with SPIR-V (null pointers) that would be easy to resolve.
>> 
>> Out of all the additional features, only the safe pointers are compelling since they actually expose more features of the underlying platform. All the others increase the complexity of the language compared to HLSL resulting in even more interoperability concerns.
Received on Tuesday, 12 December 2017 22:48:05 UTC