Re: WHLSL Compatibility with HLSL from Dzmitry Malyshau on 2018-11-28 (public-gpu@w3.org from November 2018)

From: Dzmitry Malyshau <dmalyshau@mozilla.com>
Date: Tue, 27 Nov 2018 22:16:00 -0500
To: "Myles C. Maxfield" <mmaxfield@apple.com>
Cc: public-gpu@w3.org
Message-ID: <19a17656-d075-c5ec-e081-388c09ff88b5@mozilla.com>
Myles,

Apologies, the HLSL semantics link/point is understood now. Please 
disregard my answer about this.

-Dzmitry

On 11/27/18 10:12 PM, Dzmitry Malyshau wrote:
>
> Hi Myles,
>
>
>>>
>>> Hi Myles,
>>>
>>> Thank you for sharing!
>>>
>>> Reading about the vast field of syntax features (preprocessor, array 
>>> syntax, call syntax, float literals, casts, etc.) that raise 
>>> questions but ultimately do not need special handling by 
>>> WebGPU implementations only strengthens the point that a 
>>> human-writable format has different requirements from an 
>>> implementation-digestible one,
>>>
>> I don’t understand this argument. SPIR-V has array syntax, float 
>> literals, and casts too.
>
>
> Well, the problem with HLSL is that there are different array syntaxes 
> possible, e.g. "float myArray[40]" versus "float[40] myArray", while 
> SPIR-V has one and only one syntax. Same for casts.
>
> For float literals, in SPIR-V there isn't a question whether exponent 
> notation needs to be supported.
>
>
>>> and one way or another we are approaching the point where some build 
>>> step is required before the actual authored shader source gets to 
>>> the backend.
>>>
>> We’re modifying WHLSL to accept existing HLSL programs. We’re not 
>> expecting web authors to run a build step to produce WHLSL from their 
>> HLSL source.
>
>
> On the last call we were discussing which side needs to run the 
> preprocessor. It's effectively a build step.
>
>
>>> > It appears that HLSL allows any arbitrary semantic for stage 
>>> in/out parameters. Around 30% of the corpus uses a semantic that 
>>> WHLSL doesn’t currently accept.
>>>
>>> In HLSL the semantic name is basically the "port to the outside". If 
>>> it's built-in, then the "outside" is the graphics/compute pipeline, 
>>> if it's user-defined, then it's the user code that can query the 
>>> names. I didn't realize this would be different for WHLSL.
>>>
>>>
>>
>> I don’t quite understand what you are describing. In any modern 3D 
>> graphics API, there is linkage between the graphics API and the 
>> inputs/outputs (MSL calls these [[ attribute(n) ]]) and between the 
>> vertex & fragment shaders (MSL calls these [[ user(n) ]]). When 
>> researching HLSL, I thought the list in the docs 
>> <https://docs.microsoft.com/en-us/windows/desktop/direct3dhlsl/dx-graphics-hlsl-semantics> was 
>> exhaustive, but it looks like it isn’t.
>
>
> This link is for DX9 semantics, which are long deprecated. In 
> non-compat (in the sense of DX9 compatibility) HLSL all the built-in 
> semantics start with "SV_", and everything else is user defined. Does 
> WHLSL not have user-defined semantics?
>
>
>>> > There is a whole collection of variable modifiers that HLSL 
>>> sources use that WHLSL doesn’t accept. E.g. row_major float4x4 
>>> mvpMatrix;
>>>
>>> Is this on your radar?
>>>
>> Not quite sure what this question means. Yes?
>>>
>>> > HLSL has some hints about how the compiler should treat branches 
>>> and loops. They look like [ unroll ] for (…) and [ branch ] if ( …). 
>>> I don’t think these have semantic meaning so we can probably just 
>>> swallow them.
>>>
>>> I vaguely recall this affecting ERR_GRADIENT_FLOW errors (aka 
>>> "Gradient operations can't occur inside loops with divergent flow 
>>> control."). So it may be more than just a hint.
>>>
>> Thanks for the tip; we’ll investigate this.
>>>
>>> Cheers,
>>>
>>> Dzmitry
>>>
>>> On 11/27/18 5:29 PM, Myles C. Maxfield wrote:
>>>> Over the past few days, I’ve collected a large corpus of HLSL files 
>>>> so we can determine what we need to do to be source-compatible with 
>>>> existing HLSL source.
>>>>
>>>> *The Corpus*
>>>>
>>>> I wrote a GitHub crawler which looked for repositories that had 
>>>> many HLSL files in them. I looked over the results of this crawler 
>>>> and hand-picked a few repositories that are from respectable 
>>>> sources. In total, we ended up with 2099 HLSL files.
>>>>
>>>> The list of repositories:
>>>>
>>>>   * Microsoft/DirectX-Graphics-Samples
>>>>   * vvvv/vvvv-sdk
>>>>       o Of limited use, because most of the source is written in
>>>>         another language (Effects) which includes HLSL snippets.
>>>>         GitHub classifies this as HLSL.
>>>>   * Unity-Technologies/ScriptableRenderPipeline
>>>>       o Of limited use, because most of the source is written in
>>>>         another language (Shaderlab) which includes HLSL snippets.
>>>>         GitHub classifies this as HLSL.
>>>>   * Microsoft/Windows-universal-samples
>>>>   * OGRECave/ogre
>>>>   * EpicGames/UnrealEngine
>>>>       o Of limited use, because most of the source is written in
>>>>         another language (Unreal Shader Format) which includes HLSL
>>>>         snippets. GitHub classifies this as HLSL.
>>>>   * ConfettiFX/The-Forge
>>>>   * AtomicGameEngine/AtomicGameEngine
>>>>   * NVIDIAGameWorks/D3DSamples
>>>>   * EpicGames/UnrealTournament
>>>>       o Of limited use, because most of the source is written in
>>>>         another language (Unreal Shader Format) which includes HLSL
>>>>         snippets. GitHub classifies this as HLSL.
>>>>   * urho3d/Urho3D
>>>>   * NVIDIAGameWorks/HairWorks
>>>>   * NVIDIAGameWorks/WaveWorks
>>>>       o Of limited use, because most of the source is written in
>>>>         another language (Effects) which includes HLSL snippets.
>>>>         GitHub classifies this as HLSL.
>>>>   * NVIDIAGameWorks/FleX
>>>>   * Unity-Technologies/PostProcessing
>>>>       o Of limited use, because most of the source is written in
>>>>         another language (Shaderlab) which includes HLSL snippets.
>>>>         GitHub classifies this as HLSL.
>>>>   * NVIDIAGameWorks/Falcor
>>>>   * NVIDIAGameWorks/FaceWorks
>>>>   * NVIDIAGameWorks/HBAOPlus
>>>>   * GPUOpen-LibrariesAndSDKs/GPUParticles11
>>>>   * NVIDIAGameWorks/VolumetricLighting
>>>>   * GPUOpen-Effects/ShadowFX
>>>>   * GPUOpen-LibrariesAndSDKs/LiquidVR
>>>>   * NVIDIAGameWorks/NvCloth
>>>>   * GPUOpen-Effects/DepthOfFieldFX
>>>>   * NVIDIAGameWorks/PhysX-3.4
>>>>   * GPUOpen-Effects/GeometryFX
>>>>   * GPUOpen-LibrariesAndSDKs/TiledLighting11
>>>>   * NVIDIAGameWorks/Flow
>>>>   * GPUOpen-LibrariesAndSDKs/ForwardPlus11
>>>>   * Microsoft/Win2D
>>>>   * GPUOpen-LibrariesAndSDKs/SSAA11
>>>>   * Microsoft/Win2D-Samples
>>>>   * PixarAnimationStudios/OpenSubdiv
>>>>
>>>> We could potentially figure out how to compile Effects, Shaderlab 
>>>> and Unreal Shader Format to HLSL (because that’s what their engines 
>>>> do). If we did this, we could grow the corpus by 13% + 8% + 15% 
>>>> (respectively) = 36%. I didn’t want to get bogged down doing this, 
>>>> though.
>>>>
>>>> *Preprocessor*
>>>>
>>>> HLSL source files make heavy use of the preprocessor. Each file 
>>>> includes an average of 9.61 uses of the preprocessor (lines that 
>>>> begin with “#”) and the preprocessor is used on average every 11.68 
>>>> lines.
>>>>
>>>> <Screen Shot 2018-11-13 at 1.03.53 PM.jpeg>
>>>>
>>>> As you can see above, most uses of the preprocessor are not 
>>>> to include files, but to enable / disable features. 
>>>> Therefore, this is a situation where compatibility with existing 
>>>> HLSL source is directly in conflict with simplicity of the language.
>>>>
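The per-file figures quoted above (directive count and lines per directive) could be gathered with something like the following. This is only a minimal sketch, not the tool used for the corpus: the function name `preprocessor_stats` and the sample shader are made up for illustration, and it assumes a directive is any line whose first non-blank character is "#" followed by a name.

```python
import re
from collections import Counter

# Matches a directive line such as '#include "x"' or '  # define FOO 1'.
# Assumption: leading whitespace before '#' is allowed.
DIRECTIVE = re.compile(r"^\s*#\s*([A-Za-z_]+)")


def preprocessor_stats(source: str):
    """Return (directive_count, lines_per_directive, counts_by_directive_name)."""
    lines = source.splitlines()
    kinds = Counter()
    for line in lines:
        match = DIRECTIVE.match(line)
        if match:
            kinds[match.group(1)] += 1
    total = sum(kinds.values())
    # Lines per directive; infinite when the file uses no directives at all.
    density = len(lines) / total if total else float("inf")
    return total, density, kinds


# Hypothetical shader used only to exercise the function.
sample = """#include "common.hlsl"
#define USE_SHADOWS 1
float4 main() : SV_Target
{
#if USE_SHADOWS
    return float4(0, 0, 0, 1);
#else
    return float4(1, 1, 1, 1);
#endif
}
"""

total, density, kinds = preprocessor_stats(sample)
print(total, density, dict(kinds))
```

Grouping the counts by directive name is what separates file inclusion (`include`) from feature toggles (`if`, `ifdef`, `define`, and so on).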
>>>> I proceeded by running the corpus through the Microsoft HLSL 
>>>> preprocessor, and investigated the preprocessed files. My analysis 
>>>> is just based on the parsing stage of the language, not name 
>>>> resolution or type checking. Out-of-the-box, we parse 5.9% of the 
>>>> corpus.
>>>>
>>>> *Language Features*
>>>>
>>>> From investigating the source, I found some language features that 
>>>> HLSL depends on.
>>>>
>>>> In MSL, if you want to pass some data to your shader, you make a 
>>>> struct, and pass a reference to that struct as an argument of the 
>>>> main function. Then, in the main function, you reference the data 
>>>> by saying theReference.field. This approach is possible in HLSL, 
>>>> but there’s another more common way to do it. Instead of making a 
>>>> struct, you make a “cbuffer” which lists a set of fields, but those 
>>>> fields are treated as global variables. The cbuffer is given a 
>>>> “semantic” so the API can attach memory to back the cbuffer.
>>>>
>>>> // The API assigns memory to this block by using the “b0” handle.
>>>> cbuffer Camera : register(b0)
>>>> {
>>>>   float4x4 viewProjection;
>>>>   float4x4 projectionInv;
>>>>   float3 viewPos;
>>>> };
>>>>
>>>> // Assumed placeholder; the original snippet does not define Output.
>>>> struct Output { float4x4 foo; };
>>>>
>>>> Output main() {
>>>>   Output output;
>>>>   // viewProjection, projectionInv, and viewPos are in the global scope.
>>>>   output.foo = viewProjection;
>>>>   return output;
>>>> }
>>>>
>>>> About 1/3 of the files in the corpus use cbuffers.
>>>>
>>>> HLSL has two flavors of global variables:
>>>>
>>>>  1. Resources, like RWTexture2D<float2> dstTexture : register(u0);.
>>>>     These work just like entry point parameters, except they are in
>>>>     the global scope and therefore can be accessed from any
>>>>     function, without passing around a pointer to them.
>>>>  2. Literal data, like static const float convolutionWeights[] =
>>>>     {1, 2, 3};.
>>>>
>>>>
>>>> About 1/5 of the files in the corpus use global variables.
>>>>
>>>> HLSL supports default arguments in function parameters and 
>>>> cbuffers, so you can say void foo(int x = 3);. I would imagine 
>>>> specifying this would be tricky because we have to mention which 
>>>> variables and functions the initial value can refer to.
>>>>
>>>> Many files in the corpus use HLSL’s syntax for sampler literals, 
>>>> but those aren’t supported in SPIR-V, so I think we can safely 
>>>> ignore those. I don’t know what the SPIRV-Cross guys are doing 
>>>> about that.
>>>>
>>>> *New Syntax*
>>>>
>>>> There are lots of changes to the syntax of the language that 
>>>> shouldn’t have much of an effect on the language itself, but are 
>>>> required if we want to claim compatibility with lots of HLSL sources.
>>>>
>>>>   * Removing the entry point keywords (vertex, fragment, compute)
>>>>     is a requirement for any shader to compile. Instead, we should
>>>>     require that compilation of a WHLSL file state which function
>>>>     names are the entry points.
>>>>   * It appears that HLSL allows any arbitrary semantic for stage
>>>>     in/out parameters. Around 30% of the corpus uses a semantic
>>>>     that WHLSL doesn’t currently accept.
>>>>   * Some functions in the HLSL standard library use
>>>>     member-function-syntax, like texture.Sample(sampler, location)
>>>>     instead of Sample(texture, sampler, location).
>>>>   * There is a whole collection of variable modifiers that HLSL
>>>>     sources use that WHLSL doesn’t accept. E.g. row_major float4x4
>>>>     mvpMatrix;
>>>>   * HLSL has a few function modifiers like [ RootSignature(…stuff
>>>>     goes here…) ] void foo(…) { … } that are irrelevant for WebGPU.
>>>>     This includes information about the D3D root signature, but
>>>>     also things like how geometry shaders and tessellation work,
>>>>     which WebGPU doesn’t have.
>>>>   * HLSL arrays put the brackets after the variable name, like
>>>>     float myArray[40];
>>>>   * Arrays and structs can be initialized using braces, like
>>>>     float myArray[3] = { 1.0, 2.0, 3.0 };
>>>>   * HLSL has some hints about how the compiler should treat
>>>>     branches and loops. They look like [ unroll ] for (…) and [
>>>>     branch ] if ( …). I don’t think these have semantic meaning so
>>>>     we can probably just swallow them.
>>>>   * HLSL uses C-style casts instead of C++-style casts. So, we need
>>>>     to support (float)x instead of float(x).
>>>>   * HLSL accepts float literals with exponents, like 1e-3.
>>>>   * Functions can be forward-declared in HLSL.
>>>>
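To make two of the spellings in the list concrete, here is a rough textual rewrite of C-style array declarations and C-style casts. This is not what the thread proposes (the proposal is for the WHLSL parser to accept the HLSL forms directly), and regular expressions are no substitute for a grammar. The target forms `float[40] myArray` and `float(x)` are taken from elsewhere in this thread, while the name `normalize` and the list of cast types are assumptions made for illustration.

```python
import re

# C-style array declaration without an initializer: "float myArray[40];".
# The lookahead keeps statements such as "return a[3];" untouched.
ARRAY_DECL = re.compile(
    r"\b(?!return\b)([A-Za-z_]\w*)\s+([A-Za-z_]\w*)\s*\[(\d*)\]\s*;"
)

# C-style cast applied to a plain identifier: "(float)x".
# Assumption: only these scalar type names are handled.
C_CAST = re.compile(r"\((float|int|uint|bool)\)\s*([A-Za-z_]\w*)")


def normalize(source: str) -> str:
    """Rewrite "T name[N];" to "T[N] name;" and "(T)x" to "T(x)"."""
    source = ARRAY_DECL.sub(r"\1[\3] \2;", source)
    source = C_CAST.sub(r"\1(\2)", source)
    return source


print(normalize("float myArray[40];"))
print(normalize("int y = (int)x;"))
print(normalize("return a[3];"))
```

Declarations with initializers, casts of parenthesized expressions, and multi-dimensional arrays are deliberately left alone here; handling them reliably needs a real parser.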
>>>>
>>>> After doing all that, we get up to around 90% compatibility with 
>>>> parsing (not resolving names nor type checking) the HLSL corpus. 
>>>> The biggest wins are member-function syntax, allowing every 
>>>> semantic name, C-style casting, and C-style array syntax.
>>>>
>>>> —Myles
>>
Received on Wednesday, 28 November 2018 03:16:25 UTC