Re: WebCL validator

The validator shows what needs to be validated; it shouldn't be thought of as the optimal way of doing it.

Indeed, buffer range checks and zero initialization are critical: one cannot expect drivers or hardware to do these checks, and without them a bad access can crash machines, often badly. Some of these checks must also be done at runtime, and unfortunately that carries a penalty. Again, beware that the implementation isn't optimal; the penalty should be much lower with production-quality code.

--mike

________________________________
From: Corentin Wallez <cwallez@google.com>
Sent: Friday, August 25, 2017 8:46:17 AM
To: Kenneth Russell
Cc: public-gpu@w3.org
Subject: Re: WebCL validator

Thanks Ken for the pointers! (not a pun)

It is interesting that the validator chose not to use fat pointers (as in {min, max, current}). Instead it passes a structure containing min / max for all buffers and clamps each memory access. It seems to rely on statically knowing which buffer each pointer comes from, and it is unclear how pointer selection is handled.
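
To make the contrast concrete, here is a minimal Python sketch of the two schemes. It is a toy model and not the validator's actual code: memory is a flat list, pointers are indices into it, and the names FatPointer, BufferLimits and clamped_load are made up for illustration. Returning zero for an out-of-range fat-pointer load is an assumption of the sketch.

```python
from dataclasses import dataclass

# Toy model: device memory is one flat list and "pointers" are indices into it.
# All names here are hypothetical, chosen for illustration only.
MEMORY = [0] * 16


@dataclass
class FatPointer:
    """A pointer that carries its own bounds, as in {min, max, current}."""
    lo: int       # first valid index (the "min" field)
    hi: int       # one past the last valid index (the "max" field)
    current: int  # index the pointer refers to right now

    def load(self):
        # The check needs no outside knowledge about the buffer.
        if not (self.lo <= self.current < self.hi):
            return 0  # assumption: out-of-range reads yield zero
        return MEMORY[self.current]


@dataclass
class BufferLimits:
    """Table of [lo, hi) index ranges, one entry per buffer."""
    ranges: list


def clamped_load(limits, buffer_id, index):
    # The caller must already know which buffer the index belongs to.
    lo, hi = limits.ranges[buffer_id]
    clamped = max(lo, min(index, hi - 1))  # force the index into range
    return MEMORY[clamped]
```

In the second scheme the buffer_id argument is the piece that has to be known statically, which is why a pointer that may refer to one of several buffers is the hard case.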

The 1.5x - 3x slowdown is a bit worrying; we should avoid that cost if possible.

Corentin

On Thu, Aug 24, 2017 at 2:23 PM, Kenneth Russell <kbr@google.com> wrote:
A few years ago the WebCL working group created a validator which ensured that OpenCL kernels loaded into browsers couldn't access memory they weren't supposed to. The project was open-source and is available here:

https://github.com/KhronosGroup/webcl-validator

There's some interesting information about the overall approach in the design writeup:

https://github.com/KhronosGroup/webcl-validator/blob/master/DESIGN.txt

Here are some kernels before and after transformation by the validator:

http://wolfviking0.github.io/webcl-translator/

There's also a nice presentation on the validator with some examples, including timing measurements:

http://learningwebcl.com/2013/11/webcl-kernel-validator-explained/
http://learningwebcl.com/wp-content/uploads/2013/11/WebCLMemoryProtection.pdf

I remembered the overhead incorrectly. The 10% number I mentioned on yesterday's conference call was the cost of zero-initializing all buffers in WebCL, which was a different project than this one. The validator has significantly more overhead; kernels run on the order of 1.5x - 3x slower, because every pointer access has to be range-checked against multiple memory regions.
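
As a rough illustration of where that cost comes from, here is a toy Python sketch (hypothetical names, not the validator's code) of a load whose pointer could refer to any of several regions, so each access is compared against every candidate range. The zero fallback for a failed check is an assumption of the sketch.

```python
def checked_load(memory, regions, address):
    """Return memory[address] if address lies in any allowed region, else 0.

    regions is a list of (lo, hi) pairs, each covering indices lo..hi-1.
    """
    for lo, hi in regions:  # one pair of comparisons per candidate region
        if lo <= address < hi:
            return memory[address]
    return 0  # assumption: a rejected access reads as zero
```

The number of comparisons per access grows with the number of regions the pointer might belong to, which is the part that tighter constraints on kernels could shrink.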

Let's study and learn from this previous work. WebGPU should ideally enforce constraints on its compute kernels to reduce the overhead of any run-time checks that are needed.

-Ken

Received on Friday, 25 August 2017 16:27:33 UTC