Re: [whatwg/encoding] New TextDecoder.decode() API needed for Wasm multithreading (#172) from Henri Sivonen on 2019-02-11 (public-webapps-github@w3.org from February 2019)

From: Henri Sivonen <notifications@github.com>
Date: Mon, 11 Feb 2019 00:43:00 -0800
To: whatwg/encoding <encoding@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <whatwg/encoding/issues/172/462251961@github.com>

> but it needs to realize the bytes in the buffer can change at any moment.

So basically all DOM code operating on `SharedArrayBuffer` is in the territory of "we're reasoning about Undefined Behavior"?

In Gecko, `TextDecoder.decode()` can read the same memory location twice without synchronization: first as a wider read for the ASCIIness check and then, if that check fails, as narrower reads. The narrower reads have their own bounds checks. However, if a sufficiently smart compiler figured out that a failed ASCIIness check means the tail loop has to terminate by finding a non-ASCII byte, the compiler could eliminate the bounds check from the tail loop. Then, if the memory changed from non-ASCII to ASCII between the wide read of the ASCIIness check and the narrow reads of the tail loop, the tail loop would read out of bounds...
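To make the two-read pattern concrete, here is a minimal sketch in Rust. It is not Gecko's actual code: the function name `ascii_prefix_len`, the 8-byte chunk width, and the overall structure are assumptions chosen for illustration. It uses a plain `&[u8]`, for which Rust guarantees that nobody mutates the bytes concurrently; the comments mark where the hazard would arise if the bytes instead lived in a `SharedArrayBuffer`.

```rust
/// Hypothetical helper (not Gecko's implementation): returns the length of
/// the leading ASCII run in `bytes`.
fn ascii_prefix_len(bytes: &[u8]) -> usize {
    const WIDE: usize = 8; // assumed chunk width for the wide read
    const HIGH_BITS: u64 = 0x8080_8080_8080_8080;

    let mut i = 0;
    while i + WIDE <= bytes.len() {
        // First read: eight bytes at once, testing every high bit together.
        let mut chunk = [0u8; WIDE];
        chunk.copy_from_slice(&bytes[i..i + WIDE]);
        if u64::from_ne_bytes(chunk) & HIGH_BITS != 0 {
            // The wide check failed: some byte in this chunk is non-ASCII,
            // so fall through to the narrow reads to locate it.
            break;
        }
        i += WIDE;
    }

    // Second read of the same memory, one byte at a time.
    // The `i < bytes.len()` test is the bounds check in question. If a
    // compiler reasoned "the wide check failed, so a non-ASCII byte must
    // stop this loop" and dropped the test, then bytes that changed to
    // ASCII between the two reads would let the loop run past the end.
    while i < bytes.len() && bytes[i] < 0x80 {
        i += 1;
    }
    i
}
```

For example, `ascii_prefix_len(b"ab\xFFcdefgh")` returns 2: the wide read over the first eight bytes sees the `0xFF`, and the narrow loop then stops at index 2. With memory that cannot change, the two reads always agree; the concern is only what happens when they can disagree.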

Are we _really_ relying on the compilers we use to write DOM-side code not optimizing too hard on the assumption that memory doesn't change from underneath us in cases where it would be UB for it to change underneath us?

-- 
https://github.com/whatwg/encoding/issues/172#issuecomment-462251961

Received on Monday, 11 February 2019 08:43:22 UTC