- From: Kai Ninomiya <kainino@google.com>
- Date: Wed, 28 Feb 2018 22:28:36 +0000
- To: Maciej Stachowiak <mjs@apple.com>
- Cc: Dean Jackson <dino@apple.com>, "Myles C. Maxfield" <mmaxfield@apple.com>, public-gpu <public-gpu@w3.org>, Bradley Nelson <bradnelson@google.com>
- Message-ID: <CANxMeyDshXQUJ_YquQzV0myoRYBSYDvUFtoh+HbNwFVd0m5m_A@mail.gmail.com>
Ah, thanks for the perspective. I bet we really do have something similar that's being hidden behind the C++ magic. I will need to ask more knowledgeable folks on our side.

On Wed, Feb 28, 2018 at 1:51 PM Maciej Stachowiak <mjs@apple.com> wrote:

> The way this works in WebKit is that methods in bound C++ classes can take WTF::AtomicString parameters. AtomicString is a class that uniques/interns strings in a global table and offers cheap equality comparison. JSC interns all constant strings in source in a way that is compatible with AtomicString. The same AtomicString passes all the way from the JS engine to the bindings. So in the common case, strings for string enums can be compared with a single pointer compare instruction inside C++ code implementing the binding.
>
> (I believe we also optimize very short strings to be stored in the pointer, but that is the less critical optimization in this case).
>
> All JS engines and browser engines should be able to implement this same optimization.
>
> On Feb 28, 2018, at 12:18 PM, Kai Ninomiya <kainino@google.com> wrote:
>
> Since every modern Web API uses strings for enums, it's worth doing.
> Just wanted to point out that, while this is certainly nice, even real string comparisons are not THAT slow (most of the APIs I could find seem to do them very rarely - like in object creation). So from a performance perspective I think WebGPU has much, much (100x~10000x) tighter performance requirements on this particular optimization than any existing API (that I know of).
>
> On Wed, Feb 28, 2018 at 12:11 PM Kai Ninomiya <kainino@google.com> wrote:
>
>> Okay, great to know.
>> Looking back at our bindings, there's a lot of C++ magic going on here - it's very hard to tell whether the comparisons could be fast, partly because AFAICT V8 strings do not actually make it into Blink before being converted to WTF::String. (Specifically, it depends on whether WTF::String has fast comparisons and whether there's a fast conversion of short strings from V8 to WTF.)
>>
>> BTW, depending on how much optimization we can rely on, the small-string optimization probably goes up to length 15, and can go up to length 22 (which clang's libc++ does).
>>
>> On Wed, Feb 28, 2018 at 12:01 PM Dean Jackson <dino@apple.com> wrote:
>>
>>> On 28 Feb 2018, at 11:55, Kai Ninomiya <kainino@google.com> wrote:
>>>
>>> On Wed, Feb 28, 2018 at 11:40 AM Myles C. Maxfield <mmaxfield@apple.com> wrote:
>>>
>>>> These strings are English words, right? Most of which are fewer than 8 characters and can therefore fit in a 64-bit register?
>>>
>>> Sure. This forces us to make our enums short, but that's probably doable. Thanks for pointing out this optimization; it had completely slipped my mind. It certainly makes the problem much more tractable than with interned strings.
>>>
>>>> JavaScriptCore turns JS string literals into AtomicStrings at parse time, which means they can be compared for equality in a single instruction. The performance problem you mention is an implementation detail of V8, and would best be fixed by improving V8, rather than forcing every Web developer to learn a new style of API.
>>>
>>> The string comparison is NOT inside the JIT - it's in the bindings. Does JavaScriptCore let you do this in the bindings? V8 probably does this optimization inside the JIT, but I don't know if it's possible in bindings.
>>> It may be the case that V8 already exposes a string type to C++ which is efficient to compare for short strings. I'm pretty certain that no such optimization is being used in the bindings for the APIs I looked at. I did not ask any V8 folks about this particular question so I may be missing stuff.
>>>
>>> We do exactly this - use a more efficient string type for comparison in our bindings. Since every modern Web API uses strings for enums, it's worth doing.
>>>
>>> I'm strongly in favour of using strings for enums types.
>>>
>>> Dean
>>>
>>> I'm only using V8 as a representative engine here - we can't forget there are at least 2 other JS engines that we care about (SpiderMonkey/Chakra).
>>>
>>>> You mentioned “critical path of the API (e.g. draw call).” Two out of the 3 platform APIs use draw calls which don’t accept any enums, and the remaining API only accepts a single enum. Indeed, even in WebGPU's sketch API, our own draw calls <https://github.com/gpuweb/gpuweb/blob/master/design/sketch.webidl#L444> don’t accept any enums. Other calls which accept enums are used to compile device-specific state, which is already a super expensive operation.
>>>
>>> Sorry, I didn't mean the draw call in particular. I just meant "roughly at the frequency of draw calls".
>>>
>>>> There is, however, large benefit to being consistent with the rest of the Web Platform’s conventions. We shouldn’t be diverging from the existing best practices until we have concrete evidence describing what the performance benefit would be. (Premature optimization is the root of all evil.)
>>>
>>> This is why we're having these discussions. I didn't mean to shut down the idea out of hand.
>>>
>>>> On Feb 26, 2018, at 5:45 PM, Kai Ninomiya <kainino@google.com> wrote:
>>>>
>>>> Hey folks,
>>>>
>>>> I had a brief chat with Brad Nelson (cc'd) (WASM chair) last week about string enums. Here's my summary (Brad: if you wouldn't mind, correct me if I've misrepresented anything).
>>>>
>>>> * Probably possible (though difficult) to make them somewhat fast (~as fast as JS is now), and likely to be valuable for other Web APIs too.
>>>> * BUT: it may be hard to make them fast enough to rely on them in the critical path of the API (e.g. draw call). I investigated this a bit more:
>>>>
>>>> Taking a look at Blink code for some APIs that use IDL enums (like WebAudio and 2D Canvas), I determined, at least, that these entry points are not doing anything special when they take in IDL enums: They just receive a C++ string object, and convert to a C++ enum via a helper function that does string comparisons.
>>>> (It's possible that string enum comparisons are fast in JITed JS (e.g. webapi.getThing() == "variant2"), but they are not currently fast going out to native code.)
>>>>
>>>> If I'm understanding correctly, solving this problem (just for JS) would require a whole new concept to be added throughout the stack - from the JS engine (V8) API to the bindings code to Blink - to be able to recompile string enums from the app's JS source into some kind of integer ID that can be compared efficiently in Blink. (Blink would have to be able to query this integer ID from the V8 VM as well.)
>>>>
>>>> And that's ignoring WebAssembly. For WebAssembly, we would need some way for it to - at load time - either (a) like Blink, query the integer ID for a given string, or (b) create a table from i32s to strings which, when used in Host Bindings, are guaranteed to get efficiently turned into the integer IDs on the way into Blink.
>>>>
>>>> All in all, this looks like a huge project, and I definitely feel we cannot rely on it when designing an API for WebGPU (all assuming I'm not missing something important).
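
For illustration, the enum-conversion pattern Kai describes seeing in Blink's generated bindings might look roughly like the sketch below. The enum and helper names are hypothetical, not actual Blink code; the point is that every call that takes a string enum pays for one full string comparison per candidate value.

```cpp
// Hypothetical sketch of an IDL string-enum conversion helper of the kind
// described in the thread: the binding receives an already-converted C++
// string and maps it to a C++ enum with plain string comparisons.
#include <optional>
#include <string>

enum class PrimitiveTopology { PointList, LineList, TriangleList };

std::optional<PrimitiveTopology> toPrimitiveTopology(const std::string& value) {
    // One character-by-character comparison per candidate, on every call.
    if (value == "point-list")
        return PrimitiveTopology::PointList;
    if (value == "line-list")
        return PrimitiveTopology::LineList;
    if (value == "triangle-list")
        return PrimitiveTopology::TriangleList;
    return std::nullopt;  // real bindings would raise a TypeError here
}
```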
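
The AtomicString approach Maciej outlines can be sketched as follows: every distinct string is stored once in a global table, so equality between two interned strings reduces to a single pointer compare regardless of string length. This is a simplified, non-thread-safe illustration of the idea, not WebKit's WTF::AtomicString.

```cpp
// Simplified illustration of string interning ("atomization"). Not
// WebKit's implementation; class and method names are made up.
#include <string>
#include <string_view>
#include <unordered_set>

class Atom {
public:
    // Return the canonical Atom for this text, creating it on first use.
    static Atom intern(std::string_view text) {
        static std::unordered_set<std::string> table;
        auto it = table.emplace(text).first;  // node addresses are stable
        return Atom(&*it);
    }

    // Equality is a single pointer comparison.
    bool operator==(const Atom& other) const { return str_ == other.str_; }
    const std::string& value() const { return *str_; }

private:
    explicit Atom(const std::string* str) : str_(str) {}
    const std::string* str_;
};

int main() {
    Atom a = Atom::intern("triangle-list");
    Atom b = Atom::intern("triangle-list");
    Atom c = Atom::intern("line-list");
    return (a == b && !(a == c)) ? 0 : 1;
}
```

The optimization the thread describes is that the JS engine interns source string literals compatibly with this table, so the comparison in the binding needs no hashing or lookup at all.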
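
Myles's "fits in a 64-bit register" observation is the other option discussed above: an enum word of at most 8 bytes can be packed into an integer once, after which matching it against each known value is a single integer compare. The names below are hypothetical, and a real scheme would also need the 15- or 22-byte small-string forms Kai mentions for longer values.

```cpp
// Sketch of packing a short enum string into a uint64_t so that each
// candidate check is one integer comparison. Assumes C++17; names are
// illustrative, not from any real binding layer.
#include <cstddef>
#include <cstdint>
#include <string_view>

// Pack up to 8 bytes into an integer; longer strings get a sentinel (0)
// so callers fall back to a slower path.
constexpr uint64_t packShortString(std::string_view s) {
    if (s.size() > 8)
        return 0;
    uint64_t packed = 0;
    for (std::size_t i = 0; i < s.size(); ++i)
        packed |= static_cast<uint64_t>(static_cast<unsigned char>(s[i])) << (8 * i);
    return packed;
}

enum class LoadOp { Load, Clear };

// One integer compare per variant; no per-character work at call time.
bool parseLoadOp(std::string_view s, LoadOp& out) {
    switch (packShortString(s)) {
        case packShortString("load"):
            out = LoadOp::Load;
            return true;
        case packShortString("clear"):
            out = LoadOp::Clear;
            return true;
        default:
            return false;  // unknown, empty, or longer than 8 bytes
    }
}
```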
Attachments
- application/pkcs7-signature attachment: S/MIME Cryptographic Signature
Received on Wednesday, 28 February 2018 22:29:15 UTC