Re: About IDL's string enums

The way this works in WebKit is that methods in bound C++ classes can take WTF::AtomicString parameters. AtomicString is a class that uniques/interns strings in a global table and offers cheap equality comparison. JSC interns all constant strings in source in a way that is compatible with AtomicString. The same AtomicString passes all the way from the JS engine to the bindings. So in the common case, strings for string enums can be compared with a single pointer compare instruction inside C++ code implementing the binding.

(I believe we also optimize very short strings to be stored in the pointer, but that is the less critical optimization in this case).

All JS engines and browser engines should be able to implement this same optimization.

> On Feb 28, 2018, at 12:18 PM, Kai Ninomiya <kainino@google.com> wrote:
> 
> > Since every modern Web API uses strings for enums, it's worth doing.
> Just wanted to point out that, while this is certainly nice, even real string comparisons are not THAT slow (most of the APIs I could find seem to do them very rarely - like in object creation). So from a performance perspective I think WebGPU has much, much (100x~10000x) tighter performance requirements on this particular optimization than any existing API (that I know of).
> 
> On Wed, Feb 28, 2018 at 12:11 PM Kai Ninomiya <kainino@google.com <mailto:kainino@google.com>> wrote:
> Okay, great to know.
> Looking back at our bindings, there's a lot of C++ magic going on here - it's very hard to tell whether the comparisons could be fast, partly because AFAICT V8 strings do not actually make it into Blink before being converted to WTF::String. (Specifically, it depends on whether WTF::String has fast comparisons and whether there's a fast conversion of short strings from V8 to WTF.)
> 
> BTW, depending on how much optimization we can rely on, the small-string optimization probably goes up to length 15, and can go up to length 22 (which clang's libc++ does).
> 
> On Wed, Feb 28, 2018 at 12:01 PM Dean Jackson <dino@apple.com <mailto:dino@apple.com>> wrote:
> 
>> On 28 Feb 2018, at 11:55, Kai Ninomiya <kainino@google.com <mailto:kainino@google.com>> wrote:
>> 
>> 
>> On Wed, Feb 28, 2018 at 11:40 AM Myles C. Maxfield <mmaxfield@apple.com <mailto:mmaxfield@apple.com>> wrote:
>> These strings are English words, right? Most of which are fewer than 8 characters and can therefore fit in a 64-bit register?
>> Sure. This forces us to make our enums short, but that's probably doable.
>> Thanks for pointing out this optimization; it had completely slipped my mind. It certainly makes the problem much more tractable than with interned strings.
>> 
>> JavaScriptCore turns JS string literals into AtomicStrings at parse time, which means they can be compared for equality in a single instruction. The performance problem you mention is an implementation detail of V8, and would best be fixed by improving V8, rather than forcing every Web developer to learn a new style of API.
>> The string comparison is NOT inside the JIT - it's in the bindings. Does JavaScriptCore let you do this in the bindings? V8 probably does this optimization inside the JIT, but I don't know if it's possible in bindings. It may be the case that V8 already exposes a string type to C++ which is efficient to compare for short strings. I'm pretty certain that no such optimization is being used in the bindings for the APIs I looked at. I did not ask any V8 folks about this particular question so I may be missing stuff.
> 
> We do exactly this - use a more efficient string type for comparison in our bindings. Since every modern Web API uses strings for enums, it's worth doing.
> 
> I'm strongly in favour of using strings for enums types.
> 
> Dean
> 
>> 
>> I'm only using V8 as a representative engine here - we can't forget there are at least 2 other JS engines that we care about (SpiderMonkey/Chakra).
>>  
>> You mentioned “critical path of the API (e.g. draw call).” Two out of the 3 platform APIs use draw calls which don’t accept any enums, and the remaining API only accepts a single enum. Indeed, even in WebGPU's sketch API, our own draw calls <https://github.com/gpuweb/gpuweb/blob/master/design/sketch.webidl#L444> don’t accept any enums. Other calls which accept enums are used to compile device-specific state, which is already a super expensive operation.
>> Sorry, I didn't mean the draw call in particular. I just meant "roughly at the frequency of draw calls".
>> 
>> There is, however, large benefit to being consistent with the rest of the Web Platform’s conventions. We shouldn’t be diverging from the existing best practices until we have concrete evidence describing what the performance benefit would be. (Premature optimization is the root of all evil.)
>> This is why we're having these discussions. I didn't mean to shut down the idea out of hand.
>>  
>> On Feb 26, 2018, at 5:45 PM, Kai Ninomiya <kainino@google.com <mailto:kainino@google.com>> wrote:
>> 
>>> Hey folks,
>>> 
>>> I had a brief chat with Brad Nelson (cc'd) (WASM chair) last week about string enums. Here's my summary (Brad: if you wouldn't mind, correct me if I've misrepresented anything).
>>> 
>>> * Probably possible (though difficult) to make them somewhat fast (~as fast as JS is now), and likely to be valuable for other Web APIs too.
>>> * BUT: it may be hard to make them fast enough to rely on them in the critical path of the API (e.g. draw call). I investigated this a bit more:
>>> 
>>> Taking a look at Blink code for some APIs that use IDL enums (like WebAudio and 2D Canvas), I determined, at least, that these entry points are not doing anything special when they take in IDL enums: They just receive a C++ string object, and convert to a C++ enum via a helper function that does string comparisons.
>>> (It's possible that string enum comparisons are fast in JITed JS (e.g. webapi.getThing() == "variant2"), but they are not currently fast going out to native code.)
>>> 
>>> If I'm understanding correctly, solving this problem (just for JS) would require a whole new concept to be added throughout the stack - from the JS engine (V8) API to the bindings code to Blink - to be able to recompile string enums from the app's JS source into some kind of integer ID that can be compared efficiently in Blink. (Blink would have to be able to query this integer ID from the V8 VM as well.)
>>> 
>>> And that's ignoring WebAssembly. For WebAssembly, we would need some way for it to - at load time - either (a) like Blink, query the integer ID for a given string, or (b) create a table from i32s to strings which, when used in Host Bindings, are guaranteed to get efficiently turned into the integer IDs on the way into Blink.
>>> 
>>> All in all, this looks like a huge project, and I definitely feel we cannot rely on it when designing an API for WebGPU (all assuming I'm not missing something important).

Received on Wednesday, 28 February 2018 21:51:41 UTC