Re: [whatwg/encoding] TextEncoder and TextDecoder performance concern around all libraries / runtimes (Issue #343)

WebReflection left a comment (whatwg/encoding#343)

> there was only slowness observed for shorter inputs

Define **only** ... short inputs are the most common and widespread use case worldwide: when serializing objects, each *key*, even in *Java*, is rarely longer than 64 chars (see *cbor-x*). Beyond that, all those libraries adopt strategies to **decode** faster by grouping binary data via its *byteLength* as a serialized string, and then optimize and *cache* everywhere everything that has already been *encoded* or *decoded*, with the lovely caveat that *strings* are not observable by *FinalizationRegistry*, as these are primitives, so most libraries bloat in RAM and can't predict *string* usage across programs by any means.
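
A minimal sketch of that caching strategy and of the *FinalizationRegistry* limitation (names like `cachedDecode` and `decodeCache` are illustrative, not from any particular library):

```js
// Decoded strings are memoized by their raw bytes so TextDecoder is only
// hit once per distinct key. Illustrative sketch, not any library's code.
const decoder = new TextDecoder();
const decodeCache = new Map();

function cachedDecode(bytes) {
  // Use the byte sequence itself as the cache key (cheap for short keys).
  const key = String.fromCharCode(...bytes);
  let value = decodeCache.get(key);
  if (value === undefined) {
    value = decoder.decode(bytes);
    decodeCache.set(key, value); // never evicted: strings can't be weakly held
  }
  return value;
}

// FinalizationRegistry cannot observe string primitives, so the cache
// cannot be evicted based on string lifetime:
const registry = new FinalizationRegistry(() => {});
try {
  registry.register("some key", "held value");
} catch (e) {
  console.log(e instanceof TypeError); // true — register target must be an object
}
```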

All benchmarks in those repositories and posts show that everyone needs to avoid *TextEncoder* and *TextDecoder* because they are repeatedly slow, and every measurement shows they are indeed slow compared to manual hand-crafting of buffered strings.
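
For illustration, a hedged sketch of the kind of manual, hand-crafted fast path those benchmarks compare against (`encodeShort` is a hypothetical name; it assumes `target` is large enough for the worst case):

```js
// For short, mostly-ASCII strings a plain charCodeAt loop typically beats
// a TextEncoder.encode call, which pays a fixed per-call overhead.
const encoder = new TextEncoder();

function encodeShort(str, target, offset = 0) {
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    if (code > 0x7f) {
      // Non-ASCII: fall back to the real encoder for the remainder.
      const { written } = encoder.encodeInto(str.slice(i), target.subarray(offset));
      return offset + written;
    }
    target[offset++] = code; // ASCII fast path: one byte per char
  }
  return offset; // bytes written so far
}
```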

Moreover, *cbor-x* uses a `/[\u0080-\uFFFF]/` regexp to instantly bail out of *TextEncoder*, and you can read developers complaining about this encoding being slow ...
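
A sketch of that bail-out pattern, with an assumed length threshold; this is illustrative, not cbor-x's actual code:

```js
// If a short string contains no code unit above 0x7F, encode it by hand;
// otherwise let TextEncoder handle the multi-byte cases.
const HAS_NON_ASCII = /[\u0080-\uFFFF]/;
const encoder = new TextEncoder();

function fastEncode(str) {
  if (str.length < 64 && !HAS_NON_ASCII.test(str)) {
    const bytes = new Uint8Array(str.length);
    for (let i = 0; i < str.length; i++) bytes[i] = str.charCodeAt(i);
    return bytes;
  }
  return encoder.encode(str);
}
```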

Accordingly, what do you need to see to accept that encoding and decoding via these APIs is slow? A CodePen? Some basic snippet that uses `console.time`, or ... what? Happy to provide whatever you are after, just please don't try to hide the issue here, because for real-world use cases the slowness of those APIs on small inputs **is** the issue, not the feature.
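
For instance, a basic `console.time` snippet of the kind mentioned above (the key and iteration count are arbitrary, and absolute numbers will vary per engine):

```js
// Compare TextEncoder.encode against a manual loop on a short ASCII key.
const encoder = new TextEncoder();
const key = "someTypicalObjectKey"; // ~20 chars, the common case for keys
const out = new Uint8Array(64);

console.time("TextEncoder.encode");
for (let i = 0; i < 1e6; i++) encoder.encode(key);
console.timeEnd("TextEncoder.encode");

console.time("manual charCodeAt loop");
for (let i = 0; i < 1e6; i++) {
  // Writes into a preallocated buffer, also skipping the per-call allocation
  // that encode() performs — part of the real-world difference.
  for (let j = 0; j < key.length; j++) out[j] = key.charCodeAt(j);
}
console.timeEnd("manual charCodeAt loop");
```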

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/343#issuecomment-2725224470
You are receiving this because you are subscribed to this thread.

Message ID: <whatwg/encoding/issues/343/2725224470@github.com>

Received on Friday, 14 March 2025 16:47:19 UTC