[whatwg/encoding] Fast byteLength() (Issue #333) from Jamie Kyle on 2024-07-25 (public-webapps-github@w3.org from July 2024)

From: Jamie Kyle <notifications@github.com>
Date: Thu, 25 Jul 2024 16:59:12 -0700
To: whatwg/encoding <encoding@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <whatwg/encoding/issues/333@github.com>

### What problem are you trying to solve?

`new TextEncoder().encode(input).byteLength` is an order of magnitude slower than alternatives, including Node's `Buffer.byteLength(input)` and even handwritten JavaScript implementations.

[Benchmarks](https://github.com/jamiebuilds/bytelength-benchmarks)

```
./benchmarks/blob.js:              202’345.0 ops/sec (± 13’993.9, p=0.001, o=0/100)
./benchmarks/buffer.js:         57’434’701.2 ops/sec (±425’763.3, p=0.001, o=9/100) severe outliers=5
./benchmarks/implementation.js: 48’441’909.6 ops/sec (±397’249.6, p=0.001, o=5/100) severe outliers=2
./benchmarks/textencoder.js:     2’667’052.4 ops/sec (±564’727.5, p=0.001, o=6/100) severe outliers=2
```

My benchmark repo includes a JS implementation that I believe is close enough to correct for benchmarking purposes, although I'm no expert in UTF-16, so it may contain mistakes.
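
For illustration, here is a minimal sketch of what a handwritten implementation can look like. It is not the code from the benchmark repo; the function name `utf8ByteLength` is hypothetical. It walks the string's UTF-16 code units and adds up the UTF-8 bytes each one would produce, counting a lone surrogate as 3 bytes on the assumption that it would be encoded as U+FFFD, the way `TextEncoder` does.

```js
// Hypothetical helper: counts UTF-8 bytes without allocating an output buffer.
function utf8ByteLength(input) {
  let bytes = 0
  for (let i = 0; i < input.length; i++) {
    let unit = input.charCodeAt(i)
    if (unit < 0x80) {
      bytes += 1 // ASCII
    } else if (unit < 0x800) {
      bytes += 2 // U+0080..U+07FF
    } else if (unit >= 0xd800 && unit <= 0xdbff) {
      // High surrogate: check whether a low surrogate follows.
      let next = i + 1 < input.length ? input.charCodeAt(i + 1) : 0
      if (next >= 0xdc00 && next <= 0xdfff) {
        bytes += 4 // valid pair, one code point outside the BMP
        i++ // skip the low surrogate
      } else {
        bytes += 3 // lone high surrogate, assumed to become U+FFFD
      }
    } else {
      bytes += 3 // rest of the BMP; a lone low surrogate also counts as U+FFFD
    }
  }
  return bytes
}
```

The result can be checked against `new TextEncoder().encode(input).byteLength` for the same input.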

### What solutions exist today?

- `new Blob([input]).size`
- `new TextEncoder().encode(input).byteLength`
- `Buffer.byteLength(input)` (Node only)
- Implementations in JS
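
As a rough sketch, the built-in options above can be put side by side like this. The `measureAll` wrapper is a hypothetical name used only for this example, and the `Buffer` branch is guarded because `Buffer` only exists in Node.

```js
// Hypothetical wrapper that measures one string with each built-in approach.
function measureAll(input) {
  return {
    // Allocates a Blob just to read its size.
    blob: new Blob([input]).size,
    // Allocates a full Uint8Array just to read its length.
    textEncoder: new TextEncoder().encode(input).byteLength,
    // Node only; undefined where Buffer is not available.
    buffer: typeof Buffer !== "undefined" ? Buffer.byteLength(input) : undefined,
  }
}
```

For an input such as `"héllo"`, each available approach should report 6 bytes, since `é` takes 2 bytes in UTF-8.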

### How would you solve it?

```js
let encoder = new TextEncoder()
let byteLength = encoder.byteLength("Hello, World!")
// >> 13
```
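
Until something like this exists, code could feature-detect it. The sketch below assumes the proposed `byteLength()` method is not yet available anywhere and falls back to the allocating `encode()` path; `byteLengthOf` is a hypothetical name.

```js
let sharedEncoder = new TextEncoder()

// Hypothetical helper: prefers the proposed method, falls back to encode().
function byteLengthOf(input) {
  // `TextEncoder.prototype.byteLength` is only the proposal from this issue.
  if (typeof sharedEncoder.byteLength === "function") {
    return sharedEncoder.byteLength(input)
  }
  // Current behavior: encode the whole string, then read the length.
  return sharedEncoder.encode(input).byteLength
}
```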

### Anything else?

_No response_

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/encoding/issues/333

Received on Thursday, 25 July 2024 23:59:16 UTC