Re: [whatwg/encoding] TextEncoder#encode - write to existing Uint8Array (#69) from Henri Sivonen on 2018-11-02 (public-webapps-github@w3.org from November 2018)

From: Henri Sivonen <notifications@github.com>
Date: Fri, 02 Nov 2018 05:31:41 -0700
To: whatwg/encoding <encoding@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <whatwg/encoding/issues/69/435363269@github.com>

Yesterday, I had my mind too much in the mode of supporting all encoders even though only the UTF-8 encoder would be exposed to JS/Wasm. Sorry.

> ```webidl
> bool inputEmpty; // true if input was exhausted. false if more output space is needed
> ```

In the encode case, that bit can be inferred from `read` as long as encoding to ISO-2022-JP is not supported.
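A minimal sketch of that inference, assuming an API shaped like `TextEncoder.encodeInto()` as later added to the Encoding Standard, which reports `read` (UTF-16 code units consumed) and `written` (bytes produced). The helper name `encodeStep` is hypothetical. ISO-2022-JP is presumably the exception because its encoder is stateful and may still owe a closing escape sequence after the last code unit has been read; the UTF-8 encoder has no such pending state.

```typescript
// Hypothetical helper: encode as much of `input` as fits into `dest`
// and derive the "input exhausted" bit instead of returning it separately.
function encodeStep(
  input: string,
  dest: Uint8Array
): { read: number; written: number; inputEmpty: boolean } {
  const { read, written } = new TextEncoder().encodeInto(input, dest);
  // UTF-8 encoding is stateless, so the input is exhausted exactly when
  // every UTF-16 code unit has been read.
  return { read, written, inputEmpty: read === input.length };
}
```

If `inputEmpty` is false, the caller needs more output space and should continue from code unit `read`.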

The encoding_rs APIs I linked to in my previous comments are for streaming. The Wasm argument conversion case is closer to what Gecko's in-memory XPCOM string conversions need.

For in-memory conversions, encoding_rs provides these simplified APIs:

* [Convert potentially-invalid UTF-16 to UTF-8 with worst-case-sized caller-allocated output buffer](https://docs.rs/encoding_rs/0.8.10/encoding_rs/mem/fn.convert_utf16_to_utf8.html)
* [Convert the start of a potentially-invalid UTF-16 string to UTF-8 into a potentially too short caller-allocated buffer](https://docs.rs/encoding_rs/0.8.10/encoding_rs/mem/fn.convert_utf16_to_str_partial.html)

When the latter fails to convert the whole string, a reallocation plus use of the first function is needed. In Gecko, the potentially too short buffer is allocated by rounding the best case up to the allocator bucket size (leaky abstraction).
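The two-step pattern can be sketched on the JS side as follows. This is an illustration under assumptions, not the Gecko code: it uses `TextEncoder.encodeInto()` (as later added to the Encoding Standard) in place of the encoding_rs functions, the helper name `encodeTwoPass` is hypothetical, and the optimistic size is simply one byte per code unit, with no rounding up to an allocator bucket.

```typescript
// Hypothetical helper illustrating "optimistic buffer first, worst case on demand".
function encodeTwoPass(input: string): Uint8Array {
  const encoder = new TextEncoder();

  // First attempt: best case of one byte per UTF-16 code unit (all ASCII).
  const optimistic = new Uint8Array(input.length);
  const first = encoder.encodeInto(input, optimistic);
  if (first.read === input.length) {
    return optimistic.subarray(0, first.written);
  }

  // Slow path: keep what was written and reallocate for the worst case.
  // One UTF-16 code unit expands to at most three UTF-8 bytes.
  const rest = input.substring(first.read);
  const grown = new Uint8Array(first.written + rest.length * 3);
  grown.set(optimistic.subarray(0, first.written));
  const second = encoder.encodeInto(rest, grown.subarray(first.written));
  return grown.subarray(0, first.written + second.written);
}
```

ASCII-only input never leaves the fast path; anything else pays for one extra allocation and a copy of the bytes already written.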

Going back to the issue of string view or start index into a string: Do we need that or can we trust the JS substring operation not to copy the underlying buffer?
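To make the question concrete, here is a sketch of what callers would write if there is no start index: every step takes a substring of the remaining input. It assumes an API shaped like `TextEncoder.encodeInto()` as later added to the Encoding Standard; the helper name `encodeInChunks` is hypothetical. Whether `substring` copies is up to the JS engine, so this only works well if engines represent substrings as views.

```typescript
// Hypothetical helper: push `input` through a fixed-size scratch buffer.
function encodeInChunks(input: string, scratch: Uint8Array): Uint8Array[] {
  const encoder = new TextEncoder();
  const chunks: Uint8Array[] = [];
  let offset = 0;
  while (offset < input.length) {
    // Without a start-index parameter, each step has to take a substring.
    const { read, written } = encoder.encodeInto(input.substring(offset), scratch);
    if (read === 0) {
      // No progress is possible: the next code point does not fit at all.
      throw new RangeError("scratch buffer too small for the next code point");
    }
    chunks.push(scratch.slice(0, written)); // copy, because scratch is reused
    offset += read;
  }
  return chunks;
}
```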

-- 
https://github.com/whatwg/encoding/issues/69#issuecomment-435363269

Received on Friday, 2 November 2018 12:39:10 UTC