Re: [whatwg/fetch] Should Body.formData() always strip the BOM? (#650)

> Generally I think BOM stripping is good and we definitely want it for JSON (consistency with JSON consumed elsewhere) and text (same consistency). I guess arguably that means we should do it here too since it's a text format.

Do you mean stripping the BOM from each name/value pair parsed by the application/x-www-form-urlencoded parser or just the leading one?

> It does sound though like the code in Blink is not super efficient as you end up decoding, then encoding so you can do percent decoding, and then decoding the result of that. Instead of decoding the result of percent decoding as the specification does now.

Not sure about the first decoding bit. Right now, we do something like this for this specific case of a string containing `\uFEFF` (I'm writing it down here mostly so I can come back to this later):
* The test is read, the JS string is parsed as a USVString, and we have a UTF-16 string
* The UTF-16 string is encoded to UTF-8
* The entire UTF-8 string is decoded without the BOM into a UTF-16 string
* The UTF-16 string is passed to https://url.spec.whatwg.org/#urlencoded-parsing, with each `name`/`value` pair being separately re-encoded into UTF-8 and percent-decoded into UTF-16 strings
* The list created in the previous step is fed into `FormData`, which re-encodes everything into UTF-8
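
To make the "each pair vs. just the leading one" question concrete, here is a rough sketch of the steps above using `TextEncoder`/`TextDecoder`. It is not Blink's actual code: `decodeBody`, `percentDecodeToBytes`, `parseUrlencoded` and the `stripBomPerPair` flag are hypothetical names made up for illustration, and the parser is a simplified take on the urlencoded algorithm.

```typescript
// Decode the whole body as UTF-8. With the default ignoreBOM: false,
// TextDecoder drops a single leading BOM and nothing else.
function decodeBody(bytes: Uint8Array): string {
  return new TextDecoder("utf-8").decode(bytes);
}

// Hypothetical helper: turn "+" into a space, re-encode to UTF-8,
// then percent-decode into raw bytes.
function percentDecodeToBytes(input: string): Uint8Array {
  const bytes = new TextEncoder().encode(input.replace(/\+/g, " "));
  const out: number[] = [];
  for (let i = 0; i < bytes.length; i++) {
    const hex = String.fromCharCode(bytes[i + 1] ?? 0, bytes[i + 2] ?? 0);
    if (bytes[i] === 0x25 && /^[0-9A-Fa-f]{2}$/.test(hex)) {
      out.push(parseInt(hex, 16)); // "%XX" becomes one byte
      i += 2;
    } else {
      out.push(bytes[i]);
    }
  }
  return new Uint8Array(out);
}

// Simplified urlencoded parsing. stripBomPerPair (an assumption, not a
// spec flag) selects whether each name and value is decoded with or
// without BOM stripping.
function parseUrlencoded(body: string, stripBomPerPair: boolean): [string, string][] {
  const decode = (s: string): string =>
    new TextDecoder("utf-8", { ignoreBOM: !stripBomPerPair }).decode(percentDecodeToBytes(s));
  return body
    .split("&")
    .filter((seq) => seq.length > 0)
    .map((seq): [string, string] => {
      const eq = seq.indexOf("=");
      const name = eq === -1 ? seq : seq.slice(0, eq);
      const value = eq === -1 ? "" : seq.slice(eq + 1);
      return [decode(name), decode(value)];
    });
}

// A body with a leading BOM and another BOM at the start of the value.
const text = decodeBody(new TextEncoder().encode("\uFEFFa=\uFEFFb"));
const keptPerPair = parseUrlencoded(text, false);    // value still starts with U+FEFF
const strippedPerPair = parseUrlencoded(text, true); // value is just "b"
console.log(JSON.stringify(keptPerPair), JSON.stringify(strippedPerPair));
```

In this sketch the leading BOM is gone after `decodeBody` either way; the only observable difference between the two modes is whether a `\uFEFF` at the start of an individual name or value survives.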

-- 
Reply to this email directly or view it on GitHub:
https://github.com/whatwg/fetch/issues/650#issuecomment-353095386

Received on Wednesday, 20 December 2017 15:34:20 UTC