Re: [w3c/FileAPI] Add option to Blob constructor to skip UTF-8 encoding (#102)

More precisely, the request is for an option that treats input strings as ByteString rather than USVString. Since strings are sequences of 16-bit code units, they need to be interpreted *somehow* to become bytes, whether it's interpreting as UTF-16 and transcoding to UTF-8 (as is done today), storing as UTF-16LE or UTF-16BE (plausible, but let's not), or truncating to 8-bit values (per the request). Presumably this would follow the behavior for ByteString and throw on code units > 0xFF.
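
To make the difference concrete, here is a small sketch comparing the two interpretations for a single string. It uses `TextEncoder` as a stand-in for the UTF-8 step the Blob constructor applies to strings today; the variable names are illustrative only.

```javascript
// A one-character "binary" string whose single code unit is 0xFF.
const s = "\u00ff";

// Today's behavior: the string is UTF-8 encoded, so U+00FF becomes two bytes.
const utf8Bytes = Array.from(new TextEncoder().encode(s));

// Requested behavior: each 16-bit code unit is taken as one byte.
const rawBytes = s.split("").map((c) => c.charCodeAt(0));

console.log(utf8Bytes); // [195, 191], i.e. 0xC3 0xBF
console.log(rawBytes);  // [255]
```

The two-byte result is why binary data stored in a string gets corrupted when it is passed to `new Blob([string])` as is.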

Seems reasonable. On the other hand, why are libraries putting binary data in strings in the first place? Are these mostly older libraries that predate ArrayBuffer? Old APIs like `atob()`? Should we extend the web platform just to support old libraries, especially when this can easily be worked around in userspace? (`new Uint8Array(string.split('').map(c=>c.charCodeAt(0)))`)
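
One caveat with the one-liner: storing a value above 0xFF into a `Uint8Array` wraps modulo 256 rather than throwing, so it does not match ByteString behavior for out-of-range code units. A slightly longer userspace sketch that does throw might look like this; `byteStringToBytes` is a hypothetical helper name, not part of any spec or library.

```javascript
// Hypothetical helper: convert a "binary string" to bytes using
// ByteString-like rules, rejecting any code unit above 0xFF.
function byteStringToBytes(str) {
  const bytes = new Uint8Array(str.length);
  for (let i = 0; i < str.length; i++) {
    const unit = str.charCodeAt(i);
    if (unit > 0xff) {
      throw new TypeError(
        `Code unit 0x${unit.toString(16)} at index ${i} does not fit in a byte`
      );
    }
    bytes[i] = unit;
  }
  return bytes;
}

// A binary string such as an older library (or atob()) might produce.
const pngMagic = byteStringToBytes("\x89PNG");
console.log(Array.from(pngMagic)); // [137, 80, 78, 71]

// By contrast, the one-liner silently wraps U+0100 to 0.
const wrapped = new Uint8Array("\u0100".split("").map((c) => c.charCodeAt(0)));
console.log(wrapped[0]); // 0
```

The resulting `Uint8Array` can then be passed to the existing constructor, for example `new Blob([pngMagic])`, which stores the bytes unchanged.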

https://github.com/w3c/FileAPI/issues/102#issuecomment-397690667

Received on Friday, 15 June 2018 17:31:15 UTC