Re: [w3c/clipboard-apis] Write UTF-8 data to the clipboard. (Issue #217)

The Web Editing Working Group just discussed `Write UTF-8 data to the clipboard.`, and agreed to the following:

* `RESOLVED: Remove the bullet about UTF-8 encoding. Anupam to file follow up issue to investigate what happens when you try to send invalid UTF chars though.`

<details><summary>The full IRC log of that discussion</summary>
&lt;dandclark> topic: Write UTF-8 data to the clipboard.<br>
&lt;dandclark> github: https://github.com/w3c/clipboard-apis/issues/217<br>
&lt;dandclark> snianu: Recently we found in Chromium that when we copy svg (chromium supports img/svg), we switch encoding from utf-8 to utf-16<br>
&lt;dandclark> ...: When we paste in native apps like Word, the image doesn't render<br>
&lt;dandclark> ...: It's because the native apps expect utf-8<br>
&lt;dandclark> ...: We investigated, found in the spec that when we write blobs to system clipboard, spec says use utf-8 decoder, write scalar values to system clipboard<br>
&lt;dandclark> ...: Trying to get feedback on whether to change the spec<br>
&lt;dandclark> ...: Or are there corner cases we're missing like for PNG<br>
&lt;dandclark> smaug: I think what Anne noticed is a clear bug<br>
&lt;dandclark> snianu: Is there a specific encoding rule that FF or Safari follow when writing formats? Or is it whatever encoding is in the blob type?<br>
&lt;dandclark> smaug: I can't recall<br>
&lt;dandclark> ...: E.g. if your OS has image-specific backing store you do some additional transformation<br>
&lt;dandclark> snianu: Agree. I read in Apple documentation it's default UTF-16 but can use others<br>
&lt;dandclark> ...: Agree for images it doesn't make sense, for other MIME types like svg and HTML, does it make sense to write UTF-8?<br>
&lt;dandclark> ...: Windows has separate APIs for UTF and ASCII characters<br>
&lt;dandclark> ...: I think there's lots of different cases and encoding schemes<br>
&lt;dandclark> ...: Don't know if makes sense to standardize it<br>
&lt;dandclark> ...: Because it's also platform specific<br>
&lt;dandclark> anne: The one thing you could maybe do is abstract between text and byte sequence types<br>
&lt;dandclark> ...: For text sequence types, always do UTF pass so you always get scalar values<br>
&lt;dandclark> ...: Is interesting question what platforms currently do. If you put zero-bytes in text stream, do you get zero-bytes or replacement chars?<br>
&lt;dandclark> snianu: For the existing spec text, do we all agree it's not valid and we should remove it?<br>
&lt;dandclark> ...: And maybe do investigation to see what can be added to the spec, maybe as a note?<br>
&lt;dandclark> anne: Reasonable to remove UTF-8 step and then investigate<br>
&lt;dandclark> smaug: Might be useful to see why we have the UTF-8 thing in the spec<br>
&lt;dandclark> anne: Good to do blame analysis, I didn't yet<br>
&lt;dandclark> smaug: It's very specific, might be something interesting mentioned in spec issue somewhere<br>
&lt;dandclark> johanneswilm: Is there agreement?<br>
&lt;dandclark> johanneswilm: It's always either bytes or UTF-8? Any risk of other older encodings?<br>
&lt;dandclark> anne: It's another interesting question. It's why I think bytes are the answer and we need to investigate further.<br>
&lt;dandclark> johanneswilm: Who will file follow up issue?<br>
&lt;dandclark> snianu: I can<br>
&lt;dandclark> RESOLVED: Remove the bullet about UTF-8 encoding. Anupam to file follow up issue to investigate what happens when you try to send invalid UTF chars though.<br>
</details>
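
To make the open question in the log concrete (what happens to zero-bytes and invalid sequences once a payload goes through a UTF-8 decode pass), here is a minimal sketch. It is not taken from the spec or from any browser implementation: the payload, the helper name `decode_to_scalar_values`, and the choice of UTF-16LE as the re-encoding target are all assumptions made for illustration, and Python's `errors="replace"` decoding is only a stand-in for a lossy UTF-8 decode.

```python
# Hypothetical SVG-like payload containing one invalid UTF-8 byte (0xFF)
# and one NUL byte (0x00). Not a real clipboard capture.
svg_bytes = b'<svg xmlns="http://www.w3.org/2000/svg">\xff\x00</svg>'


def decode_to_scalar_values(data: bytes) -> str:
    """Lossy UTF-8 decode: bytes that are not valid UTF-8 become U+FFFD."""
    return data.decode("utf-8", errors="replace")


text = decode_to_scalar_values(svg_bytes)

# Re-encoding the decoded text, as a writer might do before handing it
# to a platform clipboard (target encodings chosen for illustration).
as_utf8 = text.encode("utf-8")
as_utf16 = text.encode("utf-16-le")

print("U+FFFD present after decode:", "\ufffd" in text)
print("NUL survives decode:", "\x00" in text)
print("UTF-8 round trip equals original bytes:", as_utf8 == svg_bytes)
print("UTF-16 output equals original bytes:", as_utf16 == svg_bytes)
```

In this sketch the NUL byte decodes to U+0000 and survives, the invalid byte is replaced by U+FFFD, and neither re-encoding reproduces the original bytes. Passing `svg_bytes` through untouched is the "bytes" alternative raised in the discussion; what each platform actually does remains the subject of the follow-up issue.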


-- 
Reply to this email directly or view it on GitHub:
https://github.com/w3c/clipboard-apis/issues/217#issuecomment-2165970625

Message ID: <w3c/clipboard-apis/issues/217/2165970625@github.com>

Received on Thursday, 13 June 2024 15:14:05 UTC