Re: [w3c/editing] Seeking feedback on Clipboard Pickling APIs. (#334)

> > Again, I'm getting really confused by the mixing of reading and writing of system pasteboard content. What problem exactly are we solving by explicitly requesting the unsanitized format for reading or writing pasteboard content in a website / webapp? Please define the threat model of each scenario separately, and explain why an explicit request for unsanitized content is required.
> 
> If we make the process of reading/writing unsanitized content more explicit, then there is an implicit expectation that developers are aware of the security implications and have mitigations in place (e.g. intense fuzzing, such as that provided by [OSSFuzz](https://google.github.io/oss-fuzz/)). Quoting @dway123 here, as I think this response captures a lot of detail about why we are proposing the `unsanitized` option:
> 
> > Requiring sites to provide the unsanitized list allows sites to support both an unsanitized (for reading by the site) and a sanitized (for reading by other sites/native apps) version of the same payload, in case a site requires information removed by sanitization. Writing unsanitized data by default (rather than via the unsanitized list) would also make any later addition of sanitized payloads by browser implementations web-incompatible, as previously unsanitized content would become sanitized, potentially wiping metadata a site relies on.

I really don't follow. When we say developers, are we talking about web developers or native app developers? Surely browsers have to write both versions to the system pasteboard, because we don't know when or if the user will paste the content into another browser instance of the same origin or into some other native application. So there doesn't seem to be an option left for web developers to say, "I only want to write a version of the content that didn't go through the sanitization process."

> In the Chromium implementation at least, when the clipboard read method is called, we query all the standard formats from the clipboard that are supported by the browser. It would be very expensive if we had to read all custom formats as well, even when the site hasn't requested any custom formats. The `unsanitized` option gives browsers the flexibility to decide whether to read any custom formats along with standard MIME types (such as text/html, text/plain, image/png etc.) when web developers call `navigator.clipboard.read`.

I'm really confused here. Why would a website want to read the sanitized version of content from the system pasteboard if an unsanitized version of the content is available to it? Is the concern that we want to make sure we don't end up giving websites potentially dangerous content? That doesn't seem like the kind of assumption websites should be making in the first place. There is nothing browsers can do to ensure that content read from the system pasteboard won't result in some kind of XSS or even a remote server exploit, since we have no idea how a website is processing it. For example, plain text in the pasteboard could result in XSS if it's inserted inside a script or style tag or into some attributes.
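
To make the last point concrete, here is a minimal sketch of how the same kind of plain text is harmless in one context and dangerous in others. The function names and templates are hypothetical, made up purely for illustration; they are not part of any browser or clipboard API.

```javascript
// Escape text for use as HTML element content (text context only).
function escapeForHtmlText(s) {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical page templates: each one drops pasted text into a different context.
function renderAsText(pasted) {
  // Inert: the pasted text ends up as element content.
  return `<p>${escapeForHtmlText(pasted)}</p>`;
}

function renderInScript(pasted) {
  // Unsafe: the pasted text is interpolated into a JavaScript string literal.
  return `<script>var note = "${pasted}";</script>`;
}

function renderInAttribute(pasted) {
  // Unsafe: escaping for element content does not escape quotes.
  return `<img src="x" alt="${escapeForHtmlText(pasted)}">`;
}

// Neither string contains any markup, so an HTML sanitizer has nothing to remove.
const pastedPlainText = '"; alert(1); //';
const pastedForAttribute = '" onerror="alert(1)';

console.log(renderAsText(pastedPlainText));
console.log(renderInScript(pastedPlainText));
console.log(renderInAttribute(pastedForAttribute));
```

Whether the pasted text is safe depends entirely on where the site puts it, which the browser cannot know at read time.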

> > Why would the browser need to read both versions without this option?
> 
> Well, in clipboard read, we use the `ClipboardItem` that only takes MIME types as input. How would you know if the site is requesting unsanitized content for HTML format?
> e.g.
> 
> ```
> const clipboardItems = await navigator.clipboard.read();
> const clipboardItem = clipboardItems[0];
> const htmlBlob = await clipboardItem.getType('text/html'); // Should this return sanitized or unsanitized HTML content?
> ```

I don't see the need for reading the sanitized version. What is the scenario in which a website wants to read the sanitized version of the content?

>  > When the user copies something on a website, the website shouldn't be in control of whether a given MIME type should be exposed to another app or not.
> 
> Well, the async clipboard write method gives complete control to web authors as to what content should be written to the clipboard. This is achieved by providing the MIME types in `ClipboardItem`, so I'm not sure what the concern here is exactly. The default copy operation (using execCommand, or the copy command when the user presses ctrl+c) writes all the supported/applicable formats to the clipboard based on the selected content, so this process is completely different from the async clipboard read/write APIs.

Right, standard ones. But websites shouldn't be in control of, say, exposing a PSD file unsanitized. Similarly, if a website writes HTML, then the browser needs to provide both sanitized HTML and unsanitized HTML for other browsers and native apps because we don't know at the time of writing to the system pasteboard what the receiver is capable of.
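
As a rough sketch of what I mean by the browser providing both versions: the toy model below writes a sanitized and an unsanitized entry on every HTML write and lets the receiver pick whichever it can handle. Everything here is an assumption for illustration only: the pasteboard is just a `Map`, the format name `x-unsanitized/html` is invented, and `toySanitize` is not a real HTML sanitizer.

```javascript
// Toy sanitizer for illustration only. NOT a real HTML sanitizer:
// it just drops <script> elements and double-quoted on* attributes.
function toySanitize(html) {
  return html
    .replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, "")
    .replace(/\son\w+="[^"]*"/gi, "");
}

// Hypothetical format names for the two entries.
const SANITIZED_HTML = "text/html";
const UNSANITIZED_HTML = "x-unsanitized/html";

// The writer always stores both versions, since it cannot know
// what the eventual receiver is capable of.
function writeHtmlToPasteboard(pasteboard, html) {
  pasteboard.set(SANITIZED_HTML, toySanitize(html));
  pasteboard.set(UNSANITIZED_HTML, html);
}

// The receiver chooses the version it can handle.
function readHtml(pasteboard, wantsUnsanitized) {
  if (wantsUnsanitized && pasteboard.has(UNSANITIZED_HTML)) {
    return pasteboard.get(UNSANITIZED_HTML);
  }
  return pasteboard.get(SANITIZED_HTML);
}

const original = '<p onclick="track()">Hi</p><script>steal()</script>';
const pasteboard = new Map(); // stand-in for the system pasteboard
writeHtmlToPasteboard(pasteboard, original);

console.log(readHtml(pasteboard, false));
console.log(readHtml(pasteboard, true));
```

In this model the decision is made at paste time by the receiver, not at copy time by the website that wrote the content.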

-- 
Reply to this email directly or view it on GitHub:
https://github.com/w3c/editing/issues/334#issuecomment-909543382

Received on Tuesday, 31 August 2021 19:31:17 UTC