Re: [w3c/clipboard-apis] Make async clipboard APIs (read/write) to sanitize interoperably with setData/getData for text/html (#150)

> Tagging @whsieh... we have a follow-up meeting 9/24/21 for the [Web Editing WG](https://github.com/w3c/editing). I wonder if you could help bring forward the list of the specific concerns and constraints mentioned above?

If I understand correctly, the issue is really that the process of creating a "sanitized copy" is fraught with complications. Three details immediately stand out to me, though there are likely more nuances:

1. As discussed in the TF meeting, when sanitizing markup, WebKit loads the markup in a separate (offscreen) page and browsing context, and then serializes the loaded result into markup. This page that we use for sanitization is special, in that we forbid any script execution, but still allow `script` tags to be parsed as if they were going to be executed. This discrepancy is necessary in order to ensure that the page cannot craft a payload that is deemed "safe" when loaded in a browser that disables script, but is unsafe when loaded in a browser that enables script. I'm not sure this behavior is something that can or should be specified.

2. For compatibility with older versions of Microsoft Office, we _may_ preserve attributes on the `html` element in a narrow case where it contains the text `xmlns:o="urn:schemas-microsoft-com:office:office"`. Would the specification allow for user agents to selectively preserve content like this?

3. The process of serializing "visible content" in the page we use for sanitization is also pretty difficult to (exactly) specify, since we rely on editing code in WebKit that determines which DOM positions are "visible" to the user (and, importantly, visually distinct from other such DOM positions) to figure out the range in the sanitized page that we should include in the final sanitized markup. For instance, if we're sanitizing `<div><div>Hello</div></div>`, we won't attempt to preserve the fact that there are nested `div` elements, since the first user-visible position is right before the `"H"` in the inner text node.
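
To make the third point a bit more concrete, here is a rough sketch over a toy tree. It is only an illustration, not WebKit's editing code: the `ToyNode` type and the `visuallyEquivalentStartPositions` helper are hypothetical names, and the "first child, offset 0" walk is a deliberately simplified stand-in for real visible-position logic. It lists the DOM positions that all collapse to the same user-visible position in `<div><div>Hello</div></div>`.

```typescript
// Toy DOM model (hypothetical; not WebKit's data structures).
type ToyNode =
  | { kind: "text"; data: string }
  | { kind: "element"; tag: string; children: ToyNode[] };

// Walks down through first children and records every "offset 0" position
// that sits before the first text node. In this simplified model, all of
// these positions look identical to the user.
function visuallyEquivalentStartPositions(root: ToyNode): string[] {
  const positions: string[] = [];
  let current: ToyNode | undefined = root;
  while (current !== undefined) {
    if (current.kind === "text") {
      positions.push(`text "${current.data}", offset 0`);
      break;
    }
    positions.push(`<${current.tag}>, offset 0`);
    current = current.children[0];
  }
  return positions;
}

// Toy equivalent of <div><div>Hello</div></div>
const nested: ToyNode = {
  kind: "element",
  tag: "div",
  children: [
    {
      kind: "element",
      tag: "div",
      children: [{ kind: "text", data: "Hello" }],
    },
  ],
};

console.log(visuallyEquivalentStartPositions(nested));
```

Since the start of each `div` and the start of the text node are visually the same position in this model, a range that begins at "the first visible position" carries no information about how many wrappers preceded it, which is why the nesting is hard to preserve (and hard to specify).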


Received on Saturday, 18 September 2021 01:47:42 UTC