Re: [w3c/clipboard-apis] Make async clipboard APIs (read/write) to sanitize interoperably with setData/getData for text/html (#150)

@annevk maybe I can clarify...

@snianu's goal is to process markup like this: `some text` into: `<html><head></head><body>some text</body></html>`, and also markup like this:

```
<!DOCTYPE html>
<html dir="ltr" lang="en">
<head>
<meta http-equiv=Content-Type content="text/html; charset=utf-8">
<meta name=ProgId content=Excel.Sheet>
<meta name=Generator content="Microsoft Excel 15">
<style>
table
 {mso-displayed-decimal-separator:"\.";
 mso-displayed-thousand-separator:"\,";}
tr
 {mso-height-source:auto;}
col
 {mso-width-source:auto;}
td
 {padding-top:1px;
 padding-right:1px;
 padding-left:1px;
 mso-ignore:padding;
 color:black;
 font-size:11.0pt;
 font-weight:400;
 font-style:normal;
 text-decoration:none;
 font-family:Calibri, sans-serif;
 mso-font-charset:0;
 text-align:general;
 vertical-align:bottom;
 border:none;
 white-space:nowrap;
 mso-rotate:0;}
.xl16
 {color:black;
 font-family:Calibri;
 mso-generic-font-family:auto;
 mso-font-charset:0;
 background:#E7E6E6;
 mso-pattern:black none;}
.xl17
 {color:black;
 font-family:Calibri;
 mso-generic-font-family:auto;
 mso-font-charset:0;
 background:#D9E2F3;
 mso-pattern:black none;}
.xl18
 {color:black;
 font-family:Calibri;
 mso-generic-font-family:auto;
 mso-font-charset:0;
 background:#E2EFD9;
 mso-pattern:black none;}
</style>
</head>

<body link="#0563C1" vlink="#954F72">

<table width=192 style='border-collapse:collapse;width:144pt'>
<!--StartFragment-->
 <col width=64 style='width:48pt' span=3>
 <tr height=20 style='height:15.0pt'>
  <td width=64 height=20 class=xl16 style='width:48pt;height:15.0pt'>One</td>
  <td width=64 class=xl17 style='width:48pt'>Two</td>
  <td width=64 class=xl18 style='width:48pt'>Three</td>
 </tr>
 <tr height=20 style='height:15.0pt'>
  <td height=20 align=right style='height:15.0pt'>1</td>
  <td align=right>2</td>
  <td align=right>3</td>
 </tr>
<!--EndFragment-->
</table>
</body>
</html>
```
into a DOM that represents all of the content essentially as written above, including the attributes on the `html` element.

If we use the [fragment parser](https://html.spec.whatwg.org/multipage/parsing.html#parsing-html-fragments), we need to provide a context element. Let me know if I'm mistaken, but I don't see anything in the spec about how we could initialize the fragment parser so that it starts the HTML parser in the [initial insertion mode](https://html.spec.whatwg.org/multipage/parsing.html#the-initial-insertion-mode). So the best we could do with the fragment parser, given the second markup example above, is to use an `html` element as the context and have the parser throw away the `DOCTYPE` and the original `html` node along with its attributes.
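
To make the difference concrete, here is a rough sketch of the two parsing paths as they are exposed to script today. It is only an illustration, not a proposal for how the clipboard implementation should be written: the helper names (`parseAsDocument`, `parseAsFragment`, `compareParsers`) are made up, and it assumes a browser-like environment where `DOMParser` and `document` exist. `DOMParser.parseFromString` with `text/html` runs the full document parser, while setting `innerHTML` on a detached `html` element runs the fragment parser with that element as the context.

```typescript
// Sketch only. Assumes a browser-like environment with DOMParser and document.
// All function and interface names here are hypothetical.

interface ParseComparison {
  documentKeepsDoctype: boolean;
  documentHtmlLang: string | null;
  fragmentHtmlLang: string | null;
  fragmentInnerHTML: string;
}

// Full document parse: the parser starts in the "initial" insertion mode,
// so a DOCTYPE and the attributes on <html> end up in the resulting Document.
function parseAsDocument(markup: string): Document {
  return new DOMParser().parseFromString(markup, "text/html");
}

// Fragment parse with an <html> context element: setting innerHTML runs the
// fragment parsing algorithm, and only the parsed children are moved into
// the context element.
function parseAsFragment(markup: string): HTMLElement {
  const context = document.createElement("html");
  context.innerHTML = markup;
  return context;
}

// Runs both paths on the same markup and collects what each one preserved.
function compareParsers(markup: string): ParseComparison {
  const doc = parseAsDocument(markup);
  const context = parseAsFragment(markup);
  return {
    documentKeepsDoctype: doc.doctype !== null,
    documentHtmlLang: doc.documentElement.getAttribute("lang"),
    fragmentHtmlLang: context.getAttribute("lang"),
    fragmentInnerHTML: context.innerHTML,
  };
}
```

If I'm reading the parsing rules correctly, calling `compareParsers` in a browser console on markup that starts with `<!DOCTYPE html><html dir="ltr" lang="en">` should report a doctype and a `lang` of `en` on the document side, and `null` for `lang` on the fragment side, because the `DOCTYPE` token is ignored and the attributes of the original `html` tag never reach the context element.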



https://github.com/w3c/clipboard-apis/issues/150#issuecomment-917212886

Received on Friday, 10 September 2021 21:09:39 UTC