Re: [clipboard events] Pasting scenarios and the shape of clipboardData.getData(‘text/html’) return value

On Apr 9, 2014, at 1:58 PM, Ryosuke Niwa <rniwa@apple.com> wrote:
> On Apr 7, 2014, at 3:37 PM, Ben Peters <Ben.Peters@microsoft.com> wrote:
> 
>>>>>> After working with developers inside and outside Microsoft, it seems there are several types of paste that make sense in various scenarios. For instance, 
>>>>>> 1- if pasting into a rich document, it could be important to maintain source styling information. 
>>>>>> 2- When pasting into a wiki from an external source, it might make more sense to maintain destination styling instead. 
>>>>>> 3- When copying from a wiki and pasting back into that same wiki, it makes sense to maintain any special formatting on that text (inline styles) but otherwise to use the theme (style sheets). 
>> 
>> There is one other scenario here, which is to maintain the html shape, but not any styles.
>> 4- When seeking to maintain lists and tables, but format them with destination styles, it makes sense to remove style elements and style rules, but keep other html ( <li> and <table> for instance ).
> 
> Right, that's an important use case to address.
> 
>>>>>> One possibility would be to do something similar to Firefox, but also 
>>>>>> include a text/css clipboard item, which contains styles relevant to 
>>>>>> what is copied
>> 
>>>>> How hard do you think this is to implement?
>> 
>>>> Thanks for the code sample and thoughts! I'll run it by a few more 
>>>> developers to get deeper insight and get back to you.
>> 
>>> Great! Note that the code samples are just to get us started thinking about the issues we'll have to tackle if we're going to do this - if some other behaviour (say, 
>>> creating new class names and making up a new style sheet with generated/computed styles) is easier to implement or seems to make more sense by all means 
>>> suggest that other behaviour instead.
>> 
>> In order to support the 4 scenarios I mentioned above, we need to be able to distinguish inline css from style sheets. Your idea here about creating a new style sheet seems like a good way to go since it helps solve the selectors problem where css doesn't work the same once you remove the context by copying a section out, and it keeps the inline styles separate from the style sheets. We could include these styles in the head of the document or in a new text/css item.
>> 
>> On copy, we would take something like Chrome's algorithm to get relevant css for each element. For top-level elements, this would mean several rules by default to 'reset' the style, and any other relevant styles. We would create a new class for each unique set of computed styles and give it a name that can be recognized and unique, maybe "copiedStyle_<randomid>" where <randomid> is a guid or similar. We would also remove any inline style elements like Chrome/Firefox already do. So on copy you would get something like this on the clipboard:
>> 
>> Version:0.9
>> StartHTML:0000000157
>> EndHTML:0000033333
>> StartFragment:0000011111
>> EndFragment:0000022222
>> SourceURL:http://en.wikipedia.org/wiki/Darth_vader
>> <html>
>> <head>
>> <style>
>> .copiedStyle_12345 {
>> 	color: black; background-image: none; font-weight: normal; margin: 0px 0px 0.25em; overflow: hidden; padding: 0px; border-bottom-width: 1px; border-bottom-style: solid; border-bottom-color: rgb(170, 170, 170); font-size: 1.8em; line-height: 1.3; font-family: 'Linux Libertine', Georgia, Times, serif; font-style: normal; font-variant: normal; letter-spacing: normal; orphans: auto; text-align: start; text-indent: 0px; text-transform: none; white-space: normal; widows: auto; word-spacing: 0px; -webkit-text-stroke-width: 0px; background-position: initial initial; background-repeat: initial initial;
>> }
>> </style>
>> </head>
>> <body>
>> <!--StartFragment--><h1 id="firstHeading" class="copiedStyle_12345 firstHeading" lang="en"><span dir="auto">Darth Vader</span></h1><!--EndFragment-->
>> </body>
>> </html>
> 
> A somewhat tricky issue here is that when this content is pasted into some page, that page may also have other CSS rules defined.  Depending on the selectors they use, they might have a higher precedence than the single class name we use in the copied content.  We could add !important to each property, but that could cause an issue if the pasted content is later edited, say, inside a contenteditable region.
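
To make the precedence concern concrete, here is a minimal Python sketch that compares selector specificity. It is deliberately simplified: it only counts #id, .class, [attr] and type selectors, and ignores pseudo-classes, pseudo-elements and source order. The destination selectors "#content h1" and "h1.firstHeading" are hypothetical rules a pasting page might define; they are assumptions for illustration only.

```python
import re


def specificity(selector):
    """Return (ids, classes, types) for a simple selector.

    Simplified on purpose: counts #id, .class, [attr] and type names only.
    Pseudo-classes and pseudo-elements are not handled in this sketch.
    """
    ids = len(re.findall(r"#[\w-]+", selector))
    classes = len(re.findall(r"\.[\w-]+|\[[^\]]*\]", selector))
    # A type selector starts the selector or follows a combinator/space.
    types = len(re.findall(r"(?:^|[\s>+~])([a-zA-Z][\w-]*)", selector))
    return (ids, classes, types)


copied = specificity(".copiedStyle_12345")

# Hypothetical rules already present in the destination page.
for destination in ("#content h1", "h1.firstHeading"):
    wins = specificity(destination) > copied
    print(destination, specificity(destination), "overrides copied class:", wins)
```

Tuples compare component by component, which matches how specificity is ordered, so both hypothetical destination rules would override any property they share with the generated class.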

Also, on Mac there is no <!--StartFragment--> or <!--EndFragment-->, and the serialized markup copied into the clipboard (called the pasteboard on Mac) needs to contain precisely the markup that was copied by the user.
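
For contrast, here is a rough Python sketch of how the Windows-style header quoted above relates to the fragment markers. The function names are hypothetical, and I am assuming CRLF line endings, UTF-8 encoding, and offsets counted in bytes from the start of the data; the offsets in the quoted example are placeholders, whereas this sketch computes real ones.

```python
START_MARK = "<!--StartFragment-->"
END_MARK = "<!--EndFragment-->"

# Offsets are zero-padded to 10 digits, so the header length is fixed
# and can be measured before the real offsets are known.
HEADER = (
    "Version:0.9\r\n"
    "StartHTML:{:010d}\r\n"
    "EndHTML:{:010d}\r\n"
    "StartFragment:{:010d}\r\n"
    "EndFragment:{:010d}\r\n"
    "SourceURL:{}\r\n"
)


def build_cf_html(prefix, fragment, suffix, source_url):
    """Assemble header + context + fragment, with byte offsets."""
    header_len = len(HEADER.format(0, 0, 0, 0, source_url).encode("utf-8"))
    prefix_b = (prefix + START_MARK).encode("utf-8")
    fragment_b = fragment.encode("utf-8")
    suffix_b = (END_MARK + suffix).encode("utf-8")

    start_html = header_len
    start_fragment = start_html + len(prefix_b)
    end_fragment = start_fragment + len(fragment_b)
    end_html = end_fragment + len(suffix_b)

    header = HEADER.format(
        start_html, end_html, start_fragment, end_fragment, source_url
    )
    return header.encode("utf-8") + prefix_b + fragment_b + suffix_b


def extract_fragment(data):
    """Read the header fields and slice out the fragment bytes."""
    fields = {}
    for line in data.split(b"\r\n"):
        if line.startswith(b"<"):  # header ended, markup begins
            break
        key, _, value = line.partition(b":")
        fields[key.decode("utf-8")] = value.decode("utf-8")
    start = int(fields["StartFragment"])
    end = int(fields["EndFragment"])
    return data[start:end].decode("utf-8")


data = build_cf_html(
    "<html><body>",
    "<h1>Darth Vader</h1>",
    "</body></html>",
    "http://en.wikipedia.org/wiki/Darth_vader",
)
print(extract_fragment(data))
```

On a platform without the markers there is no surrounding context to slice away, which is why the markup on the pasteboard itself has to be exactly what was copied.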

This has a few implications. One is that we need to serialize some semantic elements such as "a" and "h1" when only part of the content inside such an element is selected, because we don't want to simply copy the content as blue, underlined text for "a", for example.  Users expect the pasted content to be a functional hyperlink if it looks like an anchor.
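
As an illustration only, the idea can be sketched in Python as follows. The set of tags treated as semantic and the function name are assumptions of this sketch, not a proposal for the actual list, and the ancestor chain is passed in by hand rather than read from a real DOM.

```python
from html import escape

# Assumed set for this sketch; the real list would need to be spec'ed.
SEMANTIC_TAGS = {"a", "h1", "h2", "h3", "h4", "h5", "h6"}


def wrap_with_semantic_ancestors(selected_markup, ancestors):
    """Re-wrap a partial selection in its semantic ancestors.

    `ancestors` lists (tag, attributes) pairs, innermost first.
    Non-semantic ancestors (e.g. layout divs) are skipped.
    """
    markup = selected_markup
    for tag, attrs in ancestors:
        if tag not in SEMANTIC_TAGS:
            continue
        attr_text = "".join(
            ' {}="{}"'.format(name, escape(value, quote=True))
            for name, value in attrs.items()
        )
        markup = "<{0}{1}>{2}</{0}>".format(tag, attr_text, markup)
    return markup


# The user selected only "Vader" inside a link inside a heading.
print(
    wrap_with_semantic_ancestors(
        "Vader",
        [("a", {"href": "/wiki/Darth_Vader"}), ("div", {"class": "box"}), ("h1", {})],
    )
)
```

The pasted result stays a working hyperlink inside a heading instead of styled text that merely looks like one.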

Even elements such as "b" may need to be treated specially because, inside a contenteditable region where styleWithCSS is false, we don't want copying and pasting content already in the contenteditable to introduce inline styles or a new style element.

There are other problems with more exotic features of HTML and CSS.  One problem we recently found is that when the copied content contains position: fixed or position: sticky, we need to convert them to position: absolute and wrap the whole copied content in a position: relative box in order to prevent the pasted content from populating the paste destination.
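
A minimal Python sketch of that conversion, working on plain dictionaries of style declarations rather than a real DOM. The function names are hypothetical, and adding the wrapper only when something was actually converted is an assumption of this sketch.

```python
def rewrite_position(style):
    """Return a copy of `style` with fixed/sticky turned into absolute."""
    rewritten = dict(style)
    if rewritten.get("position") in ("fixed", "sticky"):
        rewritten["position"] = "absolute"
    return rewritten


def serialize_style(style):
    """Serialize a declaration dict as an inline style value."""
    return "; ".join("{}: {}".format(k, v) for k, v in style.items())


def wrap_copied_content(elements):
    """Serialize (tag, style, inner markup) triples for the clipboard.

    If any element had its position rewritten, the whole content is
    wrapped in a position: relative box so the absolutely positioned
    pieces stay anchored to the pasted content.
    """
    parts = []
    converted = False
    for tag, style, inner in elements:
        new_style = rewrite_position(style)
        converted = converted or new_style != style
        parts.append(
            '<{0} style="{1}">{2}</{0}>'.format(tag, serialize_style(new_style), inner)
        )
    markup = "".join(parts)
    if converted:
        markup = '<div style="position: relative">' + markup + "</div>"
    return markup


print(wrap_copied_content([("div", {"position": "fixed", "top": "0px"}, "menu")]))
```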

In general, it is my opinion that the copy algorithm should be spec'ed at the same time as the paste algorithm in the HTML Editing API, and both are extremely challenging tasks.

- R. Niwa

Received on Thursday, 10 April 2014 16:36:17 UTC