Re: Copy to Clipboard - ambush and abuse by javascript

From: Noah Mendelsohn <nrm@arcanedomain.com>
Date: Sat, 05 Jun 2010 14:34:55 -0400
Message-ID: <4C0A98CF.6030600@arcanedomain.com>
To: Dan Brickley <danbri@danbri.org>
CC: Tim Berners-Lee <timbl@w3.org>, "www-tag@w3.org" <www-tag@w3.org>

Dan Brickley asks:

 > Are browsers currently exposing the facility to put different flavours
 > into different clipboards?

Yes. Just to pick one example: copying some formatted text in Firefox 
and then trying Paste Special in MS Word on Windows offers:

* Unformatted text
* HTML Format
* Unformatted Unicode text

In that order.  There may be others on the clipboard that my version of 
Word is choosing not to offer.

For the same text copied from Internet Explorer 8, Paste Special in Word offers:

* Formatted text (RTF)
* Unformatted text
* HTML Format
* Unformatted Unicode text

So, it looks like IE is offering to do conversion to Word format on the 
browser side too.   When copying images to the clipboard, Firefox seems 
to offer just

* Device independent bitmap

while IE offers:

* Device independent bitmap
* HTML Format

So, yes, it's common and IMO it's very useful.  I rely on it a lot.
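
To make the idea of several flavours for one selection concrete, here is a 
minimal Python sketch. It is only an illustration, not how any browser 
implements copy: the names ParagraphText, clipboard_flavours and paste are 
hypothetical, the flavour labels "text/html" and "text/plain" are assumed 
for the example, and the input is the two-paragraph fragment from the 
quoted message below.

```python
from html.parser import HTMLParser


class ParagraphText(HTMLParser):
    """Collect the text of each <p> element as one plain-text paragraph."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []   # finished paragraphs, in document order
        self._current = None   # text chunks of the open <p>, or None

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._current = []

    def handle_data(self, data):
        if self._current is not None:
            self._current.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._current is not None:
            self.paragraphs.append("".join(self._current).strip())
            self._current = None


def clipboard_flavours(html_fragment):
    """Return a dict of flavour name -> content for one copied selection."""
    parser = ParagraphText()
    parser.feed(html_fragment)
    parser.close()
    return {
        "text/html": html_fragment,                       # markup kept as-is
        "text/plain": "\n\n".join(parser.paragraphs),     # same text, no markup
    }


def paste(flavours, preferred):
    """Pick the first flavour the paste target accepts, in its own order."""
    for name in preferred:
        if name in flavours:
            return flavours[name]
    return None


flavours = clipboard_flavours("<p>Text of para1</p>\n<p>Text of para2</p>")
print(paste(flavours, ["text/rtf", "text/plain"]))
```

The copy side only re-expresses what was selected; the choice of flavour 
stays with the paste target, which is roughly what Paste Special exposes.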

Noah

Dan Brickley wrote:
> On Wed, Jun 2, 2010 at 9:45 PM, Noah Mendelsohn <nrm@arcanedomain.com> wrote:
>> Tim Berners-Lee wrote:
>>
>>> This I think seriously violates the function
>>> of Copy, and the user's rights.
>> Yes, I agree completely.  It's obnoxious, unhelpful, and contrary to the
>> spirit of the platform specifications for copy/paste.
> 
> [ yup ]
> 
>>> Should browsers ensure that Copy is always a
>>> read-only operation, unless they have INSTALLED code to do something
>>> different?
>> I agree with the spirit of what you're asking for, but I'm not sure the
>> words "read-only" capture the essence of what's needed.  Copy is, of course,
>> an operation that identifies data for transfer, and the corresponding paste
>> is necessarily an update operation on the target document or system.
>>
>> My deeper concern is that in fact certain sorts of data manipulation are
>> expected and useful, particularly when doing format conversions as part of
>> copy/paste.  So, for example, if I am reading an HTML document and I select
>> multiple paragraphs of text, it might well be appropriate for a copy
>> operation to put at least two versions on the clipboard:
> 
> Are browsers currently exposing the facility to put different flavours
> into different clipboards? I have lost touch with OS clipboard APIs,
> but if this is possible, it opens up some nice issues and maybe
> opportunities.
> 
>> HTML Clipboard format:
>> <p>Text of para1</p>
>> <p>Text of para2</p>
>>
>> Text Clipboard format:
>> Text of Para 1\n
>> \n\n
>> Text of Para 2
>>
>> I think it's important that whatever rules we set for browsers not prohibit
>> such helpful re-expression of the same information using different formats.
> 
> Two big 'customers' for a good spec here are namespaces and RDFa.
> 
> Consider the copy/paste scenario of someone highlighting a paragraph
> like "<p>In the 1980s I attended <a rel="foaf:schoolHomepage"
> href="http://schooloscope.com/establishments/26072">Westergate
> School</a>.</p>"
> 
> 1. From a namespaces perspective, copying exactly that markup loses
> the attachment to the declaration of the 'foaf' namespace. So a
> rich/intact/full/smart version of the data on the clipboard might be
> normalised somewhat, to pull in namespacing info.
> 2. From a webarch perspective, you need to know that rel="..." in this
> flavour of HTML/XHTML is one that has qnames in it, otherwise you
> won't notice that the namespace is being used. Or whatever
> namespaces-like mechanism the new RDFa WG proposes for their RDFa in
> HTML 1.1 syntax.
> 3. From an RDF/RDFa perspective, this fragment doesn't contain enough
> information to be very useful. It says that there is *something*
> that has a certain schoolHomepage. As part of copy/paste behaviour,
> perhaps there is a need to go up the DOM to find the nearest enclosing
> chunk of RDF and determine the URI (if declared), type (perhaps?) or
> (complex but potentially useful) other identifying properties of the
> thing this is a property of.
> 4. From a privacy and etiquette perspective, anything that involves
> copying more info than the user has selected could be a gateway to
> abuse, or could leak more data than is intended (eg. copying from an
> intranet into a mail).
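
Points 1 and 2 above can be sketched in Python with the standard 
xml.dom.minidom module. This is an illustration under stated assumptions, 
not a proposal for browser behaviour: the helper names prefixes_used, 
find_declaration and smart_copy are hypothetical, the enclosing document 
is invented for the example, and only rel attributes are inspected for 
prefixes. The FOAF namespace URI is the published one.

```python
from xml.dom import minidom

# Hypothetical enclosing document: the declaration lives on <html>,
# outside the paragraph the user selects.
DOC = """<html xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <body>
    <p>In the 1980s I attended <a rel="foaf:schoolHomepage"
      href="http://schooloscope.com/establishments/26072">Westergate
      School</a>.</p>
  </body>
</html>"""


def prefixes_used(element):
    """Prefixes appearing in rel="prefix:name" values in element's subtree."""
    found = set()
    nodes = [element] + list(element.getElementsByTagName("*"))
    for node in nodes:
        for token in node.getAttribute("rel").split():
            if ":" in token:
                found.add(token.split(":", 1)[0])
    return found


def find_declaration(element, prefix):
    """Walk up the ancestors to the nearest xmlns:prefix declaration."""
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        if node.hasAttribute("xmlns:" + prefix):
            return node.getAttribute("xmlns:" + prefix)
        node = node.parentNode
    return None


def smart_copy(element):
    """Clone the selection and attach the declarations it depends on."""
    clone = element.cloneNode(True)
    for prefix in sorted(prefixes_used(element)):
        uri = find_declaration(element, prefix)
        if uri is not None:
            clone.setAttribute("xmlns:" + prefix, uri)
    return clone


document = minidom.parseString(DOC)
selected = document.getElementsByTagName("p")[0]
print(smart_copy(selected).toxml())
```

The clone carries the xmlns:foaf declaration while the original document 
is left untouched. Note that this already copies something the user did 
not select, which is exactly the concern raised in point 4.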
> 
> 
>>  We need to find a formulation that encourages such useful reformatting, but
>> prohibits the sort of inappropriate updates that are described in the Daring
>> Fireball posting. In any case, it doesn't seem to me that the term
>> "read-only" quite captures what we want.  Thank you.
> 
> I can imagine lots of scenarios where when copying out of data you
> might want to preserve structural integrity and copy out enough
> surrounding context (namespace declarations, base URIs etc) that the
> copied chunk keeps more of its original meaning. I can imagine a few
> scenarios in which doing so could be harmful. For example, a user
> copying a table cell from a confidential document might not want to
> reveal the context (perhaps a corporate takeover plan), yet that
> context might be implied by URIs used higher in the DOM. But I can't
> see a way of distinguishing between the two cases without asking the
> user, and I really don't like the idea of popups asking questions on
> this topic. Perhaps an expert 'smart copy' option is the best that
> could be done here?
> 
> cheers,
> 
> Dan
> 
Received on Saturday, 5 June 2010 18:35:55 UTC