- From: Robert Eisele <robert@xarg.org>
- Date: Mon, 16 Jul 2012 01:02:40 +0200
- To: "Tab Atkins Jr." <jackalmage@gmail.com>
- Cc: whatwg@whatwg.org
2012/7/16 Tab Atkins Jr. <jackalmage@gmail.com>

> On Sun, Jul 15, 2012 at 3:22 PM, Robert Eisele <robert@xarg.org> wrote:
> > Browsers are very restrictive when one tries to access the contents of
> > different domains (including the scheme), embedded via framesets. This is
> > normally a good practice, but I'd suggest to weaken this restriction for
> > the data: URI schema.
> >
> > I'm currently building an analysis system like Google Analytics, which gets
> > embedded into a website via a small JavaScript snippet. When I analyzed the
> > data, I came across a very interesting trick because I got a lot of
> > requests (with the data from location.href) where the entire website was
> > embedded into a data:text/html URI - except that all ads of the page were
> > replaced. Fortunately, my tracking code has been left without
> > modifications.
> >
> > But the scary thing is that this way you can monetize foreign content by
> > simply embedding it somewhere you can direct traffic to. That's pretty
> > clever, because the original site owner doesn't notice this abuse due to
> > the fact that top.location.href isn't readable. Or even worse, he would
> > never notice it at all when he doesn't sniff the URI with JavaScript,
> > because image files would have no referrer.
> >
> > My final approach to convict the abuser is based on the fact, that the
> > JavaScript was dynamically loaded from my server and that I can write to
> > location.href. So I added this piece of code:
> >
> > if (top.location.protocol === 'data:') {
> >     top.location.href = 'http://example.com/trap/';
> > }
> >
> > But even then the referrer will not be passed to the server. So my proposal
> > is that the data URI schema gets an exception on this security behavior.
>
> The problem you outline is not directly tied to the solution you
> present. You can scrape a site and display it as your own without any
> fancy tricks, just by downloading all the resources and hosting them
> yourself. This merely consumes a little more bandwidth for the
> attacker, since they're hosting the images/etc themselves.
>

But in that case you would get a valid referrer, as long as the tracking
code wasn't removed. The data: scheme protects the abuser in an unnecessary
way. But you're absolutely right that the solution I present isn't entirely
tied to the problem.

> The correct solution to this kind of problem is legal - this is simple
> copyright violation.
>

But if you have no way to get information about the attacker, you can't sue
him. I had the strange idea of using a prompt to ask the user for the URL
currently shown in his address bar. But as I said, that's strange.

>
> I'm not sure about the merits of your suggestion otherwise. It's
> reasonable to make data: pages same-origin with their parent when
> they're contained within something, but it seems dodgy to make them
> same-origin with their *contained* pages as well. If not done
> carefully, that could allow contained pages access to the data: page's
> parent as well, or other cross-origin pages that the data: page is
> containing.
>

That's a very intuitive thought. One could assume that data: pages are
same-origin, or better, that embedded data: pages are part of the current
page. That way, a contained page would have no chance to escape the sandbox
and access the parent. In what situation would treating them as same-origin
be dangerous?

>
> ~TJ
>
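
For reference, here is a slightly more defensive sketch of the check quoted above. It is only an illustration under assumptions: the helper name trapTargetFor is hypothetical, the trap URL is the placeholder from my original snippet, and whether reading top.location.protocol succeeds or throws for a data: parent may differ between browsers, so the read is wrapped in try/catch.

```javascript
// Hypothetical helper (not from the original snippet): returns the URL the
// top-level page should be sent to, or null if no redirect is warranted.
// `topLocation` is any object shaped like window.top.location.
function trapTargetFor(topLocation, trapUrl) {
  let protocol;
  try {
    // Reading properties of a cross-origin location may throw.
    protocol = topLocation.protocol;
  } catch (e) {
    // The protocol could not be read, so nothing is known about the parent.
    return null;
  }
  return protocol === 'data:' ? trapUrl : null;
}

// Browser-only wiring; skipped when no window object exists.
if (typeof window !== 'undefined' && window.top) {
  const target = trapTargetFor(window.top.location, 'http://example.com/trap/');
  if (target !== null) {
    // Writing location.href is what the original snippet relies on.
    window.top.location.href = target;
  }
}
```

Keeping the decision in a small function separate from the browser wiring means the logic can be exercised with plain objects, without needing a real frame hierarchy.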
Received on Sunday, 15 July 2012 23:03:09 UTC