- From: Nik Matsievsky <speed@webo.name>
- Date: Sun, 05 Sep 2010 19:09:40 +0400
- To: Sigbjørn Vik <sigbjorn@opera.com>
- CC: Bryan McQuade <bmcquade@google.com>, Anderson Quach <aquach@microsoft.com>, Jason Sobel <jsobel@facebook.com>, Zhiheng Wang <zhihengw@google.com>, "public-web-perf@w3.org" <public-web-perf@w3.org>
about security issues for performance timing measurement.

I don't think the following is a very good idea, but it could be a
compromise between security and performance.

We could cache all timing information for a given resource the first
time the user agent loads it (just as browsers already cache useful
HTTP headers: Cache-Control, ETag, Content-Type, maybe more). Then any
web app that asks for a resource's timing info gets this first-fetch
data, and cannot tell whether the 3rd-party asset is currently cached.

This prevents detecting whether the resource is currently cached
(browsers could apply the technique to 3rd-party resources only), yet
still lets browsers provide all timings for the asset without sharing
the user's private data.

We could also check headers or a file on the external (3rd-party)
domain, similar to what Flash apps do.

And/or we could make this optional: only if the user chooses to let
web apps collect his/her browsing history this way could an app
collect all timings for all resources.

> Hi
> (new on this list, so forgive me if I lack some background)
>
> For privacy reasons, I don't think allowing websites to read the
> history of third party websites is a good idea. True, existing onload
> timing gives a rough indication of something being cached, which we
> can live with, but let us avoid making it worse. Even if
> embarrassingDomain.tld allows all other websites to read information
> about it, that doesn't imply that I as the user want this,
> particularly not to bigBrother.tld. If I visit embarrassingDomain.tld,
> I'd do it in private mode and/or delete the cache afterwards, maybe
> even do it in a different browser and/or restart the browser. Note
> that DNS information survives all of this (as it is cached in the OS
> and gateway), and a third party website can still check if I've been
> to embarrassingDomain.tld through DNS timing.
> Note that embarrassingDomain.tld can explicitly opt in to communicate
> with third parties today using other technologies, but if I as a user
> have deleted my cookies and closed my tabs, embarrassingDomain.tld
> doesn't have any information to share, so even in those cases the
> user is in control.
>
> I can just imagine facebook's app, a list of websites users have
> recently visited, and a list of which websites are most popular among
> your friends... (Not trivial to implement as reading a value also sets
> and thus destroys it, but doable.)
>
> Some other random thoughts: such a facebook app could already
> be made using CSS visited styling, at least for still cached history.
> Accurate third party timing information also opens the way for
> cross-document messaging. A list of subdomains can each hold one bit
> (visited/not visited), and both parties can read and write to these
> bits. Not directly a security problem as this is strictly opt-in from
> both sites, but probably an unwanted side effect.
>
> As a user, my browsing habits are personal, and I don't want to share
> those, even if they might be valuable to websites, and even if
> websites want them to be shared with other websites. (Just like
> cookies, even though websites want supercookies which can be shared
> among sites, this is disallowed.) Users might want to explicitly opt
> in to sharing such information, so another possibility is that
> browsers add a preference toggle (default same-domain only) for timing
> information. We are unlikely to see 0 performance impact of Resource
> Timing, so a user option might be the best way in any case. If we
> don't add this in the spec, privacy conscious browsers/add-ons are
> likely to add a toggle in any case, and we'll see websites break due
> to unexpected JS behaviour.
>
> Another option is that third party DNS lookup timing information is
> off-limits in all cases, but other third party timing information is
> available, but then the spec needs to ensure this isn't leaked by
> allowing a website to read all other third party timing information
> and deduce that the missing time is identical to the DNS lookup time.
> This still reduces privacy though, it is possible to see which
> resources from a third party site the user has loaded, and getting
> such information changes from an art to an exact science, thus making
> it easier.
>
>
> On Fri, 03 Sep 2010 04:18:55 +0200, Anderson Quach
> <aquach@microsoft.com> wrote:
>
>> Hi Bryan,
>>
>> Thanks for your thoughtful reply. I agree that much of the Resource
>> Timing information such as the time taken to retrieve and load a
>> resource can be easily determined and thus figuring out whether or
>> not the resource was cached can be easily discovered with script today.
>>
>> However, it has been brought to our attention that we should not
>> make this a capability of new platform interfaces such as
>> Resource Timing. We are actively brainstorming solutions to mitigate
>> this privacy attack. In fact, I'd love to hear some of your thoughts
>> on approaches to mitigate this issue.
>>
>> We also want the interface to be easy to use. One of our aspirations
>> is to be able to arrive at a solution where we can have Resource
>> Timing on by default for all downloaded resources, however, not at
>> the cost of impeding performance of the user-agent. This is an area
>> where we will need technical investigations and prototypes.
>>
>> The goals for Resource Timing we have in mind are:
>> * ease of storing and accessing the resource timings
>> * negligibly impacting the user-agent's performance
>> * efficient lifetime management of the resource timing objects
>> * end-user security and privacy conscious
>>
>> Best Regards,
>> Anderson Quach
>> IE Program Manager
>>
>> -----Original Message-----
>> From: Bryan McQuade [mailto:bmcquade@google.com]
>> Sent: Tuesday, August 31, 2010 8:56 AM
>> To: Anderson Quach
>> Cc: Jason Sobel; Zhiheng Wang; public-web-perf@w3.org
>> Subject: Re: Resource Timing
>>
>> Hi Anderson and Zhiheng,
>>
>> I wanted to follow up on this and share more of my thoughts now that
>> I've had more time to think about it.
>>
>> I do not know the background on the security decisions for resource
>> timing. Zhiheng said: "In the example here, you can look into the DNS
>> time and TCP time of the resource fetched from otherdomain.com and
>> figure if the user has recently (or even currently) visit
>> otherdomain.com."
>>
>> This kind of information is already leaked by the browser and it is
>> relatively easy to ascertain whether a user has visited a site
>> recently due to the shared nature of the browser cache. You are right
>> that for resource timing it will be very clear (dns/tcp times of
>> zero) but it's easy enough to embed the URL of a resource known to be
>> on some other page in your own page, and then time the onload for
>> that resource. If it's short (under 10-15ms) it's likely from cache,
>> indicating that the user has visited the site. If you look at
>> resource expirations from a site you can even infer how recently the
>> user visited that site. So the shared browser cache leaks more
>> information today than the resource timing information would.
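
(Inline note on the paragraph above: the onload-timing check Bryan
describes boils down to a threshold test. The TypeScript below is only a
rough sketch of it. The names LoadSample, likelyFromCache and
probablyVisited are made up for illustration, the 15 ms cutoff is just
the upper end of the "under 10-15ms" range he quotes, and in a real page
the timestamps would come from inserting an element and handling its
onload event.)

```typescript
// Sketch only: all names here are hypothetical, and the 15 ms cutoff is
// taken from the "under 10-15ms" range quoted above, not from measurement.
interface LoadSample {
  url: string;     // resource known to live on the other site
  startMs: number; // timestamp when the element was inserted
  endMs: number;   // timestamp when its onload handler fired
}

const CACHE_THRESHOLD_MS = 15;

// A load that completes faster than the threshold probably came from cache.
function likelyFromCache(
  sample: LoadSample,
  thresholdMs: number = CACHE_THRESHOLD_MS
): boolean {
  return sample.endMs - sample.startMs < thresholdMs;
}

// Return the URLs whose load times suggest they were already cached,
// i.e. the user has probably visited the site that serves them.
function probablyVisited(
  samples: LoadSample[],
  thresholdMs: number = CACHE_THRESHOLD_MS
): string[] {
  return samples
    .filter((s) => likelyFromCache(s, thresholdMs))
    .map((s) => s.url);
}
```
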
>>
>> I raise this issue because I do not expect we will see widespread
>> adoption of this header-based opt-in approach, and the real value of
>> resource timing hinges on web site operators being able to see how
>> much latency is added by third party content. You will see adoption
>> of the opt-in headers for the big players like google, facebook, and
>> ms, where they have the size to force their content providers to
>> enable these headers (or in some cases where they run all the
>> services and can just enable these headers themselves). But small
>> hosters will not be able to force third party providers to enable
>> these headers, and they will be locked out of this valuable data by
>> default.
>>
>> Further, adding new headers increases the weight of each response,
>> which works against goals of making the web faster.
>>
>> I hope you will consider providing this data by default. Resource
>> timing is going to be very useful and will empower web site owners to
>> understand what's slowing their sites down, and allow them to put
>> pressure on the slow third party content providers. But this can
>> happen only if the site providers have access to the data they need,
>> which is why I am advocating to make it available by default.
>>
>> -Bryan
>

--
Thank you,
Nik Matsievsky,
WEBO Software, www.webogroup.com
+7 926 7281964 / skype:nikolay.matsievsky
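
P.S. To make the first-fetch caching idea at the top of this message a
bit more concrete, here is a minimal TypeScript sketch. It is not a
proposal for the actual interface: ResourceTiming, FirstFetchTimingStore
and the isThirdParty flag are names invented for illustration, and a
real user agent would decide what counts as third-party by itself.

```typescript
// Hypothetical shapes: nothing in this thread defines these names.
interface ResourceTiming {
  dnsMs: number;
  connectMs: number;
  responseMs: number;
}

class FirstFetchTimingStore {
  // url -> timings recorded the first time the user agent loaded it
  private firstSeen: { [url: string]: ResourceTiming } = {};

  // Remember timings only for the first load of each URL.
  record(url: string, timing: ResourceTiming): void {
    if (!Object.prototype.hasOwnProperty.call(this.firstSeen, url)) {
      this.firstSeen[url] = timing;
    }
  }

  // What a page would see: same-site resources report the current load,
  // 3rd-party resources always report the first-fetch snapshot, so zero
  // DNS/TCP times from a cache hit are never exposed.
  timingFor(
    url: string,
    current: ResourceTiming,
    isThirdParty: boolean
  ): ResourceTiming {
    this.record(url, current);
    return isThirdParty ? this.firstSeen[url] : current;
  }
}
```

With this, a second (cached) load of a 3rd-party asset reports the same
numbers as the first one. It only covers what the timing interface
reports; it does nothing about the onload-based timing discussed in the
quoted messages.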
Received on Monday, 6 September 2010 07:21:58 UTC