Re: [csswg-drafts] [selectors] Solve :visited once and for all (#3012)

The CSS Working Group just discussed `Solve visited once and for all`.

<details><summary>The full IRC log of that discussion</summary>
&lt;emilio> Topic: Solve visited once and for all<br>
&lt;emilio> github: https://github.com/w3c/csswg-drafts/issues/3012<br>
&lt;fantasai> ScribeNick: fantasai<br>
&lt;fantasai> ScribeNick: emilio<br>
&lt;emilio> TabAtkins: So we discussed this about 6 months ago with no conclusion<br>
&lt;emilio> TabAtkins: visited is bad and leaks no matter what we do<br>
&lt;emilio> TabAtkins: we should just fix this, only limiting visited to stuff JS can already observe, and making it a regular pseudo-class<br>
&lt;fantasai> s/what we do/what we do, there's always some way to invoke timing channel attacks/<br>
&lt;emilio> TabAtkins: &lt;reads over the issue><br>
&lt;emilio> TabAtkins: Solving the three points on the issue solves the use cases that I think people care about, and doesn't expose more privacy information for most people<br>
&lt;fantasai>     At minimum, same-origin visitedness is always visible to the page, as the server can track its own cross-links, assuming standard tracking mechanisms exists (cookies, sufficiently high-entropy user identification, etc). So all same-origin links should report :visited.<br>
&lt;fantasai>     Cross-origin inbound links are always visible to the page if the Referer header was sent in the request.<br>
&lt;fantasai>     Cross-origin outbound links are always visible to the page if the user visited that link from this origin, as there are a multitude of ways to track outbound links (JS auditing, &lt;a ping>, link shorteners, etc).<br>
&lt;fantasai>     Any others?<br>
&lt;emilio> TabAtkins: these three should be safe to expose to :visited<br>
&lt;fantasai> &lt;/paste><br>
&lt;emilio> Thanks fantasai :-)<br>
&lt;emilio> TabAtkins: Last time Mozilla had some opinions on this<br>
&lt;fantasai> emilio: I think the general position is that we should try this, but there were some concerns from other Mozilla ppl like Mats, that not keeping the existing restriction would also not be GDPR compliant<br>
&lt;fantasai> dbaron: I think I can try to represent Mats's position<br>
&lt;fantasai> dbaron: Basic idea is that in collecting the data about what sites ppl have visited, browsers are collecting a substantial pool of privacy-sensitive data.<br>
&lt;fantasai> dbaron: They have an obligation to protect that data as much as they can.<br>
&lt;fantasai> dbaron: In many cases, the sites themselves have not gathered that data.<br>
&lt;fantasai> dbaron: Given that we have a mechanism for protecting that data right now<br>
&lt;fantasai> dbaron: we don't want to expose that pool of data to sites right now, even if they could collect it because we haven't.<br>
&lt;fantasai> TabAtkins: But how much is because we know we can extract this info right now?<br>
&lt;fantasai> TabAtkins: anything new you could get from this, you could get today via timing attacks.<br>
&lt;fantasai> TabAtkins: Defeating timing channel attacks here means running everything slower<br>
&lt;fantasai> TabAtkins: Doing the rendering work for visited all the time even if it's not being used on the page, etc.<br>
&lt;hober> q+<br>
&lt;fantasai> TabAtkins: Remember the attack is running 10,000 stacked DOM elements with a filter on them if it's visited<br>
&lt;fantasai> fantasai: visited is on filter or?<br>
&lt;hober> q-<br>
&lt;fantasai> TabAtkins: visited below, 10000 visitors above it<br>
&lt;emilio> hober: I'm a little concerned about the usage of sites whose primary purpose is showing a bunch of links<br>
&lt;emilio> hober: it's pretty common to visually filter out the things that are visited<br>
&lt;emilio> hober: so it'd decrease the usefulness of sites we know are very popular<br>
&lt;emilio> TabAtkins: this is the only use case we kill<br>
&lt;emilio> TabAtkins: and I don't see a way to keep it<br>
&lt;emilio> AmeliaBR: my proposal is adding a safelist for history access, the same way browsers expose a setting for third-party cookies<br>
&lt;emilio> AmeliaBR: I don't think that possibility would need to be defined on the spec though<br>
&lt;emilio> TabAtkins: I'm concerned with trying to ask the user to usefully decide about whether Reddit should have access to all their browser history<br>
&lt;emilio> AmeliaBR: otherwise we get back to the same complications<br>
&lt;emilio> heycam: somebody suggested exposing the visited state in some way outside of the page<br>
&lt;emilio> heycam: like a little hover status-bar or such<br>
&lt;emilio> heycam: so there may be some way to expose this in the UI if there was an important necessity of keeping this use-case<br>
&lt;hober> s/a bunch of links/a bunch of links, e.g. reddit or hacker news/<br>
&lt;emilio> fantasai: I don't think I'd want to carefully hover over all the links when I'm searching for something<br>
&lt;emilio> TabAtkins: It'd handle search, since most of the links are found via search anyway, but it'd break link dumps and such<br>
&lt;emilio> florian: even via search, you might want to find something you've visited and you look at the purple link<br>
&lt;emilio> TabAtkins: yeah, but I don't think we can plug this privacy hole<br>
&lt;emilio> fantasai: if you turn off Javascript, it can apply in more cases<br>
&lt;emilio> fremy: You may execute a timing attack measuring loading time? Though network may be not reliable enough generally<br>
&lt;emilio> heycam: so this issue seems to have two parts, changing how :visited matches, and changing the restriction of the properties that apply to it<br>
&lt;emilio> TabAtkins: there's no point in keeping the restrictions if we limit what's exposed<br>
&lt;emilio> florian: except the other argument about sites not having collected the data yet<br>
&lt;emilio> heycam: so last time we (Mozilla) discussed this internally, we said that we'd be happy to experiment with some restriction like that, but not with unrestricting the property<br>
&lt;emilio> TabAtkins: I don't see the point<br>
&lt;emilio> hober: compat with existing content, maybe<br>
&lt;emilio> AmeliaBR: do we have some general policy to deal with this "zombie CSS case"?<br>
&lt;emilio> TabAtkins: trying it<br>
&lt;emilio> fremy: I remember some weirdness with javascript links<br>
&lt;emilio> fremy: I think there's a fourth case which is a `javascript:` link. I think currently the link becomes visited only until you refresh the page<br>
&lt;emilio> emilio: so same as links and `#hash` links<br>
&lt;emilio> TabAtkins: so dbaron mentioned it was feasible to mitigate side-channel attacks, how feasible do you think it is?<br>
&lt;emilio> dbaron: I think we could reduce the bandwidth of some of them, but never get rid of them entirely.<br>
&lt;emilio> dbaron: the amount of effort we could spend on this depends on how it competes with extracting the same data via other attacks like cache timing attacks<br>
&lt;emilio> TabAtkins: I'll try to push internally to do some experimentation in this regard<br>
&lt;emilio> TabAtkins: I know that Alex Russell is the original author of this idea and he'd be really happy<br>
&lt;emilio> AmeliaBR: I think it depends on how much users hate to break the search results use cases and such, but it'd give way more flexibility for authors<br>
&lt;emilio> AmeliaBR: if it's going to break major sites with user focus you can explain it, but I don't know what the reaction of the average user is<br>
&lt;emilio> hober: besides cleaning up and simplifying :visited, what's the argument for removing the restrictions?<br>
&lt;emilio> TabAtkins: it'd make :visited a regular pseudo-class for authors<br>
&lt;fantasai> Current spec: “Since it is possible for style sheet authors to abuse the :link and :visited pseudo-classes to determine which sites a user has visited without the user’s consent, UAs may treat all links as unvisited links or implement other measures to preserve the user’s privacy while rendering visited and unvisited links differently.”<br>
&lt;emilio> hober: it's weird, but do authors actually complain about that?<br>
&lt;emilio> AmeliaBR and TabAtkins: Yes<br>
&lt;emilio> AmeliaBR: there are use-cases and hacks to show or hide the "unread" using the color matching the background of the text<br>
&lt;emilio> AmeliaBR: and despite all the restrictions we're still leaking the history<br>
&lt;emilio> AmeliaBR: just because CSS is so complex that if somebody changes rendering somebody smart can figure out<br>
&lt;emilio> florian: so we're annoying people for no good reason<br>
&lt;emilio> fantasai: &lt;quotes the spec> (see above)<br>
&lt;emilio> TabAtkins: that's because my patch was not accepted, because reality is much more complex<br>
&lt;emilio> dbaron: somebody said for no good reason, I think there's one other reason to think about which is a distinction between attacks that are clearly detectable and ones that are not.<br>
&lt;emilio> dbaron: a site can learn about your visited links via somewhat normal code, or via code that is obviously querying your history, and I think it's a distinction it's worth thinking about<br>
&lt;emilio> florian: so there's no technical distinction but maybe legal ones<br>
&lt;emilio> florian: I'd add "Javascript is off" to the list of "safe" scenarios, because then why not?<br>
&lt;emilio> dbaron: some attacks work without javascript, like loading images or fonts<br>
&lt;emilio> florian: alright, then not...<br>
&lt;fantasai> emilio: One question is, one of the objections from Mats was that websites haven't collected this data, and now we're exposing it<br>
&lt;fantasai> emilio: If we change how it works, a lot of existing history....<br>
&lt;fantasai> emilio: In order to implement this, you need to change how you store history. It stops being a giant table of all the links you stored.<br>
&lt;fantasai> emilio: You need to track from/to lists.<br>
&lt;fantasai> emilio: That's new data, nobody has collected it yet.<br>
&lt;fantasai> TabAtkins: Implementation-wise it'll be you start collecting data now, but then don't switch over for a few months<br>
&lt;fantasai> AmeliaBR: That's why Tab split into 3 parts, we can have different levels of support<br>
&lt;fantasai> AmeliaBR: E.g. SHOULD support :visited on same-origin<br>
&lt;myles_> q+<br>
&lt;fantasai> AmeliaBR: You can do that with info you currently have<br>
&lt;fantasai> AmeliaBR: Next steps could be smarter<br>
&lt;xfq> ack myles_<br>
&lt;Rossen> ack myles_<br>
&lt;emilio> fantasai: so I think one of the discussion is that something that doesn't match any of the categories does not get visited styling at all<br>
&lt;emilio> fantasai: so for same-origin you should be able to use whatever restriction you have<br>
&lt;emilio> *whatever styling you want<br>
&lt;emilio> fremy: that doesn't work because it's observable via timing attacks, and you still need to run styling twice to avoid them<br>
&lt;fantasai> fantasai^: We don't have to do that right now. We could do something more limited for right now while we figure it out<br>
&lt;emilio> AmeliaBR: so right now we have this visited styles and we ignore the properties, and we could check whether it's a same-origin link<br>
&lt;emilio> fremy: so memory-wise you double the cost of styling<br>
&lt;emilio> dbaron: only for link subtrees<br>
&lt;dbaron> dbaron: It's not all elements that need duplicate data, it's just links and their descendants.<br>
&lt;fantasai> fremy was talking about how right now need to store double styling for links, one for unvisited and one for visited.<br>
&lt;fantasai> current duplicated set is limited to just the properties that are allowed for visited; allowing all would mean duplicating all properties<br>
&lt;fantasai> ...<br>
&lt;fantasai> fremy: The other thing wanted to say is that even if you double the memory and you store all the properties twice and do everything twice.<br>
&lt;fantasai> fremy: you can have nested links<br>
&lt;fantasai> fremy: one same-origin and one not<br>
&lt;fantasai> fremy: Then you have to keep track of whether the difference in style is because of the visited styling of the first link or the nested one<br>
&lt;fantasai> emilio: when you have nested link, from the pov of the nested link and its descendant, the nested link is the only link that could be visited on the page<br>
&lt;fantasai> fremy: That's the restriction we have now. But going forward<br>
&lt;fantasai> emilio: Why I think this wouldn't work is you could detect the performance of styling a same-origin link inside a visited..<br>
&lt;fantasai> emilio: Let's say you have a cross-origin, and a same-origin link inside it<br>
&lt;fantasai> emilio: If you don't apply restrictions to that...<br>
&lt;AmeliaBR> Nested links don't really exist. If you create them from the DOM, browsers are a mass of incompatibility in all sorts of ways. But, you could have a `:visited + :visited` selector which could get into a mess of confusion...<br>
&lt;fantasai> emilio: ... as long as links are treated independently ... I have to think harder than this<br>
&lt;fantasai> fremy: Even if the thing we do now works, we have to have special exception so that when you do selector matching, if it's the first link that you encounter from the base...<br>
&lt;fantasai> fremy: Right now this is what browsers do. it's quite messy<br>
&lt;fantasai> fremy: If you allow some to keep all properties and others not, then you have to keep track. I don't think it's a good idea.<br>
&lt;fantasai> TabAtkins: I see why it would be complex at the minimum<br>
&lt;fantasai> AmeliaBR: Gets rid of one of the arguments for these changes, which is that it would simplify style matching<br>
&lt;fantasai> myles_: Timing attacks, one way to solve them is to have repaints more predictable, either more or less often<br>
&lt;fantasai> myles_: Why not pursue that solution?<br>
&lt;fantasai> myles_: instead of making things more expressive<br>
&lt;fantasai> TabAtkins: I'm not sure how changing timing of repaints can really solve this<br>
&lt;fantasai> TabAtkins: E.g. on :visited it activates 10,000 filters<br>
&lt;fantasai> emilio: ...<br>
&lt;fantasai> emilio: You need to repaint every time the href changes<br>
&lt;fantasai> emilio: dbaron tried that, was big perf regression<br>
&lt;fantasai> myles_: That was a perf regression, but performing style selection/cascade wasn't?<br>
&lt;fantasai> dbaron: It wasn't the whole tree, just the links. And they usually don't have many descendants<br>
&lt;fantasai> myles_: so recomputing style is cheap but recomputing pixels is not cheap?<br>
&lt;fantasai> dbaron: I think the repainting patch that was a perf regression was to do more repainting than emilio said<br>
&lt;fantasai> dbaron: It repainted whenever an async history lookup finished<br>
&lt;fantasai> dbaron: You start a lookup, you get a result<br>
&lt;fantasai> dbaron: A lot of timing attacks could be resolved by repainting all links instead of just the one that changed.<br>
&lt;fantasai> dbaron: but that's really expensive<br>
&lt;fantasai> dbaron: At the time I wrote this, repaint was sync, async landed a week after<br>
&lt;dbaron> s/repaint/history lookup/<br>
&lt;fantasai> myles_: If we're allowing :visited to become more expressive, then we're not breaking any navigation sites<br>
&lt;fantasai> TabAtkins: The proposal was to allow :visited to do more by restricting where it can be used.<br>
&lt;fantasai> AmeliaBR: Changes the balance<br>
&lt;fantasai> AmeliaBR: Some cases get easier, others get impossible<br>
&lt;fantasai> AmeliaBR: Wrt just fixing timing attack level<br>
&lt;fantasai> AmeliaBR: Every time we introduce a new property, someone comes up with a new example<br>
&lt;fantasai> AmeliaBR: Also not all are timing attacks. Some are abusing user interaction<br>
&lt;fantasai> AmeliaBR: Taking properties we've got, making some elements invisible or visible<br>
&lt;fantasai> AmeliaBR: Using the fact that there's a rendering change and then using people to reveal what they're seeing on the screen<br>
&lt;fantasai> iank_: Nasty one is to have full-page pop-up and position X different.<br>
&lt;fantasai> fremy: Which X do you see?<br>
&lt;fantasai> fremy: That's the one you click on<br>
&lt;fantasai> TabAtkins: So even if we solve timing attacks, don't solve all the attacks<br>
&lt;fantasai> TabAtkins: That's why I want to do this in the first place<br>
&lt;fantasai> AmeliaBR: So going back to earlier discussion that, OK, it comes down to what are users going to say if we break the one use case<br>
&lt;fantasai> AmeliaBR: Are any browser teams willing to do some experimentation with that and try to see how many complaints you get?<br>
&lt;fantasai> TabAtkins: I think working with Alex Russell we can try something<br>
&lt;fantasai> Rossen: Do you feel like you have enough, Tab?<br>
&lt;fantasai> TabAtkins: Yeah.<br>
</details>
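
For readers of the log, the three categories pasted from the issue (same-origin, cross-origin inbound with a Referer, cross-origin outbound followed from this origin) can be read as a matching predicate over per-navigation from/to records, along the lines emilio mentions. The following is only a rough TypeScript sketch under stated assumptions, not anything resolved in the meeting: the `Navigation` record, the `mayMatchVisited` function, exact-string URL comparison, and the idea that the full referring URL is known are all hypothetical simplifications.

```typescript
// Hypothetical per-navigation record; not any browser's real history schema.
interface Navigation {
  fromUrl: string | null; // page the navigation started from, if any
  toUrl: string;          // URL that was navigated to
  refererSent: boolean;   // whether a Referer header accompanied the request
}

function originOf(url: string): string {
  return new URL(url).origin;
}

// Would a link to `linkUrl`, shown on the page at `pageUrl`, be allowed to
// match :visited under the three categories from the issue?
// Assumption: URLs are compared as exact strings, with no normalization.
function mayMatchVisited(
  pageUrl: string,
  linkUrl: string,
  history: Navigation[],
): boolean {
  const pageOrigin = originOf(pageUrl);

  // 1. Same-origin: the server can already track visits to its own pages.
  if (originOf(linkUrl) === pageOrigin) {
    return history.some((n) => n.toUrl === linkUrl);
  }

  return history.some((n) => {
    // 2. Cross-origin inbound: the user arrived at this origin from linkUrl
    //    and a Referer header was sent with that request.
    const inbound =
      n.fromUrl === linkUrl &&
      originOf(n.toUrl) === pageOrigin &&
      n.refererSent;

    // 3. Cross-origin outbound: the user followed this link from this origin.
    const outbound =
      n.fromUrl !== null &&
      originOf(n.fromUrl) === pageOrigin &&
      n.toUrl === linkUrl;

    return inbound || outbound;
  });
}
```

Under this sketch, a cross-origin URL the user reached some other way (typed in, or followed from a different site) never matches, which is the "link dump" use case the discussion says would be lost.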


-- 
GitHub Notification of comment by css-meeting-bot
Please view or discuss this issue at https://github.com/w3c/csswg-drafts/issues/3012#issuecomment-467242862 using your GitHub account

Received on Tuesday, 26 February 2019 00:35:47 UTC