- From: Robin Berjon via GitHub <sysbot+gh@w3.org>
- Date: Thu, 03 Mar 2022 17:12:41 +0000
- To: public-patcg@w3.org
Kiran asked a good question on the public list that for some reason was not captured here as well. I am answering here to make sure we keep it in a single place. He asked if the issue with consent "is a limitation of browsers which cannot share significant portions of cross-context reading history at scale?" The short answer is that this isn't a limitation of browsers but a limitation of what people can consent to through the kind of large-scale interaction that exists on the Web and through browsers.

But if you don't have the background on this topic, I think that this answer won't be satisfactory. So I thought it would be helpful to provide a short backgrounder on consent so that not everyone has to read all the things just to reach the same conclusion. In the interest of brevity I will stick to the salient points regarding consent that have brought us to the present day; experts on the topic should of course chime in if they feel I've missed an important part.

Informed consent as used in computer systems today (and specifically for data processing) is an idea borrowed from (pre-digital) research on human subjects. One particularly important foundation of informed consent is the _Belmont Principles_, most notably the first principle, _Respect for Persons_. The idea of respect for persons is that people should be treated in such a way that they can make decisions based on their own set of values, preferences, and beliefs, without undue influence or interference that would distort or skew their ability to decide. The important thing to note here is that respect for persons is meant to protect people's _autonomy_ in contexts in which their ability to make good decisions can be impaired. The way that this is operationalised in the context of research on human subjects is through informed consent.

At some point, someone looked at this and realised that things like profiling, analytics, A/B testing, etc. look a lot like research on human subjects (which is true). And so they decided to just copy and paste informed consent over to computers, with the expectation that it would address problems of autonomy with data. As often happens when the superficial implementation is copied onto computers without the underlying structure that makes it work, this fell apart.

First, one key component of research on human subjects is the Institutional Review Board (IRB), an _independent_ group that reviews the research for ethical concerns. IRBs aren't perfect, but using an IRB means that _in the vast majority of cases unethical treatment is prevented before any subject even gets to consent to it_. Some companies do have IRBs (The Times does, as does Facebook, for instance), but they can never be as open, independent, and systematic as they are in research.

Second, the informed consent step is slow and deliberate, with a vivid depiction of risks. Subjects are often already volunteers. You might get a grad student sitting down with you to explain the pros and cons of participation, or a video equivalent. What's really important to understand here is that informed consent is not about avoiding dark patterns and making some description of processing readable; it's about relying on an independent institution of multidisciplinary experts to make sure that the processing is ethical, and then, on top of that independent ethical assessment, taking proactive steps to ensure that subjects understand what they are walking into.
There _are_ Web equivalents of informed consent (studies based on Mozilla Rally are a good example of this), but they work by reproducing the full apparatus of informed consent and not just the superficial bits that make the lawyers happy. Rally involves volunteering (installing an extension), gatekeeping to ensure that studies are ethical (e.g. the Princeton IRB validated the studies I'm in), volunteering _again_ to join specific studies and being walked through a description before consenting, and then strong technical measures to protect the data (for instance, it is only decrypted and analysed on devices disconnected from the Internet).

None of this scales to the kind of Web-wide data processing that is required to make our advertising infrastructure work (or to enable many other potentially harmful functions). People have tried, but as shown repeatedly by the research I linked to previously (and, more generally, by all the work on bounded rationality), it doesn't work. What "doesn't work" means here is that relying on consent for this kind of data processing leaves you with a lot of people consenting when in fact they don't want what they are consenting to; they only do it because the system directs them in ways that don't align with the requirements of informed consent. (To give just one example, Hoofnagle et al. have found that 62% of people believe that if a site has a privacy policy, the site can't share their data with other parties. Informed consent means eliminating that kind of misunderstanding and then providing a detailed explanation of the risks. It's a steep hill, and few people have the time for it.)

One possible reaction upon learning this is to not care. Some people will say "well, it's not my fault that people don't understand how privacy law and data work — if they don't like it, we gave them a 'choice'." But giving people a choice that you already know they will get wrong more often than not isn't ethical and doesn't align with respect for persons.

As members of the Web community, however, we don't want to build unethical things. The Web is built atop the same ethical tradition that produced informed consent in research on human subjects: respect for persons. (We formulate it as putting people first, but it's the same idea.) Since we try our best to make decisions based on reality rather than on what is convenient, we can't in good conscience see that consent doesn't work and then decide to use it anyway. There is also a fair bit of evidence that relying on consent favours larger, more established companies, which makes consent problematic from a competition standpoint as well. Because of this, it is incumbent upon us to build something better. (In a sense, we have to be the IRB that the Web can't have for every site.)

Is it technically possible to overturn this consensus? Of course. But we have to consider what the burden of proof looks like given the state of knowledge accumulated over the past fifty years that people have been working on this. Finding a lack of consensus requires more than just someone saying "I disagree"; it would require establishing that respect for persons is secondary (and reinventing informed consent on non-Belmont principles), or that bounded rationality isn't real, or producing high-powered empirical studies showing that people aren't tricked out of their autonomy, or some other very significant scientific upheaval.
It might be possible, but we're essentially talking about providing a proof of the Riemann hypothesis using basic arithmetic: I don't believe it's been shown that you couldn't do that, and people very regularly claim to have done it, but it would be unreasonable to put anything on hold for that in the absence of novel, solid evidence.

I hope this is helpful for people who haven't been wrestling with this topic. What the charter delineates is helpful because it protects this group from walking down blind alleys that have been explored extensively with no solution in sight. If people find this kind of informal background helpful, I would be happy to document it more prominently.

--
GitHub Notification of comment by darobin
Please view or discuss this issue at https://github.com/patcg/proposals/issues/5#issuecomment-1058276865 using your GitHub account

--
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config
Received on Thursday, 3 March 2022 17:12:43 UTC