- From: John Foliot <john.foliot@deque.com>
- Date: Wed, 10 Jul 2019 13:16:06 -0500
- To: Jeanne Spellman <jspellman@spellmanconsulting.com>
- Cc: Silver Task Force <public-silver@w3.org>
- Message-ID: <CAKdCpxxS5u2qRGTc9WGjdgM-UBhu_rmk42e-dofGzYQdrTDE4w@mail.gmail.com>
Thanks, Jeanne, for sharing these. I've not spent the requisite time with the longer video, but I did review the shorter one. I have to say that I am personally concerned by the seemingly definitive declaration that "...blind faith in big data must end...", as nobody (certainly not me) has suggested that we put blind faith in anything. But data is what we are working on and with; it is in many ways our stock in trade, and is what "measurement" is all about. Measurement is data (whether big or small). *Without data, you have nothing but opinion.*

Ms. O'Neil states (1:49): "*Algorithms are opinions embedded in code...*" That's one way of looking at it (Cathy O'Neil's *opinion*); however, is it the *only* way of looking at it? I'll suggest that Ms. O'Neil has politicized "data" to fit her narrative; Merriam-Webster *apolitically* defines algorithm <https://www.merriam-webster.com/dictionary/algorithm> as "...*broadly: a step-by-step procedure for solving a problem or accomplishing some end. ... Algorithm is often paired with words specifying the activity for which a set of rules have been designed.*" That is what we are doing: we're defining "problems" (use-cases), and we're also proposing methods (step-by-step procedures) and rules (Requirements) to address those use-cases. Finally, we need a mechanism beyond Pass/Fail (100% or 0%) to measure the progress or success of solving the use-case scenario. In WCAG, we assumed a simple Pass/Fail approach which we now know is neither accurate nor fair, and so in Silver we're going with a "somewhere between black and white - i.e. a shade of gray" approach. Defining and measuring that gray will require math, and yes, also opinions - the opinions of experts and concerned parties in the field. What makes one "method" preferable to another? Says who, and why? (Says experts, in their opinion, based on experience and... that's right, data.)

(2:03) "*That's a marketing trick ...because you trust and fear mathematics...*" Pfft. That is one woman's opinion - I neither fear nor trust math any more than I fear or trust physics. I know our understanding and use of physics is not always perfect, but like democracy, it's better than any of the other options available to me.

(7:20) "*...and we have plenty of evidence of biased policing and justice system data...*" Am I the only one who finds it ironic that Ms. O'Neil is using selective evidence and data to make her point that evidence and data are biased? IMHO, she shot down her own argument right there, and she spends a good portion of the remainder of that video using and interpreting specifically selected data to make her point. For example, she surfaces *one use-case* of a recidivism risk algorithm that resulted in cultural bias in Florida to "prove" her assertion that all algorithms have a bias: she used data to arrive at a conclusion.

(8:36) "*...When they're secret, important and destructive...*" If I found one important take-away from this video, it was this: *the openness of our algorithm will be a critical component*. There is a world of difference between "black-box" algorithms and open and transparent algorithms. Thankfully, we've already stated categorically:

"*Point scoring system* - we have been working on a point system and have a number of prototypes. This is what we most need help on. It must be transparent and have rules that can be applied across different guidance. We are not going to individually decide what a Method is worth because it doesn't meet the needs of regulators for transparency, it doesn't scale, and it is too vulnerable to influence."
(source: https://docs.google.com/document/d/1wklZRJAIPzdp2RmRKZcVsyRdXpgFbqFF6i7gzCRqldc/edit#heading=h.acod2js7mcnj )
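To make the "shade of gray" point-scoring idea concrete, here is a rough sketch in Python. It is only my own illustration: the method names, the 0-4 rating scale, and the aggregation are hypothetical placeholders, not one of the actual prototypes.

```python
# A minimal sketch of a graded score instead of binary Pass/Fail.
# The rating scale (0-4) and method names are hypothetical illustrations only.

from dataclasses import dataclass

@dataclass
class MethodResult:
    name: str
    rating: int          # 0 = not met at all ... 4 = fully met
    max_rating: int = 4

def graded_score(results: list[MethodResult]) -> float:
    """Return an overall score between 0.0 and 1.0 rather than Pass/Fail."""
    if not results:
        return 0.0
    earned = sum(r.rating for r in results)
    possible = sum(r.max_rating for r in results)
    return earned / possible

# Example: two methods partially met, one fully met -> 0.75, not simply "Fail".
print(graded_score([
    MethodResult("headings-are-descriptive", 3),
    MethodResult("text-contrast", 2),
    MethodResult("keyboard-operable", 4),
]))
```

The point of the sketch is only that every step is open and inspectable: anyone can re-run the same arithmetic and get the same number, which is the transparency the quote above calls for.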
Ms. O'Neil continues (8:52): "...*These are private companies building private algorithms for private ends*..." One of the advantages of doing this work in the W3C is to avoid this kind of 'private company' bias. Will we be 'perfect' in that goal? Likely not, because as Ms. O'Neil also noted, we don't live in a perfect world. But the openness of the W3C in its mission will hopefully ensure that whatever we end up with will be *MORE* open than a proprietary system or solution. But it still won't be perfect.

(9:43) "...*We know this, though, in aggregate*..." In aggregate? You mean, like "big data"? Funny how, when it supports her opinion, big data isn't so bad after all...

(11:29) "*...We should look to the blind orchestra audition as an example. ...the people who are listening have decided what's important and they've decided what's not important...*" OK, so not so much then that big data is "evil", or that algorithms are biased, but rather that we need to be mindful of bias and decide what's important and what's not, so that we construct an algorithm (set of tests and steps) that, if it doesn't completely eliminate bias, flattens it significantly. That's a world of difference from saying that algorithms are "Weapons of Math Destruction". (Additionally, I'll note that "experts", aka '*the people who are listening*', decided what was and wasn't important, so they introduced a bias there as well - perhaps we can call it an informed bias: seemingly a positive one, which Ms. O'Neil's subsequent point about female employment increasing 5-fold appears to prove. So bias, in and of itself, isn't the real problem, is it? *Rather, it's the awareness of the role that bias plays in the calculation of the data.*)

(12:12) "*...What is the cost of that failure?*" Indeed. Everything - EVERYTHING - has a cost/benefit ratio, and at scale regulators, lawyers, and their kind do risk analysis to weigh that cost/benefit ratio. This is why I've proposed that - all other things being equal - the greater the cost for success, the greater the value(*) in our scoring algorithm. If it costs more to accommodate and test to ensure that some users with some disabilities are not left behind, that needs to be rewarded appropriately; otherwise the cost/benefit ratio doesn't matter: the decision will be "Pay the fine - it's cheaper". Sadly, I've personally lived through that specific mind-set at a previous employer - un-named for obvious reasons - where a senior compliance person confided in me that they figured the executives were waiting for exactly that to happen before they went any further. So this is a real thing too.

(* One of the things I'm still struggling with is our unit of measurement, so that it can be applied *proportionately* across our "rules" and "sets of steps" as part of the cost/benefit analysis.)

Ms. O'Neil talks about recognizing what is and isn't important, and focusing on that to ensure the algorithm is un-biased. OK, but before we can determine whether there is any bias, we also need to be thinking about bias towards whom: all users in aggregate, or specific users with specific needs (and if the latter, to what level of specificity)?
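To make that cost/benefit weighting concrete, here is a rough sketch in Python of what I have in mind. Again, the method names, the 0-4 rating scale, and the specific weights are hypothetical placeholders - this is exactly the "unit of measurement" question I'm still struggling with, not an agreed proposal.

```python
# A minimal sketch of cost-weighted scoring: all other things being equal,
# a method that is costlier to implement and test contributes more value.
# The weights and the 0-4 rating scale are hypothetical placeholders.

def weighted_score(results, cost_weight):
    """
    results: list of (method_name, rating) tuples, rating on a 0-4 scale
    cost_weight: dict of method_name -> relative cost/effort weight (>= 1)
    """
    earned = sum(rating * cost_weight.get(name, 1) for name, rating in results)
    possible = sum(4 * cost_weight.get(name, 1) for name, _ in results)
    return earned / possible if possible else 0.0

# Example: the costly method counts three times as much as the cheap one,
# so skimping on it drags the overall score down far more than skimping
# on the cheap one would.
print(weighted_score(
    [("captions-for-live-media", 4), ("page-has-title", 2)],
    {"captions-for-live-media": 3, "page-has-title": 1},
))
```

The intent is simply that the expensive-to-meet methods are also the ones worth the most points, so "pay the fine - it's cheaper" becomes a worse deal.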
There is no disagreement that WCAG today is biased against people with cognitive disabilities, but before we can even make that statement, we also have to recognize that people with cognitive disabilities are not a monolithic block, even when they are a sub-set of "all users". In real-world terms however, they *are* a specific sub-set, with needs and requirements that are different from or go beyond the needs of others (and that WCAG currently fails at). That's the definition of bias right there: "*prejudice in favor of or against one thing, person, or group compared with another*" (source: https://diversity.ucsf.edu/resources/unconscious-bias) - *but to recognize bias is to also recognize "groups".*

For this reason, I continue to believe that accounting for the needs of these different groups will be a factor in the cost/benefit computation. And in fact, we've already spoken and thought at length about how to ensure that our new scoring system cannot be "gamed" to favor one group over another, so this Task Force has already accepted that there are different "groups" with differing needs. To state now that our scoring system should not account for different user-groups in the scoring algorithm, while at the same time working towards ensuring that different or specific user-groups are not biased against by our scoring system, is a contradiction that I am struggling with, and one that I've not seen a valid response to.

In the end, whatever we emerge with will need to be *consistently* measurable, repeatable, reportable, and scalable across all sizes of sites and types of content. And like it or not, all of our documentation to date includes "points" and/or "values" which will need to be added (or subtracted, multiplied, or otherwise processed), so math *will* be involved. (And that's OK.)

My $0.05 Cdn.

JF

On Wed, Jul 10, 2019 at 9:21 AM Jeanne Spellman <jspellman@spellmanconsulting.com> wrote:

> Cyborg asked me to send this around and asks that those working on
> conformance watch it:
>
> TED Talk: Cathy O'Neil - Weapons of Math Destruction
>
> There is a short version and the full version
>
> Short version: https://www.youtube.com/watch?v=_2u_eHHzRto
>
> Full version: https://www.youtube.com/watch?v=TQHs8SA1qpk
>
> I watched the short version and thought it was well done. It is about
> various kinds of bias and not specific to PwD. Her points about the
> data of the past continuing a bias into the future are cautionary. We
> do not collect big data and our formulas are not sophisticated AI
> algorithms, but the principles she cautions about apply, IMO. There are
> people in accessibility doing research on algorithmic bias against PwD,
> and there are broader lessons from the research that could apply to our
> work.

--
*John Foliot* | Principal Accessibility Strategist | W3C AC Representative
Deque Systems - Accessibility for Good
deque.com
Received on Wednesday, 10 July 2019 18:17:23 UTC