- From: Dirks, Kim (Legal) <kimberlee.dirks@thomsonreuters.com>
- Date: Mon, 15 Jul 2019 23:24:51 +0000
- To: John Foliot <john.foliot@deque.com>, "Hall, Charles (DET-MRM)" <Charles.Hall@mrm-mccann.com>
- CC: David MacDonald <david100@sympatico.ca>, Léonie Watson <tink@tink.uk>, Silver Task Force <public-silver@w3.org>
- Message-ID: <MN2PR03MB5166772897A5BE7D22354FE894CF0@MN2PR03MB5166.namprd03.prod.outlook.com>
Hello all,
I’ve finally waded through most of the thoughts about scoring. I have some reservations I’d like to outline, but please do keep in mind that I’m not a mathematician, coder, detail person, or medical professional. 😊 Also note that I’m using “website” generically to include software, mobile, etc. Finally, this got really long. Sorry.
I am very uncomfortable with using individual categories of disabilities for weighting or scoring. This kind of system feels like walking through a minefield because there are so many potential pitfalls. Considering “user needs” is critical when identifying accessibility guidance, but I’m uneasy about using those to weight or score compliance. I can’t say exactly why, but I keep thinking about a “separate but equal” kind of analysis, which is not what anyone should be doing.
My chief concerns are that
  1.  This opens a black hole of analysis. How do we identify all disabilities? In the example Léonie used about flashing content, there are additional user groups that could be impacted. Flashing content could potentially induce vertigo for people with motion sensitivities; it could induce migraines for some, and who knows what else? I don’t find this helpful because we might never finish this kind of identification and analysis. It doesn’t seem efficient, and what is critical for one person might only be annoying for someone else. And it could even change day-by-day for the same person.
  2.  Ultimately, each of the accessibility guidelines is critical for someone. (If we have some that aren’t extremely important for real live people, we have a bigger problem!)
I’m slightly less uncomfortable using categories of “user needs” and putting them in very large buckets as a multiplier for standards that could benefit many or multiple people. Yes, that still requires some level of disability identification or user needs, but if we go with very general buckets, I’m less troubled. Ultimately, I still feel like this isn’t the most helpful way to figure out how to score or rate a website’s level of accessibility.
So, what to do? Unlike some in this group, I do think we need to consider difficulty of implementation. I’m fuzzy about specifics of how we would account for difficulty, however. Here’s an example of why I think we need to look at it. Let’s think of sites that are massive and complex, and that already exist today (maybe Facebook). Let’s say Facebook has been around for a long time, so parts of it were coded in different frameworks by an ever-changing team of coders over time. And let’s say we introduce a guideline that says sites need to be fully responsive and lose no content when in their most compact state. New accessibility guidance means retrofitting, and that truly can be cost-prohibitive for a company. Forcing my imagined version of Facebook to completely rebuild their entire structure would be massively expensive. It might mean they can’t make any other accessibility changes or any enhancements to other products. Ultimately, I do think difficulty of implementation needs some recognition, especially when retrofitting is involved.
While I think we do need to be realistic about what a company can afford to do, I don’t think difficulty of implementation should be on a par with meeting user needs.
That got me thinking about some of the ideas advanced by others in the group. In particular, I’m thinking about the idea of having ‘ribbons’ or ‘threads’ of other criteria to measure accessibility. I think *this* is where weighting could come in. What if we also considered and gave points for a company’s holistic accessibility approach? This might include
  *   Difficulty of making something accessible
  *   Leadership efforts (like speaking at CSUN, being active members of Silver Community?)
  *   Having a clearly defined public-facing accessibility policy and a public way to contact the accessibility team for larger companies.
  *   Accessibility processes in place like
     *   Testing with real users (PWD) and being inclusive about design and every other step in the website development process.
     *   Required annual accessibility trainings for all employees
     *   Accessibility testing
        *   Automated testing
        *   Manual testing
        *   Regression & unit testing
  *   Having a diverse workforce
And so on. Let’s say there are two paths for earning accessibility points. Again, I’m a little fuzzy about this, but I don’t think the “other” path alone should result in even the lowest level of accessibility. (So even if a company excelled at all of the items on the list above, but didn’t make accessible products, they could not reach even the Bronze level of accessibility.) This is potentially a way to start weighting factors used in determining accessibility scores.
The two paths would be, first, by meeting user needs in the website. This should always be the core and give the most points. But what if we had another path that included items like those in the list above? If these were weighted less than meeting core accessibility guidance, that might produce some interesting results and advance how society thinks about accessibility. We are in a position to think outside the box, and we’ve got some people in our group with really interesting ideas that I’d like to hear more about.
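A minimal sketch of the two-path idea, assuming a simple weighted sum with a floor on the core path. The function names, the weight, and both thresholds are hypothetical placeholders invented for illustration; nothing in the thread assigns actual values.

```python
def weighted_total(core_points, holistic_points, holistic_weight):
    """Combine both paths; a weight below 1 keeps the core path dominant."""
    if not 0 <= holistic_weight < 1:
        raise ValueError("holistic_weight must be in [0, 1)")
    return core_points + holistic_weight * holistic_points


def reaches_bronze(core_points, holistic_points, *,
                   holistic_weight, core_minimum, bronze_threshold):
    """Bronze requires a minimum from the core path (meeting user needs),
    so the "other" path alone can never reach it."""
    if core_points < core_minimum:
        return False
    total = weighted_total(core_points, holistic_points, holistic_weight)
    return total >= bronze_threshold


# Placeholder numbers, invented purely for illustration.
SETTINGS = dict(holistic_weight=0.25, core_minimum=50, bronze_threshold=60)

print(reaches_bronze(0, 200, **SETTINGS))   # holistic path only
print(reaches_bronze(55, 40, **SETTINGS))   # 55 + 0.25 * 40 = 65
print(reaches_bronze(60, 0, **SETTINGS))    # core path only
```

With these placeholder settings, a company that excels on the holistic list but earns no core points does not reach Bronze, while holistic points can lift a site that is already close on the core path.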
Kim
Kimberlee Dirks, JD
Pronouns: she/her
Sr. Accessibility Specialist
Thomson Reuters
From: John Foliot <john.foliot@deque.com>
Sent: Thursday, July 11, 2019 2:12 PM
To: Hall, Charles (DET-MRM) <Charles.Hall@mrm-mccann.com>
Cc: David MacDonald <david100@sympatico.ca>; Léonie Watson <tink@tink.uk>; Silver Task Force <public-silver@w3.org>
Subject: Re: Thinking about points
Thank you for this, Charles, as it helps me better formulate my thoughts as well. In fact, I think you've kind of summed up where I was trying to get to, in that it's not about pitting one group against the other, but instead recognizing the intersectional needs and rewarding better behavior.
So if 1.1.1 is for usage without vision, and I meet that criterion, I get the baseline points for it. However, if I go beyond that or exceed the criterion by ensuring that the method used also meets additional functional needs, then I get more points for it. For example, if my non-text content includes both alternative text and figure captions that are each written in simple language, then I now meet “usage with limited cognition”. In this scenario, the behavior that is reinforced is always meeting more needs equals earning more points.
That's a really great example and way of thinking, and also underscores why I believe that without some form of base-line scoring metric, migrating our existing requirements into a new framework will lack that basic data and data-framework. Not only are certain requirements more explicit or demanding (and thus likely have different "points" in relationship to other requirements), but then there is the additional "adding of points" when they go beyond the bare minimum (i.e., your example of “usage with limited cognition”).
I argue that as we migrate content, we need to be accounting for these scenarios (which I'm not sure we are), and, most importantly (well, at least to me), what is all that worth? If baseline = X, and “usage with limited cognition” = X + Y, what do X and Y actually equal? 1? 10? 79.635? Why? And does every requirement have a baseline score of X (the same "X" as we just discussed), or do different requirements start off with different baseline values (A, B, C, or X)? Why or why not?
As we advance this forward to more and more "eyeballs", we'll need to be prepared to answer and defend all of these decisions - perhaps more so than why we're also introducing ideas like cognitive walk-throughs and task-completion exercises (which will be fairly easy to justify doing the action, but significantly harder to justify why those actions get 'foo' number of points).
JF
On Thu, Jul 11, 2019 at 1:40 PM Hall, Charles (DET-MRM) <Charles.Hall@mrm-mccann.com> wrote:
I love this idea and description of critical.
I just wanted to add my comments and question that we lacked time to cover on Tuesday – particularly because the manner in which “impact on users” was conveyed has sparked a tangent discussion on bias.
I don’t see bias in:
  *   identifying functional need
  *   using functional need as a factor in score
All current – and as near as I can tell, all proposed – success criteria have been created exclusively for a specific functional need. 1.1.1 Non-Text Content was written for “usage without vision”. And I will continue to emphasize need over user group, because there is bias in saying “blind people”, as it omits all the other scenarios where a person cannot see. Naming a disability also carries a secondary bias, as it implies a quantifiable demographic. There is no bias in saying that Non-Text Content exists for the benefit of usage without vision. It does not make it any more or less important than any other criteria for any other functional need.
What I have been advocating for and failing to adequately convey is a scoring scenario that acknowledges meeting additional functional needs that the original criterion was not written for. So if 1.1.1 is for usage without vision, and I meet that criterion, I get the baseline points for it. However, if I go beyond that or exceed the criterion by ensuring that the method used also meets additional functional needs, then I get more points for it. For example, if my non-text content includes both alternative text and figure captions that are each written in simple language, then I now meet “usage with limited cognition”. In this scenario, the behavior that is reinforced is always meeting more needs equals earning more points. This also meets the reality of intersectional and complex needs. What this doesn’t account for without a multiplier is what Léonie describes as “Critical”, which is what I was going to ask John. Can we replace the idea of severity meaning impact on author with one where severity is the true impact on users?
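The additive scoring described above (baseline points for meeting a criterion, plus more for each additional functional need met) can be sketched roughly as follows. The function name and the values chosen for X and Y are placeholders; the thread leaves the actual values open, and returning zero when the primary need is not met is an assumption.

```python
def criterion_points(primary_need_met, additional_needs_met, baseline, bonus_per_need):
    """Baseline for meeting the criterion, plus a bonus per extra functional need."""
    if not primary_need_met:
        return 0  # assumption: no points unless the criterion itself is met
    return baseline + bonus_per_need * len(additional_needs_met)


# Placeholder values for the open questions "what do X and Y actually equal?"
X, Y = 10, 5

# 1.1.1 met only for usage without vision
print(criterion_points(True, set(), X, Y))
# 1.1.1 met, and simple-language alt text and captions also serve limited cognition
print(criterion_points(True, {"usage with limited cognition"}, X, Y))
```

In this sketch meeting more needs always earns more points, and no need is ranked above another; only the count of additional needs changes the score.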
So, to me, what all the bias conversations have seemed to miss is in the number of criteria – or now guidelines. The bias is not that the point system is rewarding one need over another. The bias is in one need having more criteria available in support of it. So the way we should be discussing resolving that bias is in new criteria / guidelines.
As to impact on authors or difficulty to implement, I think this is irrelevant. One of many reasons is that a small business can simply use a free template and purchase a third party service to audit it and produce a fully conformant site, and the level of effort was an email and a small check. I am not suggesting we completely dismiss how difficult it is to create fully accessible live captioning or a date picker that works for every input type. But I am suggesting that should not be a factor for conformance. The goal is always to support people who are accessing / using / consuming the site (or app or ICT). The tools used to create will continue to get easier and more robust and cheaper. So today live captioning may be expensive. Tomorrow it won’t. The impact on people will stay the same.
Charles Hall // Senior UX Architect
(he//him)
From: David MacDonald <david100@sympatico.ca>
Date: Thursday, July 11, 2019 at 2:05 PM
To: Léonie Watson <tink@tink.uk>
Cc: Silver Task Force <public-silver@w3.org>
Subject: [EXTERNAL] Re: Thinking about points
Resent-From: Silver Task Force <public-silver@w3.org>
Resent-Date: Thursday, July 11, 2019 at 2:04 PM
Here's what I remember about trying to overcome bias from way back.
WCAG 1.0 had the concept of Priority 1, 2, 3, and the concern in WCAG 2.0 was that WCAG 1 assigned *priorities* to checkpoints that were addressing a specific need of a certain group. And therefore we were introducing bias against certain needs by using the words "Priority 1, 2, 3".
We **tried** to address that in 2.0 by assigning a generic name of "Level" (A, AA, AAA) to Success Criteria (which were loosely based on "checkpoints" in WCAG 1.0). We hoped the letters A, AA, AAA wouldn't assign priority and importance like the hierarchical numbers 1, 2, 3.
In my opinion WCAG 1 and 2 emphasised solutions for blindness because (1) Screen reader AT was fairly mature (2) with blindness we knew, in a general way, what to do and there was research. We knew that our recommendations would help blind people and we knew they were doable across technologies, languages and a variety of types and sizes of web sites.
We've been trying to overcome bias for a long time and it's hard to do. I'm not saying we shouldn't continue to try, but I expect that whatever we do in Silver, the next generation will see its inherent bias. Hopefully, however, we can improve with each version.
Cheers,
David MacDonald
CanAdapt Solutions Inc.
On Thu, Jul 11, 2019 at 12:29 PM Léonie Watson <tink@tink.uk> wrote:
Everyone,
I've talked with JF more about his proposed points system, in particular
about the part that worried me most on the call on Tuesday.
I'm going to try and share my thoughts with you. I make no claims about
any of this being final, concrete, or even entirely thought through, and
if I'm repeating anyone's ideas without realising - I'm sorry, and thank
you.
I was worried about the idea of prioritising requirements based on user
impact, because it will put people from one group into competition with
people from another.
Let's take two possible user needs/requirements:
Requirement 1:
"I want to be able to use headings to understand the hierarchy of content."
Requirement 2:
"I want to be able to understand the audio content of video."
I'm not suggesting these should be actual requirements, I'm just making
them up for the purposes of this email.
If we say that requirement 1 is orientated towards blind people, it
isn't critical, and assign it 10pts; then say requirement 2 is
orientated towards Deaf people, it is critical, and assign it 20pts; it
puts blind people and Deaf people into competition with each other, when
it comes to the way authors choose to collect points.
This doesn't seem like a good thing, and as it turns out I don't think
it was what John was proposing.
We then began walking through some ideas, one step at a time, and we
started with the premise that all requirements are worth the same points
to start with. Let's go with 10pts for want of anything else.
Note: I know this idea isn't new!
We then thought about how to start differentiating between requirements,
without making it a competition between different groups of people.
We decided to identify how many user groups benefit from the requirement
being met. Requirement 1 arguably benefits blind people, people with low
vision, and people with cognitive disabilities; requirement 2 benefits
Deaf people and people with cognitive disabilities.
So requirement 1 is multiplied by 3 (making it worth 30pts), and
requirement 2 is multiplied by 2 (making it worth 20pts).
Note: the multiplier is based on the number of user groups that are
benefited, not the number of users, and this was a really important
distinction for me as JF and I talked. If we make it about numbers of
users, we re-introduce the competition between users problem, and as
previously noted that seems like a bad idea.
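A minimal sketch of this first step, using the 10pt starting value and the user groups named above. The dictionary keys and function name are illustrative only.

```python
BASE_POINTS = 10  # the starting value chosen "for want of anything else"

# User groups that each made-up requirement is said to benefit.
BENEFITS = {
    "requirement_1": ["blind", "low vision", "cognitive"],  # headings
    "requirement_2": ["Deaf", "cognitive"],                 # audio content of video
}


def points_by_group_count(groups, base=BASE_POINTS):
    """Multiply the base points by the number of user groups that benefit.

    The multiplier counts groups, not individual users.
    """
    return base * len(groups)


for name, groups in BENEFITS.items():
    print(f"{name}: {points_by_group_count(groups)}pts")
```

This reproduces the 30pts and 20pts figures from the example.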
We then considered how many requirements were likely to benefit only one
user group. This is a question worth considering in more depth, but the
example that came to mind as JF and I talked was this:
Requirement 3:
"I want to be able to disable flashing content before it begins."
This requirement benefits one user group - anyone who will be exposed to
the risk of seizing when they see the content flash.
Using the model so far, requirement 3 would be worth 10pts because it
benefits only one user group. That completely fails to recognise how
critical this requirement is to people in that group though.
So we then thought about having different levels of criticality for each
user group. Let's say:
1. Useful
2. Needed
3. Critical
We could bikeshed on the names, so again, I'm just making them up for
the purposes of this email. Don't get too hung up on them just yet.
Requirement 1 is:
* Needed by blind people. That's a multiplier of 3 (1 for the user
group, and 2 because it's "needed" by that user group).
* Useful to low vision people. That's a multiplier of 2 (1 for the user
group, and 1 because it's "useful" to that group).
* Useful to people with cognitive disabilities. That's a multiplier of
2 (1 for the user group, and 1 because it's "useful" to that group).
Requirement 1 therefore has a total multiplier of 7 (if you add up all
of the above), making it worth 70pts.
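The per-group arithmetic above can be sketched as follows, assuming each user group contributes 1 for the group plus the number of its criticality level. The names are illustrative only.

```python
BASE_POINTS = 10
CRITICALITY = {"useful": 1, "needed": 2, "critical": 3}


def total_multiplier(ratings):
    """ratings maps user group -> criticality level.

    Each group adds 1 (for the group) plus its criticality value.
    """
    return sum(1 + CRITICALITY[level] for level in ratings.values())


def requirement_points(ratings, base=BASE_POINTS):
    return base * total_multiplier(ratings)


requirement_1 = {
    "blind": "needed",
    "low vision": "useful",
    "cognitive": "useful",
}

print(total_multiplier(requirement_1))    # 3 + 2 + 2 = 7
print(requirement_points(requirement_1))  # 70
```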
This still doesn't quite work as intended though, because requirement 3
would be worth 40pts compared to requirement 1 at 70pts.
Requirement 3 is critical to people with Photo-Sensitive Epilepsy. This
means it has a multiplier of 4 (1 for the user group, and 3 because it's
"critical" to that group).
There are different ways we might solve this, and I'm really winging it
at this point, but stick with me.
We could use a different points system for the criticality levels.
* Useful: 1
* Needed: 10
* Critical: 150
Maths is not my strong suit, so I'm sure many of you will take one look
at this and shoot it down, but hopefully you get the idea.
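To compare the two scales, here is a rough sketch. It assumes the new values (1, 10, 150) simply replace the old ones (1, 2, 3) and that the extra 1 per user group is kept; the email does not spell out how the new values would be combined, so treat that as an assumption.

```python
BASE_POINTS = 10
LINEAR_SCALE = {"useful": 1, "needed": 2, "critical": 3}
STEEP_SCALE = {"useful": 1, "needed": 10, "critical": 150}


def requirement_points(ratings, scale, base=BASE_POINTS):
    """Each user group adds 1 plus its criticality value from the given scale."""
    return base * sum(1 + scale[level] for level in ratings.values())


REQUIREMENTS = {
    "requirement_1": {"blind": "needed", "low vision": "useful", "cognitive": "useful"},
    "requirement_3": {"photosensitive epilepsy": "critical"},
}

for name, ratings in REQUIREMENTS.items():
    linear = requirement_points(ratings, LINEAR_SCALE)
    steep = requirement_points(ratings, STEEP_SCALE)
    print(f"{name}: {linear}pts (1/2/3 scale), {steep}pts (1/10/150 scale)")
```

Under these assumptions requirement 3 moves from 40pts against 70pts to 1510pts against 150pts, so a requirement that is critical to a single group is no longer outscored by one that is merely useful to several.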
We could add another criticality level, perhaps "Life-saving", that
would only be used rarely, perhaps even only for this requirement.
Note: as I write this email, I realise that requirement 3 also benefits
people with cognitive disabilities who find moving/flashing content a
distraction.
Perhaps it would be useful to look more closely at the following things:
* How might we identify the different user groups?
* How many requirements are beneficial to 1 user group, 2 user groups, 3
user groups, and so on?
That information might help us figure out the maths with a bit more
certainty, even if we only use a small sample of requirements initially.
That's as far as we got. As I said at the start, I make no claims as to
the usefulness of any of it!
Léonie.
--
@LeonieWatson Carpe diem
--
John Foliot | Principal Accessibility Strategist | W3C AC Representative
Deque Systems - Accessibility for Good
Received on Monday, 15 July 2019 23:25:52 UTC