Re: thoughts points system for silver from Hall, Charles (DET-MRM) on 2019-07-18 (public-silver@w3.org from July 2019)

From: Hall, Charles (DET-MRM) <Charles.Hall@mrm-mccann.com>
Date: Thu, 18 Jul 2019 18:30:54 +0000
To: John Foliot <john.foliot@deque.com>
CC: Chris Loiselle <loiselles@me.com>, Silver Task Force <public-silver@w3.org>
Message-ID: <E7534B04-004C-4D01-8E6A-23AFAA9A7C88@mrm-mccann.com>

I hear you, and the history, and also live in a world of time and materials.

Point of clarification:

I did not mean to suggest that measuring effort itself is not possible. However, I am suggesting that it is not possible to measure the level of effort to conduct this type of inclusive usability testing, plus the level of effort to subsequently make design decisions and/or refactor any development, and then to prove that the outcome is causally related to both sets of effort, because that involves too many subjective variables.

I am also suggesting that it should not matter. If the functional need to travel from point A to point B is met, a 25-year-old Yugo is as sufficient as a brand new Ferrari.

This position also supports the core notion that the web is accessible by default. It takes poor decisions to break that. Advocacy outside of the guidelines (most often) tells this story. Remediation is expensive. Designing inclusively is (nearly) free. My hope is that the guidelines help reinforce this type of narrative. If the guideline and conformance to it suggest that good alt text is difficult and expensive, that is a distorted lens and a disincentive. I teach our copywriters this (excerpt from presentation):

“To say, ‘A steaming mug of freshly brewed espresso’ is more accurate, meaningful and entertaining than ‘coffee’ and required no more effort.
However, it is more effort for the screen reader user if you overdo it and describe the entire scene and context.”

The impact on the creator is variable and ephemeral. Business models and market conditions drive creators to continuously find better, less expensive, or alternative paths to solutions. In the alt text example, I could learn that I can cut my copywriter and photography budget in half by telling better stories and using fewer images to do it – which has the added benefit of being more performant. So, I spent less effort to achieve more accessibility. I could also have spent zero in the first place if I were a small business using free images from Unsplash that came with alternative text and accreditation. In either scenario, I should not earn fewer points.

I would also argue that some AAA criteria have no more impact on creators than similar AA or A criteria, like 1.4.6 compared to 1.4.3. It takes no more effort to decide upon 2 colors with a 7:1 ratio than it does to decide upon 2 colors with a 4.5:1 ratio. The exact same applies to 2.3.2 compared to 2.3.1. “Don’t do it at all” is arguably easier to meet than “don’t do it unless.”
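
To illustrate the 1.4.3 / 1.4.6 comparison above, here is a minimal Python sketch of the contrast ratio calculation as defined in WCAG 2.x (relative luminance of each color, then (L1 + 0.05) / (L2 + 0.05)). The function names and the sample grays are assumptions chosen for illustration only. The point it shows is that the same calculation serves both criteria; only the threshold differs.

```python
def relative_luminance(rgb):
    """Relative luminance of an sRGB color given as (r, g, b) in 0-255 (WCAG 2.x definition)."""
    def linearize(channel):
        c = channel / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors: (lighter + 0.05) / (darker + 0.05)."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


WHITE = (255, 255, 255)
# Two hypothetical text colors on a white background.
for name, gray in (("#767676", (118, 118, 118)), ("#595959", (89, 89, 89))):
    ratio = contrast_ratio(gray, WHITE)
    print(name, "meets 4.5:1:", ratio >= 4.5, "meets 7:1:", ratio >= 7)
```

With these sample values, the first gray clears only the 4.5:1 threshold and the second clears both.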

I am not saying that we ignore this impact on creator / level of effort factor. I honestly don’t know how or where best to consider it. But basing points and ultimately conformance on something so widely variable and not directly tied to human impact seems counterintuitive.


Charles Hall // Senior UX Architect

(he//him)
charles.hall@mrm-mccann.com
w 248.203.8723
m 248.225.8179
360 W Maple Ave, Birmingham MI 48009
mrm-mccann.com<https://www.mrm-mccann.com/>

[MRM//McCann]
Relationship Is Our Middle Name

Network of the Year, Cannes Lions 2019
Ad Age Agency A-List 2016, 2017, 2019
Ad Age Creativity Innovators 2016, 2017
Ad Age B-to-B Agency of the Year 2018
North American Agency of the Year, Cannes 2016
Leader in Gartner Magic Quadrant 2017, 2018, 2019
Most Creatively Effective Agency Network in the World, Effie 2018, 2019



From: John Foliot <john.foliot@deque.com>
Date: Thursday, July 18, 2019 at 1:29 PM
To: "Hall, Charles (DET-MRM)" <Charles.Hall@mrm-mccann.com>
Cc: Chris Loiselle <loiselles@me.com>, Silver Task Force <public-silver@w3.org>
Subject: [EXTERNAL] Re: thoughts points system for silver

Hi Charles,

> I also have a pretty strong opinion that the level of effort of the author / creator is both immeasurable and moot.

Well, actually, in 3rd party development shops, level of effort is measured in hours-to-perform any given task: that is usually how they pay their staff and bill their clients, and so it is both measurable and important. (At Deque, we routinely bill our clients on a combination of time and materials.)

When it comes to meeting specific requirements, some are trivially easy to do (adding the language of page declaration to the top of the site's template page(s)), whereas others are significantly harder (producing captions and audio-description resources for multi-media content). I'll also suggest that providing either (or failing to provide either) has different levels of impact on some (but not all) end-users.

Consider as well the provision of both alt text and longer descriptions: I can have a tool plug in alt="image" (or AI generated alt="may contain images of text") for all of the image files on a site, and it would technically conform (yet be functionally useless); conversely, it takes time and thought to craft useful alt text, and more so a text description for complex images. Failing to acknowledge this time and financial impact on site owners would be (I posit) a real mistake. That said, I don't think this factor alone should have a direct impact on the scoring, but it *should* be part of a scoring calculation.

In my proposal, I am simply suggesting that 'effort' be used as a multiplier in a base-score calculation: in my straw-man proposal I suggested 3 levels of easy, harder, hardest. Easy has a multiplier of 1, harder is 2X and hardest is 3X. Then, as we look at individual requirements, I am suggesting that impact on a user group or groups (or, more accurately, user-requirement(s)) would also be a scoring factor. I had used a proposed scale of 1 - 10, where lower-benefit requirements have a lower impact value, and requirements with a higher user-benefit have a higher value. Thus the calculation for a base score per requirement would be (benefit to user X effort multiplier = base score).
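
As a rough illustration of the straw-man calculation above (benefit to user X effort multiplier = base score), here is a minimal Python sketch. The function name, the dictionary, and the two benefit ratings are hypothetical values picked for illustration; they are not part of the proposal.

```python
# Effort levels and multipliers as described in the straw-man proposal.
EFFORT_MULTIPLIER = {"easy": 1, "harder": 2, "hardest": 3}


def base_score(user_benefit, effort):
    """Base score for one requirement: user benefit (1-10) times the effort multiplier."""
    if not 1 <= user_benefit <= 10:
        raise ValueError("user_benefit must be between 1 and 10")
    return user_benefit * EFFORT_MULTIPLIER[effort]


# Hypothetical ratings, assumed only for this example.
print(base_score(6, "easy"))     # e.g. a page language declaration rated 6
print(base_score(9, "hardest"))  # e.g. captions and audio description rated 9
```

Under these assumed ratings the two requirements would score 6 and 27 respectively.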

I'll note in closing that this was also why some existing SC in WCAG 2.x are AA versus A (even if the requirements are both important to the end user): the impact on the creator was also a consideration in the A/AA/AAA calculation back during the WCAG 2.0 development days.

JF

On Thu, Jul 18, 2019 at 10:34 AM Hall, Charles (DET-MRM) <Charles.Hall@mrm-mccann.com> wrote:
My understanding is that there is interest (but possibly not consensus) that the practice of usability testing – especially when it includes participation of people with a wide range of functional needs – is a behavior the guideline intends to encourage.

What is undecided / not agreed upon is how. If attached to conformance, then it must consider the level of effort and cost associated with that practice, because now there is a specific action dependency on ability to conform (more on effort below). If attached to a second currency, then that currency should have significant value, or there is little to no encouragement.

My opinion (and I say this as a UX person) is that testing itself is the wrong emphasis. What the guideline should encourage is outcomes. This point has been made in a few email threads: the act of testing is not an indicator that the results of testing and insights gained were applied or that those changes had any measurable human impact. I also have a pretty strong opinion that the level of effort of the author / creator is both immeasurable and moot. It is possible to create a conforming site {x} ways with {n} effort. It is equally possible to create a non-conforming site with clear barriers {x} ways with {n x n} effort. There is rarely causation or even correlation between effort and outcome, and when there is, it is fairly difficult to measure. It also scales down with maturity – in this case, accessibility maturity. So I could spend months and millions on usability testing and building or modifying a thing based on insights. The next thing I build or modify is going to take less effort to get the same outcome from both reusable patterns and institutional knowledge.


Charles Hall // Senior UX Architect




From: Chris Loiselle <loiselles@me.com>
Date: Tuesday, July 16, 2019 at 10:05 AM
To: Silver Task Force <public-silver@w3.org>
Subject: [EXTERNAL] thoughts points system for silver
Resent-From: Silver Task Force <public-silver@w3.org>
Resent-Date: Tuesday, July 16, 2019 at 10:04 AM

Hi Silver,

Just a thought off of today's call:


In regard to the point system, would the fact that user testing was completed at a given organization during the development of a product give them extra points vs. not completing user testing at all?



For each demographic of user testing, grading all user tests equally, would someone who tests with a user who has limited sight and a user who is hard of hearing not receive as many points as someone who tests with someone who is Blind, someone who has low vision, someone who is Deaf, someone who is hard of hearing, someone with a cognitive disability (etc.)?



What if the organization went deep on depth of testing with the user who is Blind and the user who has limited sight, but only went surface level (breadth) with multiple users, each with a different disability, vs. diving deep with two users? Would those be weighted differently? The same? I know there was discussion on ribbons, points, badges; where would that come into play?


Thank you,
Chris Loiselle


--
John Foliot | Principal Accessibility Strategist | W3C AC Representative
Deque Systems - Accessibility for Good
deque.com<http://deque.com/>
Received on Thursday, 18 July 2019 18:31:28 UTC