Re: Proposal for new version of Requirement 3.7 Motivation from Denis Boudreau on 2019-04-11 (public-silver@w3.org from April 2019)

From: Denis Boudreau <denis.boudreau@deque.com>
Date: Thu, 11 Apr 2019 17:19:36 -0400
To: John Foliot <john.foliot@deque.com>
Cc: Shawn Lauriat <lauriat@google.com>, Wilco Fiers <wilco.fiers@deque.com>, Detlev Fischer <detlev.fischer@testkreis.de>, Silver TF <public-silver@w3.org>
Message-ID: <CAC=s1AhbzXZC3BkV66K=UBuL+yMSzK7MrMXHsCZYnUVwuq3-uA@mail.gmail.com>
JF wrote:
> Like the television character Mulder in the show X-Files, I too want to
> believe. But having filled out more than one (US) VPAT over the years, the
> reality is that "Partially Supports" (formerly "Meets with Exceptions")
> tends to stay that way, and rarely gets fixed.

Very cute. Well played, sir.

JF also wrote:
> If Detlev's user "passes" something, Denis' user "struggles but completes
> the task", and my user is "stopped dead in the water" - all on the same
> page/site simply due to varying experience levels... how do we square that
> circular problem?

But surely we all agree that the findings from the usability testing the
three of us would hypothetically conduct each help identify further
problems with the site. By conducting these tests, we ultimately get to
address new sets of issues, and the process brings the expected additional
value. Issues found through usability testing, as opposed to issues found
through, say, automated or manual testing, tend to otherwise be missed by
non-disabled accessibility experts who just happen to know about WCAG. The
kind of issues that are raised by people with disabilities in usability
testing will usually relate to things we could easily miss just because we
don't have those disabilities ourselves. And that level of findings, when
addressed, definitely pushes the quality of the product further.

And JF finally wrote:
> Many of Deque's clients have thousands, if not hundreds of thousands, of
> web "pages", and measuring conformance at that scale is already problematic.
> Introducing user-testing into that scenario just made accessibility
> conformance testing significantly more expensive, and any final conformance
> model will need to address this scale problem. User testing for conformance
> might work at the boutique level, but at the enterprise level it's a bit of
> a pipe-dream... (IMHO)

Well, that's simply not true. The number of pages a site contains has very
little impact on the overall cost of usability testing when what you are
testing are flows, happy and not-so-happy paths, and precise tasks chosen
to validate assumptions about the parts of an interface you have doubts
about. This is not something that only boutique shops should be able to do.
It can just as easily be conducted by software companies or big IT
corporations, if only those who work there get why the whole effort is
worth their time, energy and resources.

The problem is not whether usability is a pipe-dream in larger, more
complex contexts. I mean, quality and accessibility could just as easily be
considered pipe-dreams if we look at it that way.



/Denis


*Denis Boudreau, CPWA* | Principal Accessibility SME & Training Lead
| 514-730-9168
Deque Systems - Accessibility for Good
Deque.com <http://www.deque.com>





On Thu, Apr 11, 2019 at 10:11 AM John Foliot <john.foliot@deque.com> wrote:

> Denis wrote:
>
> > I believe that conducting testing with people with disabilities, when
> done genuinely with the goal of user experience improvements does
> absolutely change the quality of the site under test.
>
> Like the television character Mulder in the show X-Files, I too want to
> believe. But having filled out more than one (US) VPAT over the years, the
> reality is that "Partially Supports" (formerly "Meets with Exceptions")
> tends to stay that way, and rarely gets fixed.
>
> Testing with users with disabilities isn't the same as remediating all
> issues they find, and to that end, I have to agree with Detlev:
> user-testing alone is insufficient in "boosting" a score - it's what comes
> *after* the user testing that is important, and so user-testing is a
> "process" not an end-state.
>
> Don't get me wrong - like the majority of us, I understand and appreciate
> the value of user-testing. It gives us a clearer and more informed and more
> nuanced picture of the (current) state of a web-site, but that activity
> alone does nothing to *improve* the accessibility, only to more clearly
> define the current state, good or bad.
>
> For example, I can visually see if and when I think target regions are too
> small, and/or I can "measure" those touch regions, and/or I can ask a
> mobility impaired user to try "clicking those buttons" - all three of those
> activities can be used to determine if touch regions are sufficiently
> big-enough, but why would involving an end user get me more "points"? As
> such, I also agree with Wilco - I too think a point system is an
> interesting idea, but not as part of a conformance model, which requires
> some measurable rigidity, even if we move from a Pass/Fail to a
> Bronze/Silver/Gold reporting mechanism.
>
> Additionally (and I've experienced this recently in the context of testing
> a site for a client under legal duress), not all users have the same skills
> or experience - and "issues" reported by some users may not actually be
> issues with the site/content at all, but rather the end user is
> inexperienced or is "anticipating" a behavior that isn't *mandated* (but
> might be nice to have). If Detlev's user "passes" something, Denis' user
> "struggles but completes the task", and my user is "stopped dead in the
> water" - all on the same page/site simply due to varying experience
> levels... how do we square that circular problem?
>
> Finally, as I've previously noted, I remain concerned about "scale" in the
> context of user-testing. Many of Deque's clients have thousands, if not
> hundreds of thousands, of web "pages", and measuring conformance at that
> scale is already problematic. Introducing user-testing into that scenario
> just made accessibility conformance testing significantly more expensive,
> and any final conformance model will need to address this scale problem.
> User testing for conformance might work at the boutique level, but at the
> enterprise level it's a bit of a pipe-dream... (IMHO)
>
> JF
>
> On Wed, Apr 10, 2019 at 1:16 PM Denis Boudreau <denis.boudreau@deque.com>
> wrote:
>
>> Hello all,
>>
>> Wilco certainly makes good points, but I guess I'm more optimistic than
>> he is about our ability to come up with a process that would allow Silver to
>> give more importance to usability testing as part of a conformance model,
>> without negatively impacting certain demographics in the process.
>>
>> /Denis
>>
>>
>> *Denis Boudreau, CPWA* | Principal Accessibility SME & Training Lead
>> | 514-730-9168
>> Deque Systems - Accessibility for Good
>> Deque.com <http://www.deque.com>
>>
>>
>>
>>
>>
>> On Wed, Apr 10, 2019 at 10:30 AM Shawn Lauriat <lauriat@google.com>
>> wrote:
>>
>>> Wilco,
>>>
>>>> I can't see us ever agreeing that, if you do more for people with
>>>> learning disabilities, you don't need to do as much for people with low
>>>> vision. Any point system we use can't be at a conformance layer or
>>>> guidelines layer. It has to be narrow, so we don't make the needs of one
>>>> group interchangeable with another. That means point systems at the success
>>>> criteria layer. WCAG already allows for this. Think of how color contrast
>>>> is done. Two success criteria, one at AA, one at AAA, using the same
>>>> measurement tool, with a lower threshold for AA and a higher one for AAA.
>>>
>>>
>>> Totally agree! We absolutely need conformance to cover different user
>>> needs and not allow someone to claim conformance for piling up methods for
>>> one user need and ignoring others. This requirement centers around
>>> providing a way to demonstrate and express a beyond-the-minimum level of
>>> accessibility, so building up from a base level of conformance, rather than
>>> replacing it with "awesome for blind users and broken if you have some kind
>>> of mobility impairment".
>>>
>>> Hope that helps!
>>>
>>> -Shawn
>>>
>>> On Wed, Apr 10, 2019 at 6:54 AM Wilco Fiers <wilco.fiers@deque.com>
>>> wrote:
>>>
>>>> Hey all,
>>>> I am skeptical about a point system as part of a conformance model for
>>>> accessibility. I think a point system is a cool idea, but not as part of
>>>> the conformance model.
>>>>
>>>> Point systems are great if you have different things you could do, that
>>>> lead to roughly the same end result. For example, the airports with bike
>>>> racks example is something that keeps coming up. You can do any number of
>>>> things to get more people to leave their car at home. Better public
>>>> transportation, encourage biking, encourage carpooling, etc. Any one of
>>>> them reduces cars, and all of them do it by a lot.
>>>>
>>>> Accessibility doesn't really work like that. Keyboard accessibility and
>>>> visible focus aren't interchangeable. Users need both of them. The few
>>>> places in WCAG where more than one option is acceptable, we've already left
>>>> the solution open (example: Bypass Blocks) or we've specified the available
>>>> options (example: Audio Description or Media Alternative).
>>>>
>>>> I can't see us ever agreeing that, if you do more for people with
>>>> learning disabilities, you don't need to do as much for people with low
>>>> vision. Any point system we use can't be at a conformance layer or
>>>> guidelines layer. It has to be narrow, so we don't make the needs of one
>>>> group interchangeable with another. That means point systems at the success
>>>> criteria layer. WCAG already allows for this. Think of how color contrast
>>>> is done. Two success criteria, one at AA, one at AAA, using the same
>>>> measurement tool, with a lower threshold for AA and a higher one for AAA.
>>>>
>>>> I can certainly see us having more "point systems" for different
>>>> requirements. You could require 8 points for non-text content at level A,
>>>> and 12 points at AA or whatever (just making up numbers). It might also be
>>>> possible to create a point system that will work for lots of success
>>>> criteria. But I don't see that working at the conformance level. A point
>>>> system where you exchange one user need for another seems pretty
>>>> problematic to me.
>>>>
>>>> W
>>>>
>>>> On Tue, Apr 9, 2019 at 1:59 PM Denis Boudreau <denis.boudreau@deque.com>
>>>> wrote:
>>>>
>>>>> I like the proposal with Chuck’s edits.
>>>>>
>>>>> I disagree with your position Detlev, but understand your concerns.
>>>>> The temptation to game the system would undoubtedly arise among some of the
>>>>> people out there that would want to be able to claim a quick path to
>>>>> success (oh yeah, we tested with people, and “they” said it was
>>>>> fiiiiiiine...).
>>>>>
>>>>> I’m just not able to agree with a statement such as:
>>>>>
>>>>> “[testing]... does not in itself change the quality of the site under
>>>>> test. An awful site stays awful even after a lot of user testing.”
>>>>>
>>>>> I believe that conducting testing with people with disabilities, when
>>>>> done genuinely with the goal of user experience improvements does
>>>>> absolutely change the quality of the site under test. The findings brought
>>>>> up by consulting those users are expected to bring forth positive changes.
>>>>> An awful site is supposed to get better as a result of the changes that come
>>>>> from the activity of involving those users in the process. That’s just the
>>>>> nature of the activity. But we need a way to measure that clearly in Silver.
>>>>>
>>>>> I celebrate our vision of rewarding usability testing with end users
>>>>> with disabilities. It does expose our model to abuse - I certainly share
>>>>> Detlev’s concerns here - but I’m sure that as we get to defining the
>>>>> details of how the scoring system will pan out, we’ll find ways to reward
>>>>> usability testing for aspects that actually provide value, not for things
>>>>> that pay lip service to the idea of making the product or service
>>>>> accessible.
>>>>>
>>>>> As an example, we could consider pairing aspects of the usability
>>>>> testing sessions with tangible results or improvements that came directly
>>>>> from this testing. That way, the testing outcomes and related improvements
>>>>> could be linked to specific methods for instance, or techniques or whatnot,
>>>>> and we could measure just how many of the improvements came directly from
>>>>> involving end users with disabilities in the overall process. The more
>>>>> improvements come directly from end users' contributions, the higher the points.
>>>>>
>>>>>
>>>>> /Denis
>>>>>
>>>>> —
>>>>> Denis Boudreau
>>>>> Principal accessibility SME & Training lead
>>>>> Deque Systems, Inc.
>>>>> 514-730-9168
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Apr 9, 2019 at 04:30 Detlev Fischer <
>>>>> detlev.fischer@testkreis.de> wrote:
>>>>>
>>>>>> As I have said before, I think the mere fact that testing with users
>>>>>> with disabilities has taken place should not be rewarded since it does
>>>>>> not in itself change the quality of the site under test. An awful site
>>>>>> stays awful even after a lot of user testing. If then, as a result of
>>>>>> such testing, the accessibility and/or usability is improved, that
>>>>>> should impact also the conformance to measurable criteria (whether
>>>>>> absolute or score-based) - and I am happy to see those criteria extended
>>>>>> to realms so far difficult to measure.
>>>>>>
>>>>>> On 08.04.2019 at 20:42, Jeanne Spellman wrote:
>>>>>> > Here is the proposal for revision of Requirement 3.7 Motivation as
>>>>>> > requested by AGWG to make it measurable.
>>>>>> >
>>>>>> > Motivation
>>>>>> >
>>>>>> > The Guidelines motivate organizations to go beyond minimal
>>>>>> > accessibility requirements by providing a scoring system that rewards
>>>>>> > organizations that demonstrate a greater effort to improve
>>>>>> > accessibility. For example, Methods that go beyond the minimum (such
>>>>>> > as: Methods for Guidelines that are not included in WCAG 2.x A or AA,
>>>>>> > task-completion evaluations, or testing with users with disabilities)
>>>>>> > are worth more points in the scoring system.
>>>>>> >
>>>>>> >
>>>>>> >
>>>>>>
>>>>>> --
>>>>>> Detlev Fischer
>>>>>> Testkreis
>>>>>> Werderstr. 34, 20144 Hamburg
>>>>>>
>>>>>> Mobil +49 (0)157 57 57 57 45
>>>>>>
>>>>>> http://www.testkreis.de
>>>>>> Consulting, testing and training for accessible websites
>>>>>>
>>>>>>
>>>>>> --
>>>>> /Denis
>>>>>
>>>>> --
>>>>> Denis Boudreau
>>>>> Principal SME & trainer
>>>>> Web accessibility, inclusive design and UX
>>>>> Deque Systems inc.
>>>>> 514-730-9168
>>>>>
>>>>> Keep in touch: @dboudreau
>>>>>
>>>>
>>>>
>>>> --
>>>> *Wilco Fiers*
>>>> Axe product owner - Co-facilitator WCAG-ACT - Chair ACT-R / Auto-WCAG
>>>>
>>>
>
> --
> *John Foliot* | Principal Accessibility Strategist | W3C AC
> Representative
> Deque Systems - Accessibility for Good
> deque.com
>
>
Received on Thursday, 11 April 2019 21:20:38 UTC