RE: thoughts points system for silver from Adam Cooper on 2019-07-27 (public-silver@w3.org from July 2019)

From: Adam Cooper <cooperad@bigpond.com>
Date: Sun, 28 Jul 2019 09:49:33 +1000
To: <public-silver@w3.org>
Cc: "'Detlev Fischer'" <detlev.fischer@testkreis.de>
Message-ID: <000401d544d5$f1c24060$d546c120$@bigpond.com>
“Please let us keep it as simple as possible.”

 

Agreed. It’s hard enough explaining the current scheme to disinterested parties let alone something more complicated … 

 

In my view, Silver might be better served by focusing more on determining how to scope issues and how conformance might apply to entire websites rather than overcooking the conformance model itself. 

 

The ‘webpage as the unit of conformance’ might have worked for handcrafted static sites more than a decade ago, but it is no longer working for 300+ screen web applications.

 

Also, greater attention might be paid to ensuring that navigating and operating web content is more efficient as well as effective at a lower level of conformance … 

 

 

 

 

 

From: Detlev Fischer [mailto:detlev.fischer@testkreis.de] 
Sent: Wednesday, July 24, 2019 12:06 AM
To: public-silver@w3.org
Subject: Re: thoughts points system for silver

 

On 19.07.2019 at 22:44, Frederick Boland wrote:



Need to be careful that the model does not get too complicated or difficult to implement - if it is hard to use or understand, no matter how complete it is, people will tend not to follow it. Also, model implementation and testing need to be accessible as well.


I strongly agree with the point Frederick is making.

All scoring systems proposed so far (including second currency, ribbons etc.) seem a *lot* more complex than what we have now - both in their eventual application and in terms of explaining / understanding what the resulting score means.

I think it is important to be able to conduct a valid conformance evaluation and arrive at a meaningful score reflecting the object under test without the need for user tests. 

The results of task-based methods, if used, should be captured in a way that can be verified independently of any specific method. Example: A complex navigation forcing a sighted keyboard user in a user test to traverse endless submenus may result in a bad score. BUT the result can also be expressed as navigation distance (e.g. number of tabs needed to arrive at content), which could also be arrived at in heuristic testing. The link to 2.4.1 is obvious - so this might become a 2.4.1 task-based method that can be quantified (and its result be verified independent of particular users). Such a test might be combined with the headings or landmark methods of conforming to 2.4.1, so that in Silver those methods alone may no longer be sufficient to meet 2.4.1 or its Silver equivalent (since they do little for the average keyboard user).
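
To illustrate the navigation-distance idea above, here is a minimal sketch. It assumes the focus order of a page has already been recorded as a simple list; the element names and the function name are invented for illustration and are not taken from any Silver draft or tool.

```python
from typing import Sequence


def tab_distance(focus_order: Sequence[str], target: str) -> int:
    """Return the number of Tab presses needed to reach `target`.

    `focus_order` lists focusable elements in the order a keyboard
    user reaches them, starting from the top of the page.
    """
    # The first Tab press lands on index 0, so distance is index + 1.
    return list(focus_order).index(target) + 1


# Hypothetical page: five menu items come before the first content link.
focus_order = ["menu-1", "menu-2", "menu-3", "menu-4", "menu-5", "content-link"]

distance = tab_distance(focus_order, "content-link")
print(f"Tab presses to reach content: {distance}")
```

The same number could be obtained by a tester pressing Tab by hand or by observing a user, which is the point: the result does not depend on which method produced it.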

BTW, the three levels (automatic=1, manual=2, user test=3) mentioned under complexity of testing seem a bit misleading, since most automatic tests require a second human step to complete or verify the automatic assessment.

Information on organisational behaviour, culture, sustainable practices and the like is desirable but, in my view, belongs to a different stratum and cannot be quantified easily. And quality is an obvious issue, as noted by others. Whether we call that ribbons, stars or whatever is not important. The issue is that merging that stratum and a score based on verifiable tests into *one* currency creates a fudge, even if some required minimum level across 9 (or 4?) functional needs may somehow be established. It will be very complex to create such an overall score, to justify it, and to explain what it means. People may perceive it as so complex and arbitrary that they may cease to give any credence to such a compound conformance statement.

There will be many situations of a11y evaluation where testers will have no access to information on accessibility / usability evaluations and tests that may have been used (possibly by 3rd parties) in the design phase, during development, or in acceptance testing. A conformance model should be robust enough to be used in contexts when no such information is available to evaluators. 

To sum up:  I believe it is crucial for any score to be potentially verifiable independent of the specific tests (heuristic, user test or whatever) that were carried out. What matters is the evident accessibility of the site for the user at the point of use - not what the organisation has done (or claims to have done) to test (and in turn improve) it. 

Please let us keep it as simple as possible.

Detlev



 



On Jul 19, 2019, at 9:42 AM, John Foliot <john.foliot@deque.com> wrote:

Charles wrote:

 

> I am advocating that while that position is valid, it is secondary. 

We are not in opposition. Secondary, perhaps even tertiary... the point is, it *IS* a factor in any score calculation. If we want site owners to do more than just the absolute minimum, we need to make it worth their while. 

 

Bruce previously introduced the idea of multiple 'currencies', and all I am suggesting is that one of the currencies is effort: not the end-all or be-all, but a factor. Failing to recognize that, giving equal credit for the easier things as well as the harder things, will mean the easier things get done frequently, and the harder things get done less frequently (if ever), especially if you *really* want sites to progress beyond Bronze, which by its very nature and description will require additional effort.


> I feel that the guidelines should foremost be about prevention – how to avoid barriers in the first place.   

 

I do understand that. That too is a common sentiment in this Task Force. 

 

Prevention like that, however, cannot be measured, certainly not at scale or in the context of 'conformance', and measurement and a conformance model are critical parts of our deliverable. It is (I believe) important to remember that the mandate of the AG Working Group, and this Task Force, is just that: to create something that can be used for measurement of technologies and their accessibility:

 

Develop Silver (provisional name) to succeed WCAG 2.x, using a different framework to allow it to address more disability needs, emerging technologies, and support (for instance, through supporting techniques) for the technologies that impact accessibility such as digital content and applications, authoring tools, user agents, assistive technologies, software and web applications, and operating systems. This will use a different conformance model, which will require support from policy stakeholders.  
(source: https://raw.githack.com/w3c/wcag/charter-2019/charter.html)


Additionally, and to further emphasize this point, it is also important to remember that the ACT TF is another (sibling) Task Force under the AG WG 'umbrella' that is all about standardized testing rules (another form of measurement).

 

Thus, while I support the overall approach that Silver is taking in an educational vein (preventative) through plain-language guidance, and a more educational 'front-end' (etc.), what is also critical, and closer to our remit, is the measurement of progress and success, in part to meet the needs of those policy stakeholders.

 

The mission of the Accessibility Guidelines Working Group (AG WG) is to develop specifications to make content on the Web accessible for people with disabilities and to participate in the development and maintenance of implementation support materials for the Web Content Accessibility Guidelines.
       (and)
...Improving support for testing WCAG is an important priority for the Working Group.

(source:  https://www.w3.org/2017/01/ag-charter)

 

Contrasted with:

 

The mission of the Education and Outreach Working Group <https://www.w3.org/WAI/EO> is to develop strategies and resources to promote awareness, understanding, implementation, and conformance testing for W3C accessibility standards; and to support the accessibility work of other W3C Groups.
(source: https://www.w3.org/WAI/EO/charter2017)

 

The differences are nuanced, but they are there: it's all about approach.

 

Experienced hands understand that it is critical for organizations to be pro-active in their accessibility efforts, and there is no disagreement that (as we measure things) "an ounce of prevention is worth a pound of cure", but from my perspective, and organizationally at the W3C, I see the AG WG as more about the ounces and pounds, and less so about the preventions and cures, which is the remit of the EO WG. Yes, they need to work together, and this is not about rejecting that aspect of a more holistic endeavor, but it is my personal belief that our approach should be from measurement towards prevention, and not prevention towards measurement, which I believe also more closely aligns with this Task Force's chartered mandate.

 

JF

 

 

 

On Fri, Jul 19, 2019 at 7:11 AM Hall, Charles (DET-MRM) <Charles.Hall@mrm-mccann.com> wrote:

John, 

 

I don’t think your points are random or invalid.

 

I totally hear the “fun” argument. And I had considered adding to the analogy as well, but decided to keep it as simple as possible. Since the guidelines are about utility, one could fit a wheelchair into the Yugo; and a service dog; and customize or adapt or extend it with aftermarket equipment. The point was simply that it is absolutely possible – and I would argue common (in my world) – to get more utility with less effort. So rewarding effort is not the same as nor as important as rewarding outcomes or human impact.

 

The other major point is that effort is transient and ephemeral. It changes rapidly, like technology. What is a high level of effort today could be free tomorrow. This means the governance model for Silver would have to constantly re-evaluate the points or factor value associated with level of effort for each piece of guidance.

 

This issue of cost or difficulty or level of effort for the creator seems pretty important to a number of people in this group. I suspect that is largely due to the position of examining the conformance of an existing thing and the current perceived cost of remediation. I am advocating that while that position is valid, it is secondary. The primary position should be on the creation of the thing in the first place. I feel that the guidelines should foremost be about prevention – how to avoid barriers in the first place. Then, it can serve the remediation side by helping creators recognize those barriers. But it doesn’t need to acknowledge the business impact of that remediation, and should instead speak to the human impact (which is also a business case).

 

 

Charles Hall // Senior UX Architect

 

(he//him)

 <mailto:charles.hall@mrm-mccann.com?subject=Note%20From%20Signature> charles.hall@mrm-mccann.com

w 248.203.8723

m 248.225.8179

360 W Maple Ave, Birmingham MI 48009 

 <https://www.mrm-mccann.com/> mrm-mccann.com

 

MRM//McCann

Relationship Is Our Middle Name

 

Network of the Year, Cannes Lions 2019

Ad Age Agency A-List 2016, 2017, 2019

Ad Age Creativity Innovators 2016, 2017

Ad Age B-to-B Agency of the Year 2018

North American Agency of the Year, Cannes 2016

Leader in Gartner Magic Quadrant 2017, 2018, 2019

Most Creatively Effective Agency Network in the World, Effie 2018, 2019

 

 

 

From: John Foliot <john.foliot@deque.com>
Date: Thursday, July 18, 2019 at 4:29 PM
To: "Hall, Charles (DET-MRM)" <Charles.Hall@mrm-mccann.com>
Cc: Rachael Bradley Montgomery <rachael@accessiblecommunity.org>, Silver Task Force <public-silver@w3.org>
Subject: [EXTERNAL] Re: thoughts points system for silver

 

Hi Charles,

 

Running with your analogy:

 

> If the functional needs are met to travel from point a to point b, a 25 year old Yugo is as sufficient as a brand new Ferrari.  

 

While this is indeed correct, I will also argue that riding in one is far superior to riding in the other - that the experience of getting from point A to point B is significantly better in the Ferrari than the Yugo, but that it also costs more. I'll never forget the day (and recount this story frequently) when my buddy and our colleague Victor Tsaran said to me "You know John, today we have the technology to make web sites accessible" (We were chatting about ARIA at the time.) He continued "But I can't wait for the day when they start making it fun".

 

Driving a Ferrari is more fun than pushing a Yugo down the street, and so if we are to measure "fun", I'd give the Ferrari a 9 but the Yugo a 2. 

 

Are those numbers subjective? You betcha, but among the people reading this email, I don't think we'd have wildly differing scores: some may consider the Ferrari ride should be a 10, or an 8, and/or the Yugo ride a 5 instead of a 2, but I don't think we'd ever get somebody arguing that the Yugo is more "fun" than the Ferrari (lacking a clear definition of fun in this discussion). So the take-away is that the more money you spend on the ride, the more "fun" it is - there is indeed a cost-benefit ratio there that could be measured.

 

Continuing further with the analogy: fun alone isn't the only thing we're measuring when comparing the Yugo against the Ferrari (or vice-versa). Chances are, the Ferrari also burns more fuel than the Yugo, the cost of maintenance for the two vehicles may also vary wildly (or be very, very similar), what's the cost of insurance... there are lots of "costs" towards the final goal, which I've expanded to more than just getting from point A to point B. And as we think deeper about what real equitable access means, we need to be thinking beyond the absolute minimum of getting to point B from point A, because from my perspective, I'd rather see a fleet of Ferraris on the road than a collection of Yugos (and I also recognize that I may not be able to afford a Ferrari, but hey, my budget allows for a Lexus...)

 

Measuring the act of simply getting from point A to point B is of course important (perhaps the most important requirement), but there are numerous other factors - including effort and cost - that contribute to the overall experience, which has always been my understanding of the ultimate goal. And unlike the binary Yugo or Ferrari, we're moving to a scale that recognizes other 'classes' of vehicle between the Yugo (fail) and Ferrari (pass) options, like "Lexus" (70%).

 

More random thoughts

 

JF 

 

 

On Thu, Jul 18, 2019 at 2:39 PM Hall, Charles (DET-MRM) <Charles.Hall@mrm-mccann.com> wrote:

I believe that a “process that should ensure results” is one that would be rewarded by the second currency of ribbons model, since “should ensure” does not equal “meets need”.

 

I do agree that if the model supports any points or currency for the practice of usability testing, that said practice should verify (and possibly quantify) that it included people of a variety of functional needs. 

 

 

Charles Hall // Senior UX Architect

 


 

 

 

From: Rachael Bradley Montgomery <rachael@accessiblecommunity.org>
Date: Thursday, July 18, 2019 at 3:04 PM
To: Silver Task Force <public-silver@w3.org>
Subject: [EXTERNAL] Re: thoughts points system for silver
Resent-From: Silver Task Force <public-silver@w3.org>
Resent-Date: Thursday, July 18, 2019 at 3:03 PM

 

Hello, 

 

When evaluating accessibility, I've noticed there are two approaches. One, exemplified by the current WCAG, evaluates results. The other, exemplified by the Disability Equality Index <https://disabilityin.org/what-we-do/disability-equality-index/>, measures the process that should ensure results. I too have a UX background, and in UX we can test the process that should ensure results by asking whether usability testing, cognitive walkthroughs, design documentation, etc. were done. But usually we test the results, which in UX are number of clicks, time to complete, number of errors, etc., depending on the usability measure being tested.

 

We are, in some ways, mixing apples and oranges by making whether the process is in place a measure in Silver. Would it make sense to instead state that the usability measures tested should demonstrate a comparable experience in time, number of clicks, errors, etc. between people with and without disabilities?
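
One way to picture "comparable experience" is as a ratio per usability measure. The sketch below is only an illustration of that line of thinking: the measure names and all numbers are made up, and nothing here reflects an agreed Silver method or threshold.

```python
def comparability_ratio(with_disability: float, without_disability: float) -> float:
    """Ratio of a usability measure between two groups (1.0 means identical)."""
    if without_disability <= 0:
        # A zero baseline (e.g. no errors at all) cannot be expressed as a ratio.
        raise ValueError("baseline measure must be greater than zero")
    return with_disability / without_disability


# Hypothetical task results: (users with disabilities, users without).
measures = {
    "time_seconds": (90.0, 60.0),
    "clicks": (12.0, 10.0),
    "errors": (2.0, 2.0),
}

ratios = {
    name: comparability_ratio(with_d, without_d)
    for name, (with_d, without_d) in measures.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

How close to 1.0 a ratio would need to be to count as "comparable" is exactly the kind of question that would still have to be decided.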

 

This isn't a fully formed thought but rather a suggested line of thinking. 

 

Regards,

 

Rachael

 

On Thu, Jul 18, 2019 at 2:51 PM Hall, Charles (DET-MRM) <Charles.Hall@mrm-mccann.com> wrote:

My thought here was simply that outcomes are measurable by meeting the guideline.

I still also think / believe that from a conformance standpoint, an individual guideline starts with one or maybe two functional needs in mind when it is created, like “Use of Color” (addresses color perception specifically), and that meeting that need gets the points. But if I as a creator then test my solution with people that had multiple other functional needs and learn that a warning icon in addition to my red error text was a problem for people with anxiety disorders, and not using the word “error” in the text was a problem for people without usage of vision, and subsequently changed those 2 things to solutions that also worked for those functional needs, then I have essentially made a bigger human impact and somehow the score should reflect that.


Charles Hall // Senior UX Architect




On 7/18/19, 11:57 AM, "Léonie Watson" <lw@tetralogical.com> wrote:


    On 18/07/2019 16:33, Hall, Charles (DET-MRM) wrote:
    > My opinion (and I say this as a UX person) is that testing itself is the 
    > wrong emphasis. What the guideline should encourage is outcomes...
    > I also have a pretty strong opinion that the level of effort of the 
    > author / creator is both immeasurable and moot. 


    I agree on both counts.

    Do you have any thoughts on how we might gauge the outcomes?


    Léonie.

    > 
    > *Charles Hall* // Senior UX Architect
    > 
    > 
    > *From: *Chris Loiselle <loiselles@me.com>
    > *Date: *Tuesday, July 16, 2019 at 10:05 AM
    > *To: *Silver Task Force <public-silver@w3.org>
    > *Subject: *[EXTERNAL] thoughts points system for silver
    > *Resent-From: *Silver Task Force <public-silver@w3.org>
    > *Resent-Date: *Tuesday, July 16, 2019 at 10:04 AM
    > 
    > Hi Silver,
    > 
    > Just a thought off of today's call:
    > 
    > In regard to point system, would the fact that user testing was 
    > completed at a given organization during the development of a product 
    > give them extra points vs. not completing user testing at all?
    > 
    > 
    > 
    > For each demographic of user testing, grading all user tests equally, 
    > would someone who tests with a user that has limited sight and a user 
    > that is hard of hearing not receive as many points as someone that tests 
    > with someone who is Blind, someone who has low vision, someone who is 
    > Deaf,  someone who is hard of hearing, someone with a cognitive 
    > disability (etc.)?
    > 
    > 
    > 
    > What if the organization went deep on depth of testing with the user who 
    > is Blind and the user who has limited sight, but only went surface level 
    > (breadth) with multiple users each with a different disability vs. 
    > diving deep with two users ? Would those be weighted differently? The 
    > same? I know there was discussion on ribbons, points, badges, where 
    > would that come into play?
    > 
    > Thank you,
    > Chris Loiselle
    > 
    > This message contains information which may be confidential and 
    > privileged. Unless you are the intended recipient (or authorized to 
    > receive this message for the intended recipient), you may not use, copy, 
    > disseminate or disclose to anyone the message or any information 
    > contained in the message. If you have received the message in error, 
    > please advise the sender by reply e-mail, and delete the message. Thank 
    > you very much.

    -- 
    @TetraLogical TetraLogical.com







 

-- 

Rachael Montgomery, PhD

Director, Accessible Community

rachael@accessiblecommunity.org

 

 




 

-- 

John Foliot | Principal Accessibility Strategist | W3C AC Representative
Deque Systems - Accessibility for Good
 <http://deque.com/> deque.com

 




 


 





-- 
Detlev Fischer
Testkreis
Werderstr. 34, 20144 Hamburg
 
Mobil +49 (0)157 57 57 57 45
 
http://www.testkreis.de
Consulting, testing and training for accessible websites
Received on Saturday, 27 July 2019 23:50:11 UTC