RE: severity & criticality metrics

I have argued against the use of severity/criticality metrics in functional testing for more than 20 years, because it’s ridiculous to try to distil something that’s multi-faceted and non-fungible down to a single word or number. It makes even less sense to apply it to accessibility issues. For people who insisted on categorising issues, Jerry Weinberg advocated using a scale of two values – are you going to fix it or are you not?

Since different people will have different perspectives on each issue, I recommend triage sessions involving the relevant stakeholders. Management by numbers doesn’t work, at least for this sort of thing.

Steve Green
Managing Director
Test Partners Ltd

From: Adam Cooper <cooperad@bigpond.com>
Sent: Tuesday, August 6, 2024 4:21 AM
To: 'Juliette McShane Alexandria' <mcshanejuliette@gmail.com>
Cc: 'bryan rasmussen' <rasmussen.bryan@gmail.com>; w3c-wai-ig@w3.org
Subject: RE: severity & criticality metrics

Hi Juliette,

So, in essence, as the back and forth in the GitHub issue you cite demonstrates, either W3 has got the assignment of success criteria to levels of conformance wildly wrong in certain cases, or the assignment of success criteria to their respective levels of conformance is just part and parcel of conforming to a specification.

In my experience in Australia, the majority of auditors, testers, evaluators or whoever else undertakes a metrical analysis of web content against success criteria have likely never met someone with a disability, let alone had access to good usability insights or information about populations of people with disability, so their subjective assessments of ‘user impact’ (as if this were the only factor) are highly contestable in my view.

That W3 has never made explicit the process for assigning success criteria to levels of conformance, apart from citing working group consensus, is definitely a shortcoming of the specification, but it is largely irrelevant to honouring both its letter and spirit.

That practitioners don’t understand why this or that success criterion is assigned to this or that level of conformance is immaterial to identifying a failure of a given success criterion.

This is not a passionate defence of WCAG 2.x. Far from it. It is an acknowledgement that this is the nature of the WCAG 2.x conformance model, with all its many flaws. That is to say, it is what it is, and to overlay it with a severity metric that has its own many weaknesses and has been imposed by the ICT industry undermines any claim to conformance in my view.

This is not to say, on the other hand, that WCAG 2.x should be beyond question – it’s a necessary and healthy debate to have – but the use of functional testing severity/criticality metrics should also be carefully scrutinised.

As I asked below, what is driving the use of severity/criticality metrics which have been co-opted from functional testing? Is it historical accident? Client expectation? Some kind of orthodoxy? Is WCAG 2.x in this regard so flawed it is no longer fit for purpose? How are conformance claims affected? How is the usability of web content impacted by these metrics for people with disability?



From: Juliette McShane Alexandria <mcshanejuliette@gmail.com>
Sent: Tuesday, August 6, 2024 12:52 AM
To: Adam Cooper <cooperad@bigpond.com>
Cc: bryan rasmussen <rasmussen.bryan@gmail.com>; w3c-wai-ig@w3.org
Subject: Re: severity & criticality metrics

Hi Adam,

In my experience the WCAG levels do not necessarily map to the severity/impact of the defect on the affected user group. In some cases they absolutely do... We always mark a failure of 2.1.1, a level A SC, as critical – the highest severity in our matrix.

However, a failure of text contrast, a level AA SC, could be anywhere from minor to critical. Minor if the failure is in text that may be considered less important because it repeats content provided elsewhere on the page (perhaps marketing the same product in virtually the same terms); critical if the failure is both significant (very low contrast) and affects important content such as instructions or field labels necessary to complete a process.
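For what it’s worth, the “very low contrast” judgement is quantifiable: WCAG 2.x defines relative luminance and a contrast ratio of (L1 + 0.05) / (L2 + 0.05), with 4.5:1 as the level AA minimum for normal-size text. A minimal Python sketch of that calculation (function names are my own, not from any tool):

```python
def relative_luminance(rgb):
    """WCAG 2.x relative luminance from 0-255 sRGB components."""
    def channel(c):
        c = c / 255
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (channel(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(fg, bg):
    """Contrast ratio as defined in WCAG 2.x: (L1 + 0.05) / (L2 + 0.05)."""
    l1, l2 = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (l1 + 0.05) / (l2 + 0.05)

# Black text on a white background: the maximum possible ratio, 21:1.
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 1))  # 21.0

# Light grey on white: well below the 4.5:1 minimum for normal text,
# so a tester could defensibly call this a "significant" contrast failure.
print(round(contrast_ratio((200, 200, 200), (255, 255, 255)), 2))
```

Having the number in hand at least makes the “significant vs. marginal failure” part of the severity call reproducible, even if the “important content” part remains a judgement.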

This issue related to conformance levels (https://github.com/w3c/wcag/issues/3889) encapsulates our challenges with using conformance levels to help clients prioritize defects for remediation.

Best,
Juliette

On 8/2/2024 7:52:01 PM, Adam Cooper <cooperad@bigpond.com> wrote:
Thanks Juliette.

I was aiming to keep this off-list because it is off the topic of recent changes to the understanding documents, so I have changed the subject.

I have always been a little uncomfortable with a functional testing-style severity overlay on levels of conformance like the below.

The definitions of levels of conformance already take into account the impact of a failure of a given success criterion on the people using a product.

In Australia, most government agencies don’t bother resolving what the schema below calls moderate or minor defects in anything that would be considered a timely manner, if at all, regardless of the impact on conformance or user experience. And most won’t tolerate a ‘sev1’ or ‘critical’ under the definitions below because it still carries the meaning assigned to it by functional testing schemes (and they don’t think that accessibility is important).

So I am wondering what the purpose of using these descriptors really is (apart from that urge we all have to try to get accessibility through to developers)?

The difficulties I have with these kinds of schemas are:

  1.  The assignment of severity metrics is largely subjective and may be inconsistent from tester to tester
  2.  Level AA defects can be ‘more severe’ than level A defects and may therefore be resolved first
  3.  This type of schema doesn’t account for defect penetration or frequency in its measuring of impact on people using a product
  4.  The definition of minor defects below implies some deeper and more profound interpretation and application of WCAG that isn’t applied to S1–S3

I am interested to know how the schema below fits with the concept of conformance and conformance levels in WCAG 2.x, and why levels of conformance are not used on their own to determine the order in which defects are resolved. Is it historical accident? Orthodoxy? Client expectation?

Cheers,
Adam


From: Juliette Alexandria <mcshanejuliette@gmail.com>
Sent: Saturday, August 3, 2024 10:39 AM
To: Adam Cooper <cooperad@bigpond.com>
Cc: bryan rasmussen <rasmussen.bryan@gmail.com>; w3c-wai-ig@w3.org
Subject: Re: Recent changes to the WCAG 2.2 SC 2.1.1 Understanding page

Hi Adam,

Here are our explanations of severity levels:


Critical Defects

Critical defects represent the most severe level of accessibility issues on a website. These defects are characterized by their completely obstructive nature, which prevents users from accessing vital information, engaging with key components, or successfully completing essential processes on the website. Such defects pose a substantial barrier, making it impossible for users, especially those with disabilities, to utilize the website as intended. This category demands immediate attention and prompt remediation to ensure the website meets the WCAG 2.2 Level AA standards for accessibility.


Serious Defects

Serious defects are significant accessibility issues that create substantial barriers for users, particularly those with disabilities. While not entirely blocking like critical defects, serious defects severely hinder the user's ability to access information, interact with website components, or complete processes. These defects can lead to a frustrating and challenging user experience, necessitating considerable effort or alternative strategies to navigate the website. Addressing these defects is crucial for enhancing the website's usability and aligning with the WCAG 2.2 Level AA compliance standards.


Moderate Defects

Moderate defects are accessibility issues that, while not completely obstructive, still pose challenges for users. These defects often require additional navigation efforts, increased cognitive energy, and may complicate the user's understanding of the content. However, they do not entirely prevent access to information or completion of processes. Moderate defects can create an inefficient and tiresome experience for users, particularly those relying on assistive technologies. Rectifying these issues is important for improving the overall user experience and ensuring compliance with WCAG 2.2 Level AA guidelines.


Minor Defects

Minor defects are the least severe category of accessibility issues, typically involving technical non-compliance with specific WCAG specifications. These defects have minimal, if any, impact on the user experience. They might present challenges only in rare use cases, cause slight navigation inconveniences, or result in a bit of increased verbosity when using assistive technologies. While these defects are not critical for basic website functionality, addressing them contributes to a more polished and fully accessible website, adhering to the finer details of the WCAG 2.2 Level AA standards.

We recognize that not all defects will affect each user to the same degree, but if a user will be affected, this is how we evaluate the severity.

We advise our clients to combine this with page traffic metrics, or essential flows and processes within their site.

Combining this typically results in the most impactful issues being resolved in the most critical areas first.
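The prioritisation described above — severity first, tie-broken by traffic to the affected page — can be sketched as a simple sort. A hypothetical illustration only (the field names, example data, and tie-breaking rule are mine, not an actual tool or Juliette’s real process):

```python
# Rank labels from the severity matrix above; lower number = fix sooner.
SEVERITY_RANK = {"critical": 0, "serious": 1, "moderate": 2, "minor": 3}

# Illustrative defect records: success criterion, assigned severity,
# and monthly visits to the page where the defect occurs.
defects = [
    {"sc": "1.4.3", "severity": "minor",    "page_visits": 120_000},
    {"sc": "2.1.1", "severity": "critical", "page_visits": 8_000},
    {"sc": "1.1.1", "severity": "serious",  "page_visits": 95_000},
]

# Most severe defects first; among equally severe defects, the page
# with the most traffic comes first (hence the negated visit count).
ordered = sorted(defects,
                 key=lambda d: (SEVERITY_RANK[d["severity"]], -d["page_visits"]))
print([d["sc"] for d in ordered])  # ['2.1.1', '1.1.1', '1.4.3']
```

Note the critical keyboard defect outranks the serious one even though its page gets far less traffic — consistent with "the most impactful issues in the most critical areas first".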

Of course, we do make it clear that they cannot claim full conformance unless they address all issues of all severity levels.

We are also very careful to differentiate between strict WCAG defects and usability issues. Both can have any severity, but the former is specifically mapped to a specification that has legal ramifications in many locales, while the latter is not.

Best,
Juliette

On Fri, Aug 2, 2024 at 5:20 PM Adam Cooper <cooperad@bigpond.com> wrote:
Many CRM platforms use proprietary and unconventional keystrokes for performing operations that would otherwise be ‘standard’, like pressing F4 to expand a dropdown in SAP …

The highly questionable assumption being that people will either trawl through the help pages or undergo intensive training and thereby come to know these keystrokes by osmosis.

But then, all CRM user interfaces are notoriously badly designed and poorly coded garbage so being unusable is par for the course, I guess.


From: bryan rasmussen <rasmussen.bryan@gmail.com>
Sent: Saturday, August 3, 2024 2:22 AM
To: Steve Green <steve.green@testpartners.co.uk>
Cc: Patrick H. Lauke <redux@splintered.co.uk>; w3c-wai-ig@w3.org
Subject: Re: Recent changes to the WCAG 2.2 SC 2.1.1 Understanding page

I actually find it very weird, the concept of undiscoverable keyboard interactions – why would this ever exist? Is it sort of like Easter eggs, where the devs put them in for themselves and nobody else?
I would expect that if someone puts in a keyboard interaction they want it to be discoverable and usable, and if it isn't, that is actually a bug in their program that they would like you to point out, whether or not it is an accessibility issue.


On Fri, Aug 2, 2024 at 2:23 PM Steve Green <steve.green@testpartners.co.uk> wrote:

Thanks to everyone for all the responses. They raise a couple of questions, though:



  1.  If data entry requires the use of an undiscoverable keyboard interaction (and we do encounter them), can we report a non-conformance of SC 3.3.2 (Labels or Instructions)? The normative text and Understanding page don't mention this at all - they focus entirely on the labelling of controls and data validation rules.



  2.  If undiscoverable keyboard interactions relate to functionality other than data entry, it appears that they don't violate any success criterion. Surely that can't be right.



After spending an hour trawling through GitHub, I have some understanding of it. It's pretty daunting for someone who doesn't use GitHub in their work. It's safe to say I would never have found that Commit page if I didn't know it existed. And the distinction between Issues and Discussions is far from clear.



I have subscribed to notifications and will participate as best I can. Sadly, membership is unaffordable for me.



I had a look at the keyboard.html commits (https://github.com/w3c/wcag/commits/main/understanding/20/keyboard.html), but it’s full of all kinds of stuff. What I really want is a changelog for each Understanding page (and perhaps other pages such as techniques). I have no idea how easy that would be, but I will raise an issue anyway.



Steve



-----Original Message-----
From: Patrick H. Lauke <redux@splintered.co.uk>
Sent: Friday, August 2, 2024 10:37 AM
To: w3c-wai-ig@w3.org
Subject: Re: Recent changes to the WCAG 2.2 SC 2.1.1 Understanding page



Thanks Bryan, these are all useful and good observations.



To the original point, these are all things that are not normatively required by the SC, and never have been. Many auditors have added them in their own interpretation of what 2.1.1 should say, treating these factors as part of deciding whether or not content passes or fails 2.1.1, even though this was not in the spec per se. Hence the recent additions to the Understanding in 2.2 tried to clarify this, as the ambiguity historically led to inconsistent audit results.



P

--

Patrick H. Lauke



* https://www.splintered.co.uk/


* https://github.com/patrickhlauke


* https://flickr.com/photos/redux/


* https://mastodon.social/@patrick_h_lauke

Received on Tuesday, 6 August 2024 07:01:54 UTC