- From: John Foliot <john@foliot.ca>
- Date: Tue, 10 Aug 2021 13:23:33 -0400
- To: John Foliot <john@foliot.ca>
- Cc: WCAG <w3c-wai-gl@w3.org>
- Message-ID: <CAFmg2sX8wVv9EQdY7v4Hk1H3PxnLuYG2hH5pNHweaGosCr3kPA@mail.gmail.com>
Doh! s/principals/Principles

JF

On Tue, Aug 10, 2021 at 1:21 PM John Foliot <john@foliot.ca> wrote:

Hello all,

First, thanks to the chairs for allowing me to present my alternative scoring proposal. As noted on the call, while the PPT deck is available on a Google drive, Google's conversion of that deck to 'Sheets' breaks some of the formatting. If that is an issue for you (or if you are unable to access the Google drive, perhaps due to firewall considerations), please let me know and I would be happy to forward you a copy of the PPT deck.

While I did not spend any time focusing on the "callout bubbles" in the deck, each comment comes from the first round of feedback and is linked in the deck to the relevant Issue in GitHub.

*Recap of the main ideas:*

- Two ways of achieving "points" that work in tandem: unambiguous unit tests, and adoption of Protocols.

- Use EARL (mandated) to report adoption of Protocols (the public declaration / public accountability piece). EARL could also be used in reporting the scope (the User Generated content discussion, for example), and because EARL can be output in multiple formats, the data could also be exported as JSON fragments, which could be used in dashboards and even (use your imagination :-) ) dynamically generated "scores" (think badging, etc.).

*Unit Tests and Points:*

- I proposed *weighting individual unit tests* based on impact across the Functional Categories, my argument being that the more Category groups impacted, the more 'valuable' the unit test outcome becomes. (This is intended to help dev teams focus not just on low-hanging fruit, but on the more 'critical' outcomes/requirements based on known user needs, because *that* specific unit test has more 'value'. It also helps address the "Critical Failure" question: there are truly very few "critical errors", but plenty of 'significant to the point of failure' errors, which are often critical to only one of the 14 Functional Category user groups. If we adopted weighted scores, we might also consider 'adjusting' the scores for some unit tests to make them more 'valuable', although that could also be a slippery slope.)

- I propose using the *principals as a means of adding equity to the scoring*: there may be more unit tests (by count) under the "Perceivable" category and fewer under "Understandable", but if the final percentile score for each category contributes equally to the final score (i.e. each contributes up to 20%), then focusing on the Understandable unit tests becomes equally as important as the Perceivable unit tests. (This is intended to offset the complaint that there are, and likely always will be, fewer unit tests for "Understandable", which tracks back to COGA concerns with our current system.) A sketch of this weighting-plus-equity arithmetic follows this list.

- *For discussion*: do we continue to include the "R" (Robust) Principle, and is that Principle 'as important' as the other 3?
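To make the arithmetic concrete, here is a minimal sketch (in Python) of how weighted unit tests and the per-Principle equity split might combine. Every test name, weight, and pass/fail result below is an invented placeholder, and the 20-point shares are the strawman values from above, not AGWG-agreed numbers:

```python
# Minimal sketch (assumptions throughout): weighted unit tests rolled up
# per Principle, with each Principle capped at an equal share of the
# final score. Test names, weights, and results are invented placeholders.

# Each unit test is weighted by how many of the 14 Functional Categories
# its outcome impacts: the more groups impacted, the more the test is worth.
TESTS = {
    # name: (principle, categories_impacted, passed)
    "image-alt-present":    ("Perceivable",    4, True),
    "contrast-minimum":     ("Perceivable",    2, True),
    "captions-provided":    ("Perceivable",    3, False),
    "error-identification": ("Understandable", 5, True),
    "plain-language":       ("Understandable", 6, False),
    "focus-order":          ("Operable",       3, True),
}

PRINCIPLES = ("Perceivable", "Operable", "Understandable", "Robust")
SHARE_PER_PRINCIPLE = 20.0  # strawman: up to 20% each; Protocols supply the last 20


def principle_score(principle: str) -> float:
    """Weighted pass rate for one Principle, scaled to its equal share."""
    tests = [t for t in TESTS.values() if t[0] == principle]
    if not tests:
        return 0.0  # the open question above: what to do with "Robust"?
    total = sum(weight for _, weight, _ in tests)
    earned = sum(weight for _, weight, passed in tests if passed)
    return SHARE_PER_PRINCIPLE * earned / total


unit_test_score = sum(principle_score(p) for p in PRINCIPLES)
print(f"Unit-test portion of the score: {unit_test_score:.1f} of 80")
```

Note how the equity split works in this sketch: "Understandable" has fewer tests than "Perceivable", yet both Principles gate the same 20 points, so an Understandable failure costs proportionally more.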
*Protocols and Assertions:*

- Rather than attempting to measure subjective determinations, we instead reward content owners for PUBLICLY adopting Protocols related to Usability and Accessibility (e.g. Making Content Usable for COGA, the WCAG 3 Maturity Model, the US Fed Plain Language Guidelines, etc.).

- Protocols come in two 'flavors': AGWG-vetted (and weighted) and/or "Custom" (non-AGWG-vetted, BUT they MUST BE PUBLICLY AVAILABLE via a public URL).

- Vetted Protocols are worth more points, as we have the ability to influence what they state, and/or they have met our internal review. (Looking wayyyy down the line, I could anticipate entities asking our WG to vet *their* protocol with an eye towards making that protocol more 'valuable'. As a strawman example, Adobe recently published their 'Spectrum' guidance - https://spectrum.adobe.com/page/principles/ - and they *might* seek to have that Protocol evaluated and 'scored' differently by our Working Group. It is my personal opinion that this would be both a good thing and something our WG could encourage.)

- As another strawman, in my proposal I suggest a 'maximum score' of 20 points under the Protocols and Assertions 'column', but that is a TBD (as is assigning point values to Protocols, and we'll likely need to identify a core set of those Protocols to start. I've started a list.)

- Integral to this piece of the proposal is the mandated use of EARL for the public declaration / public accountability reporting. A sketch of what that reporting might look like follows the 'drop' list below.

*What I propose to 'drop':*

- attempting to measure "user flows" or "happy paths", as we simply cannot predict those for all users
- counting instances of failures (i.e. 2 of 100 images lacking alt text does not = 98, it equals zero for THAT VIEW)
- attempting to measure or evaluate usability
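For illustration, here is a minimal sketch of the kind of JSON fragment that an EARL export of a Protocol-adoption claim might look like. EARL only defines vocabulary for reporting test results, so treating the adopted Protocol as the earl:test (with earl:passed meaning "adopted as claimed") is one possible, unvetted mapping; the organization and subject URLs are placeholders:

```python
import json

# Hypothetical sketch only: a public "we have adopted this Protocol" claim
# expressed with EARL vocabulary and serialized as a JSON-LD fragment.
# Modeling the Protocol as the earl:test is an assumed mapping, not a
# pattern defined by the EARL 1.0 Schema.
adoption_claim = {
    "@context": {"earl": "http://www.w3.org/ns/earl#"},
    "@type": "earl:Assertion",
    "earl:assertedBy": {"@id": "https://example.com/#org"},  # content owner (placeholder)
    "earl:subject": {"@id": "https://example.com/"},         # site/view in scope (placeholder)
    "earl:test": {
        # the adopted Protocol, identified by its public URL
        "@id": "https://www.w3.org/TR/coga-usable/"
    },
    "earl:result": {
        "@type": "earl:TestResult",
        "earl:outcome": {"@id": "earl:passed"},  # i.e. "adopted, per our public claim"
        "earl:mode": {"@id": "earl:manual"},
    },
}

print(json.dumps(adoption_claim, indent=2))
```

Because the output is plain JSON, the same fragment could feed the dashboards and dynamically generated "scores" mentioned in the recap above.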
JF

--
*John Foliot* |
Senior Industry Specialist, Digital Accessibility |
W3C Accessibility Standards Contributor |

"I made this so long because I did not have time to make it shorter." - Pascal
"links go places, buttons do things"

Received on Tuesday, 10 August 2021 17:24:16 UTC