Re: Scoring and Dashboards from jake abma on 2020-05-13 (public-silver@w3.org from May 2020)

From: jake abma <jake.abma@gmail.com>
Date: Wed, 13 May 2020 12:41:02 +0200
To: Rachael Montgomery <rachael@accessiblecommunity.org>
Cc: John Foliot <john.foliot@deque.com>, Silver TF <public-silver@w3.org>, WCAG <w3c-wai-gl@w3.org>
Message-ID: <CAMpCG4FOQxdK4my6PTJpGwk_hrncMHRMMCrSPP7Hc3DGYC5frA@mail.gmail.com>
Clear info John,

So to simplify this, what about a starting approach like:

- All re-used / included test results need to be re-evaluated if older than
a year and only be used if they are still current (like a car / elevator
check)
- Constant small changes need to be judged in every new conformance claim
and if they are responsible for re-designs in time periods shorter than a
year, the re-evaluation interval needs to be shorter, like every quarter
- Tests need to be based on / checked for the technology present / used and
available at the time of the conformance claim
- After re-designs, claims are no longer valid and need to be done again
for a new conformance claim
- Applications with lots of changes in a short amount of time, like blogs
etc., need special attention, like an exception or special approach to be
discussed.

In short, these are all pre-conditions and a burden on the conformance
claimer / tester BEFORE they are allowed to add results to the conformance
claim.

The claim itself is just based on the time stamp of the claim and the
claimer needs to guarantee the pre-conditions.
This way the scoring is based on the time stamp and the claim that
everything works as mentioned at that time.
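
To make the pre-conditions above concrete, here is a minimal sketch in Python. It is only an illustration, not a proposed normative rule: the names (IncludedResult, Claim, results_needing_re_evaluation), the 365-day and 91-day intervals, and the reading that results older than the last re-design must be redone are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import List, Optional

# Assumed intervals: one year by default, roughly one quarter for
# content that is re-designed more often than once a year.
YEARLY = timedelta(days=365)
QUARTERLY = timedelta(days=91)


@dataclass
class IncludedResult:
    """A re-used test result with its own time stamp (hypothetical type)."""
    name: str
    tested_on: date
    re_evaluated_on: Optional[date] = None  # last time it was confirmed current


@dataclass
class Claim:
    """A conformance claim, identified by its time stamp (hypothetical type)."""
    claimed_on: date
    last_redesign: Optional[date] = None
    frequent_changes: bool = False  # re-designs in periods shorter than a year
    results: List[IncludedResult] = field(default_factory=list)


def results_needing_re_evaluation(claim: Claim) -> List[str]:
    """Return names of included results that fail the pre-conditions."""
    max_age = QUARTERLY if claim.frequent_changes else YEARLY
    stale = []
    for result in claim.results:
        # Use the most recent confirmation, falling back to the test date.
        checked = result.re_evaluated_on or result.tested_on
        too_old = claim.claimed_on - checked > max_age
        predates_redesign = (
            claim.last_redesign is not None and checked < claim.last_redesign
        )
        if too_old or predates_redesign:
            stale.append(result.name)
    return stale


claim = Claim(
    claimed_on=date(2020, 5, 13),
    results=[
        IncludedResult("user testing", date(2018, 11, 1)),
        IncludedResult("automated scan", date(2020, 4, 1)),
    ],
)
print(results_needing_re_evaluation(claim))  # ['user testing']
```

The check runs before the claim is made, so the claim itself only has to carry its own time stamp plus the time stamps of whatever it includes.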

Cheers,
Jake

On Tue, 12 May 2020 at 18:25, Rachael Montgomery <
rachael@accessiblecommunity.org> wrote:

> John,
>
> If we dictate vs acknowledge depreciation, how do you propose we address
> organizations' differences in development and deployment environments and
> cycles? What is an appropriate time frame that works for everyone
> (internal, external, agile, waterfall, etc)?
>
> This is the step I personally can’t figure out on anything but an
> organization-by-organization basis. I understand your reasoning behind
> this conversation but I am not sure how to resolve the variability in a way
> that lets us create normative guidance in this area.
>
> Regards,
>
> Rachael
> On May 12, 2020, 12:00 PM -0400, John Foliot <john.foliot@deque.com>,
> wrote:
>
> Hi Jake and Rachael,
>
> First, I am proposing 'depreciation' and NOT 'deterioration', a subtle but
> important distinction.
>
> Like automobiles (which depreciate over time), I am simply arguing that
> any test, from the simplest mechanical test to the most sophisticated
> cognitive walkthrough / user-path / testing with PwD, will over time be
> increasingly less accurate, and thus less valuable.
>
> To extend the analogy:
>
>    - I have a three year old Honda.
>    - My wife owns a 1998 Ford F-150 pickup.
>    - Both are road-worthy (interesting tid-bit: to renew my license
>    plate, I have to have the vehicles 'smog-tested' every 2 years - why?
>    Because the 2-year old test is no longer relevant AND NEEDS TO BE UPDATED)
>    - The resale value of my Honda is roughly 75% of what I paid for it
>    <https://usedfirst.com/cars/honda/>. The resale value of my wife's
>    pickup? about a thousand bucks
>    <https://www.kbb.com/ford/f150-regular-cab/1998/short-bed/?vehicleid=6490&mileage=142654&modalview=false&intent=trade-in-sell&pricetype=trade-in&condition=good&options=6431637%7Ctrue>
>    (if we're lucky).
>
> EVERYTHING depreciates over time, whether that is an automobile, or
> 'Functional User Test results'. Thus, if those test results contribute to
> the site "score", that diminished value will need to be accounted for as
> part of the larger overall scoring mechanism.
>
> Failing that, the opposite outcome is that those Functional User Tests *will
> be run exactly once*, at project start, and likely never run again: never
> mind that over time the actual user experience may degrade for a variety of
> reasons.
>
> *Failing to stale-date these types of user-tests over time is to
> essentially encourage organizations to not bother re-running those tests
> post launch.*
>
> Now, it could be argued that setting a 'stale date' should remain with the
> legislators, and that is fair (to an extent). Those legislators however are
> looking to us (as Subject Matter Experts) to help them 'figure that out',
> and our ability to help legislators understand our specification and
> conformance models will contribute directly to their uptake (or lack of
> uptake) *BY* those same legislators.
>
>
> *> For a tool that is built in and used in an intranet the speed it
> becomes inaccessible will be much slower (perhaps years) than a public
> website with content updated daily and new code releases every week
> (perhaps days).*
>
> While the 'intranet tool' (your content) may depreciate at varying rates,
> there is also the relationship (and interaction) between your content and
> the User Agent Stack, which will "age" at the same rate for ALL content.
> (Microsoft's Edge Browser of 2018 is very different from the Edge Browser
> v.2020 for example). Since these types of tests are seeking to measure the
> "ability" of the user to complete or perform a function, it takes both
> content AND tools to achieve that, and how those tools work with the
> content is a critical part of the "ability" calculation. So these tests are
> on both content and tools combined (with shades of "Accessibility
> supported" in there for good measure.)
>
> Use Case/Strawman: A web page with a complex graphic created in 2005
> uses @longdesc to provide the longer textual description, and got a
> 'passing score' because it used an acceptable technique (for the day).
> However in 2020, @longdesc has ZERO support on Apple devices, and so for
> the intended primary audience of that content, they cannot access it today,
> even though it has been provided by the author using an approved technique.
> Do you still agree that the page deserves a passing score in 2020, because
> it had a passing score in 2005? (If no, why not?)
>
>
> *> How a tool vendor places it in a dashboard is totally up to the tool
> vendor.*
>
> No argument, and as part of this discussion I've also supplied 11
> different examples
> <https://docs.google.com/document/d/1PgmVS0s8_klxvV2ImZS1GRXHwUgKkoXQ1_y6RBMIZQw/edit#heading=h.wk6s27klxqr7>
> of *HOW* different vendors are dealing with this (each in their own way). I
> no more want to prescribe how vendors communicate the *overall
> accessibility health of a web property* than I do any other specific
> Success Criteria, but via strong anecdotal evidence I've also supplied
> <https://lists.w3.org/Archives/Public/w3c-wai-gl/2020AprJun/0345.html>,
> we know that industry both needs and wants dashboards and a 'running score'
> post launch of any specific piece of web content - and the dashboards
> examples I've provided are the evidence and proof that this is what
> Industry is seeking today (otherwise why would almost every vendor out
> there today be offering a dashboard?)
>
> My concern is that failing to account for depreciation means that our
> scoring system is only applicable as part of the software development life
> cycle (SDLC), but does not work as well in the larger context that industry
> is seeking: ongoing accessibility conformance *monitoring,* which is what
> our industry is asking for.
>
> JF
>
> On Tue, May 12, 2020 at 3:01 AM jake abma <jake.abma@gmail.com> wrote:
>
>> Oh, and we're not the body to judge when something "deteriorates".
>>
>> If at all, that is up to the tester / auditor or, if desired, up to the
>> dashboard maker / settings.
>>
>> On Tue, 12 May 2020 at 09:58, jake abma <jake.abma@gmail.com> wrote:
>>
>>> I don't see any issues here by adding dates / time stamps to a
>>> conformance claim.
>>>
>>> - First of all for the specific conformance claim / report
>>> - If other reports are included with another time stamp, mention it
>>> (also the time stamp and which part it is)
>>> - The responsibility lies with the "conformance claimer" if they choose
>>> to include a report but didn't check whether it's still current.
>>>
>>> We only provide guidance for how to test and score and ask for time
>>> stamps.
>>>
>>> How a tool vendor places it in a dashboard is totally up to the tool
>>> vendor.
>>>
>>> Cheers!
>>> Jake
>>>
>>>
>>>
>>>
>>>
>>> On Mon, 11 May 2020 at 18:10, John Foliot <john.foliot@deque.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> During our calls last week, the use-case of monitoring conformance
>>>> dashboards was raised.
>>>>
>>>> One important need for *on-going score calculation* will be for usage
>>>> in these scenarios. After a bit of research, it appears that many different
>>>> accessibility conformance tools are today offering this
>>>> feature/functionality already.
>>>>
>>>> Please see:
>>>>
>>>>
>>>> https://docs.google.com/document/d/1PgmVS0s8_klxvV2ImZS1GRXHwUgKkoXQ1_y6RBMIZQw/edit?usp=sharing
>>>>
>>>> ...for examples that I was able to track down. (Note, some examples
>>>> today remain at the page level - for example Google Lighthouse - whereas
>>>> other tools are offering composite or aggregated views of 'sites' of at
>>>> least 'directories' [sic].)
>>>>
>>>> It is in scenarios like this that I question the 'depreciation' of
>>>> user-testing scores over time (in the same way that new cars depreciate
>>>> when you drive them off the lot, and continue to do so over the life of the
>>>> vehicle).
>>>>
>>>> Large organizations are going to want up-to-date dashboards, which
>>>> mechanical testing can facilitate quickly, but the more complex and
>>>> labor-intensive tests will be run infrequently over the life-cycle of a
>>>> site or web-content, and I assert that this infrequency will have an impact
>>>> on the 'score': user-test data that is 36 months old will likely be 'dated'
>>>> over that time-period, and in fact may no longer be accurate.
>>>>
>>>> Our scoring mechanism will need to address that situation.
>>>>
>>>> JF
>>>> --
>>>> *John Foliot* | Principal Accessibility Strategist | W3C AC
>>>> Representative
>>>> Deque Systems - Accessibility for Good
>>>> deque.com
>>>> "I made this so long because I did not have time to make it shorter." -
>>>> Pascal
>>>>
>>>>
>>>>
>>>>
>
Received on Wednesday, 13 May 2020 10:41:31 UTC