Re: Scoring and Dashboards from jake abma on 2020-05-13 (public-silver@w3.org from May 2020)

From: jake abma <jake.abma@gmail.com>
Date: Wed, 13 May 2020 12:51:53 +0200
To: John Foliot <john.foliot@deque.com>
Cc: Rachael Montgomery <rachael@accessiblecommunity.org>, Silver TF <public-silver@w3.org>, WCAG <w3c-wai-gl@w3.org>
Message-ID: <CAMpCG4Hv3HFeE_y+xMs7iuFj7JddsDa8uGRPxxMStsZ8O5Qw4A@mail.gmail.com>
Hi John / Rachael,

See this one; it matches my previous mail.

This is already required for the public sector in Europe
(governments and municipalities have to inspect and report on their
applications every "X" period).
So we can look at how this is done and align the requirements for claims
with that approach.

Jake


On Tue, 12 May 2020 at 18:57, John Foliot <john.foliot@deque.com> wrote:

> Hi Rachael,
>
> > If we dictate vs acknowledge depreciation, how do you propose we address
> organizations' differences in development and deployment environments and
> cycles?
>
> We don't. We simply state that in the W3C's WCAG 3.0 conformance model,
> those higher-order 'user-tests' will need to be re-run after "X" period of
> time *if you want to report using the W3C model.*
>
> I acknowledge that it will likely be an arbitrary decision in some ways,
> but I think that collectively we can arrive at a consensus period of time
> (I'll suggest 2 years, based in large part on the fact that our WG has
> previously agreed to publish updates every 2 years, but that is just a
> suggestion).
>
> Presuming that legislators take up our new specification (which is STILL
> not a given), organizations will adapt to the new reality as part of their
> legal obligations. They can manage that as they see fit, based on their
> ecosystem.
>
> JF
>
> On Tue, May 12, 2020 at 11:25 AM Rachael Montgomery <
> rachael@accessiblecommunity.org> wrote:
>
>> John,
>>
>> If we dictate vs acknowledge depreciation, how do you propose we address
>> organizations' differences in development and deployment environments and
>> cycles? What is an appropriate time frame that works for everyone
>> (internal, external, agile, waterfall, etc.)?
>>
>> This is the step I personally can’t figure out on anything but an
>> organization-by-organization basis. I understand your reasoning behind
>> this conversation, but I am not sure how to resolve the variability in a way
>> that lets us create normative guidance in this area.
>>
>> Regards,
>>
>> Rachael
>> On May 12, 2020, 12:00 PM -0400, John Foliot <john.foliot@deque.com>,
>> wrote:
>>
>> Hi Jake and Rachael,
>>
>> First, I am proposing 'depreciation' and NOT 'deterioration', a subtle
>> but important distinction.
>>
>> Like automobiles (which depreciate over time), I am simply arguing that
>> any test, from the simplest mechanical test to the most sophisticated
>> cognitive walkthrough / user-path / testing with PwD, will over time be
>> increasingly less accurate, and thus less valuable.
>>
>> To extend the analogy:
>>
>>    - I have a three year old Honda.
>>    - My wife owns a 1998 Ford F-150 pickup.
>>    - Both are road-worthy (interesting tidbit: to renew my license
>>    plate, I have to have the vehicles 'smog-tested' every 2 years - why?
>>    Because the 2-year-old test is no longer relevant AND NEEDS TO BE UPDATED).
>>    - The resale value of my Honda is roughly 75% of what I paid for it
>>    <https://usedfirst.com/cars/honda/>. The resale value of my wife's
>>    pickup? About a thousand bucks
>>    <https://www.kbb.com/ford/f150-regular-cab/1998/short-bed/?vehicleid=6490&mileage=142654&modalview=false&intent=trade-in-sell&pricetype=trade-in&condition=good&options=6431637%7Ctrue>
>>    (if we're lucky).
>>
>> EVERYTHING depreciates over time, whether that is an automobile, or
>> 'Functional User Test results'. Thus, if those test results contribute to
>> the site "score", that diminished value will need to be accounted for as
>> part of the larger overall scoring mechanism.
>>
>> Failing that, the opposite outcome is that those Functional User Tests *will
>> be run exactly once*, at project start, and likely never run again:
>> never mind that over time the actual user experience may degrade for a
>> variety of reasons.
>>
>> *Failing to stale-date these types of user-tests over time is to
>> essentially encourage organizations to not bother re-running those tests
>> post-launch.*
>>
>> Now, it could be argued that setting a 'stale date' should remain with
>> the legislators, and that is fair (to an extent). Those legislators however
>> are looking to us (as Subject Matter Experts) to help them 'figure that
>> out', and our ability to help legislators understand our specification and
>> conformance models will contribute directly to their uptake (or lack of
>> uptake) *BY* those same legislators.
>>
>>
>> *> For a tool that is built in and used in an intranet the speed it
>> becomes inaccessible will be much slower (perhaps years) than a public
>> website with content updated daily and new code releases every week
(perhaps days).*
>>
>> While the 'intranet tool' (your content) may depreciate at varying rates,
>> there is also the relationship (and interaction) between your content and
>> the User Agent Stack, which will "age" at the same rate for ALL content.
>> (Microsoft's Edge Browser of 2018 is very different from the Edge Browser
>> v.2020 for example). Since these types of tests are seeking to measure the
>> "ability" of the user to complete or perform a function, it takes both
>> content AND tools to achieve that, and how those tools work with the
>> content is a critical part of the "ability" calculation. So these tests are
>> on both content and tools combined (with shades of "Accessibility
>> supported" in there for good measure.)
>>
>> Use Case/Strawman: A web page with a complex graphic created in 2005
>> uses @longdesc to provide the longer textual description, and got a
>> 'passing score' because it used an acceptable technique (for the day).
>> However, in 2020, @longdesc has ZERO support on Apple devices, and so
>> the intended primary audience of that content cannot access it today,
>> even though it has been provided by the author using an approved technique.
>> Do you still agree that the page deserves a passing score in 2020, because
>> it had a passing score in 2005? (If no, why not?)
>>
>>
>> *> How a tool vendor places it in a dashboard is totally up to the tool
>> vendor.*
>>
>> No argument, and as part of this discussion I've also supplied 11
>> different examples
>> <https://docs.google.com/document/d/1PgmVS0s8_klxvV2ImZS1GRXHwUgKkoXQ1_y6RBMIZQw/edit#heading=h.wk6s27klxqr7>
>> of *HOW* different vendors are dealing with this (each in their own way). I
>> no more want to prescribe how vendors communicate the *overall
>> accessibility health of a web property* than I do any other specific
>> Success Criteria, but via strong anecdotal evidence I've also supplied
>> <https://lists.w3.org/Archives/Public/w3c-wai-gl/2020AprJun/0345.html>,
>> we know that industry both needs and wants dashboards and a 'running score'
>> post-launch of any specific piece of web content - and the dashboard
>> examples I've provided are the evidence and proof that this is what
>> industry is seeking today (otherwise why would almost every vendor out
>> there today be offering a dashboard?).
>>
>> My concern is that failing to account for depreciation means that our
>> scoring system is only applicable as part of the software development life
>> cycle (SDLC), but does not work as well in the larger context that industry
>> is seeking: ongoing accessibility conformance *monitoring,* which is
>> what our industry is asking for.
>>
>> JF
>>
>> On Tue, May 12, 2020 at 3:01 AM jake abma <jake.abma@gmail.com> wrote:
>>
>>> Oh, and we're not the body to judge when something "deteriorates".
>>>
>>> If at all, that is up to the tester / auditor or, if desired, up to the
>>> dashboard maker / settings.
>>>
>>> On Tue, 12 May 2020 at 09:58, jake abma <jake.abma@gmail.com> wrote:
>>>
>>>> I don't see any issue with adding dates / time stamps to a
>>>> conformance claim.
>>>>
>>>> - First of all, for the specific conformance claim / report
>>>> - If other reports with a different time stamp are included, mention it
>>>> (including the time stamp and which part it covers)
>>>> - The responsibility lies with the "conformance claimer" if they choose
>>>> to include a report but did not check whether it is still current.
>>>>
>>>> We only provide guidance for how to test and score and ask for time
>>>> stamps.
>>>>
>>>> How a tool vendor places it in a dashboard is totally up to the tool
>>>> vendor.
>>>>
>>>> Cheers!
>>>> Jake
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Mon, 11 May 2020 at 18:10, John Foliot <john.foliot@deque.com> wrote:
>>>>
>>>>> Hi All,
>>>>>
>>>>> During our calls last week, the use-case of monitoring conformance
>>>>> dashboards was raised.
>>>>>
>>>>> One important need for *on-going score calculation* will be for usage
>>>>> in these scenarios. After a bit of research, it appears that many different
>>>>> accessibility conformance tools are today offering this
>>>>> feature/functionality already.
>>>>>
>>>>> Please see:
>>>>>
>>>>>
>>>>> https://docs.google.com/document/d/1PgmVS0s8_klxvV2ImZS1GRXHwUgKkoXQ1_y6RBMIZQw/edit?usp=sharing
>>>>>
>>>>> ...for examples that I was able to track down. (Note, some examples
>>>>> today remain at the page level - for example Google Lighthouse - whereas
>>>>> other tools are offering composite or aggregated views of 'sites' or at
>>>>> least 'directories' [sic].)
>>>>>
>>>>> It is in scenarios like this that I question the 'depreciation' of
>>>>> user-testing scores over time (in the same way that new cars depreciate
>>>>> when you drive them off the lot, and continue to do so over the life of the
>>>>> vehicle).
>>>>>
>>>>> Large organizations are going to want up-to-date dashboards, which
>>>>> mechanical testing can facilitate quickly, but the more complex and
>>>>> labor-intensive tests will be run infrequently over the life-cycle of a
>>>>> site or web-content, and I assert that this infrequency will have an impact
>>>>> on the 'score': user-test data that is 36 months old will likely be 'dated'
>>>>> over that time-period, and in fact may no longer be accurate.
>>>>>
>>>>> Our scoring mechanism will need to address that situation.
>>>>>
>>>>> JF
>>>>> --
>>>>> *John Foliot* | Principal Accessibility Strategist | W3C AC
>>>>> Representative
>>>>> Deque Systems - Accessibility for Good
>>>>> deque.com
>>>>> "I made this so long because I did not have time to make it shorter."
>>>>> - Pascal
>>>>>
>>>>>
>>>>>
>>>>>
>>
>
Received on Wednesday, 13 May 2020 10:52:19 UTC