Re: Scoring and Dashboards from John Foliot on 2020-05-22 (public-silver@w3.org from May 2020)

From: John Foliot <john.foliot@deque.com>
Date: Fri, 22 May 2020 15:17:19 -0500
To: Jeanne Spellman <jspellman@spellmanconsulting.com>
Cc: Silver TF <public-silver@w3.org>
Message-ID: <CAKdCpxx5xtk8mzPYgrRy9Z-3wmuA1xh4cETkWEb=zVSirCQ-fA@mail.gmail.com>

Hi Jeanne,

There is no disagreement that different tools will compete on features.
Microsoft Edge and Google's Chrome are both built on the same Chromium
project core, but then they offer different features
<https://blogs.windows.com/windowsexperience/2020/05/21/making-the-web-more-accessible-and-inclusive-for-all-with-microsoft-edge/>,
and thus compete that way.

My point, however, is that we've publicly decided to use a "score"-based
measurement system, and that scoring mechanism needs to work for both
short-term (development-time) and longer-term reporting metrics. This is
why (for example) I keep asking about depreciation of 'scores' over time.

While different dashboards can and will offer different functions, features
and layouts, at the end of the day, with WCAG 3.x all of those tools will
need to be able to use the same WCAG 3.x scoring mechanism, so that in
practice each tool would return the same basic 'score' of any given piece
of content. (Think of it as a miles-per-gallon sticker on a new car: Fords
are different from Toyotas, yet they both use gasoline, so MPG is
relevant across all cars. Equally, however, we instinctively know that an
MPG sticker on a new car, versus an MPG report on that same car 5 years
later with *no* maintenance done, will most likely show a lower MPG.)

So to be crystal clear, what I am suggesting is that we, as a group, need
to also define that time-based metric (i.e. the impact of time on test
results) as part of any "Final Score", which is one of the main
deliverables of this TF. Failing to do so would then leave it to individual
organizations to try and figure that out on their own - which is the
antithesis of "standardization".
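
To make the idea of a time-based metric concrete, here is a minimal sketch of
one way a score could depreciate with the age of the test result. It is an
illustration only, not a mechanism proposed in this thread or defined by
WCAG 3.x: the exponential half-life model, the function name
`depreciated_score`, the test-type labels, and the half-life values are all
assumptions chosen for the example.

```python
from datetime import date

# Hypothetical half-lives in days per test type (assumed values, not from
# WCAG 3.x): quick automated checks go stale fast, user testing more slowly.
HALF_LIFE_DAYS = {"automated": 30, "user_testing": 365}


def depreciated_score(raw_score: float, tested_on: date, today: date,
                      test_type: str) -> float:
    """Reduce a raw score by half for every half-life elapsed since testing."""
    # Clamp to zero so a test dated in the future is never inflated.
    age_days = max((today - tested_on).days, 0)
    half_life = HALF_LIFE_DAYS[test_type]
    return raw_score * 0.5 ** (age_days / half_life)


# A user-test score of 80 recorded exactly 365 days earlier.
one_year_old = depreciated_score(80.0, date(2018, 1, 1), date(2019, 1, 1),
                                 "user_testing")
```

Under these assumed values, a user-test result loses half its weight after one
year, while a fresh result keeps its full score. Whatever curve and constants
are chosen, the point is that they would need to be the same for every tool.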

JF

On Fri, May 22, 2020 at 12:48 PM Jeanne Spellman <
jspellman@spellmanconsulting.com> wrote:

> I completely agree that clients want dashboards and they want to know how
> they compare with themselves over time. No question.
>
> I think the important issue here is: do they want the features of their
> dashboard set as a standard requirement?  I think that all accessibility
> tool makers provide different features, and they use the differences between
> those features both to distinguish themselves competitively and to meet
> the needs of specific industry sectors.
>
> I think we could do harm to the industry if we started writing
> requirements of what was needed in a dashboard and how to do it.
>
> I think our job is to write the accessibility standards that industry
> needs harmonized around the world, without telling tool makers how to build
> their tools and what features to put in them.  I don't think W3C belongs in
> the dashboard business. Deque, Tenon, Siteimprove (to name only a few who
> responded to this thread) are in that business and know best what they want
> to give the customers they serve.  Different industries or sectors want
> different dashboards.
>
> I would never dispute the importance of dashboards and measuring performance
> over time.  I don't think standardization is needed there, and it could
> actually stifle innovation.
>
> jeanne
>
> On 5/11/2020 12:08 PM, John Foliot wrote:
>
> Hi All,
>
> During our calls last week, the use-case of monitoring conformance
> dashboards was raised.
>
> One important need for *on-going score calculation* will be for usage in
> these scenarios. After a bit of research, it appears that many different
> accessibility conformance tools are today offering this
> feature/functionality already.
>
> Please see:
>
>
> https://docs.google.com/document/d/1PgmVS0s8_klxvV2ImZS1GRXHwUgKkoXQ1_y6RBMIZQw/edit?usp=sharing
>
> ...for examples that I was able to track down. (Note, some examples today
> remain at the page level - for example Google Lighthouse - whereas other
> tools are offering composite or aggregated views of 'sites' or at least
> 'directories' [sic].)
>
> It is in scenarios like this that I question the 'depreciation' of
> user-testing scores over time (in the same way that new cars depreciate
> when you drive them off the lot, and continue to do so over the life of the
> vehicle).
>
> Large organizations are going to want up-to-date dashboards, which
> mechanical testing can facilitate quickly, but the more complex and
> labor-intensive tests will be run infrequently over the life-cycle of a
> site or web-content, and I assert that this infrequency will have an impact
> on the 'score': user-test data that is 36 months old will likely be 'dated'
> over that time-period, and in fact may no longer be accurate.
>
> Our scoring mechanism will need to address that situation.
>
> JF
> --
> *John Foliot* | Principal Accessibility Strategist | W3C AC Representative
> Deque Systems - Accessibility for Good
> deque.com
> "I made this so long because I did not have time to make it shorter." -
> Pascal

Received on Friday, 22 May 2020 20:18:15 UTC