
Re: agenda for Silver meeting of 10 July 2020

From: jake abma <jake.abma@gmail.com>
Date: Tue, 14 Jul 2020 11:43:04 +0200
Message-ID: <CAMpCG4E5qjQ1cDChZRZxdE2W8-EBZPr-HmvYw6=bDbbM3x3YvQ@mail.gmail.com>
To: Jeanne Spellman <jspellman@spellmanconsulting.com>
Cc: Silver Task Force <public-silver@w3.org>
Hi Rachael / all,

Here are some initial comments on the proposal.
Please excuse me if some comments are worded a bit bluntly and might
need some explanation from my side; they are just questions that came
up while going through your great work.

1. Adjectival Rating

·      *Agreed*
2. Disclaimer

·      *Agreed*
3. Ideas Incorporated

·      *Mostly agreed*

·      Path/task based conformance

o   *how is this incorporated, I don't see it yet?*

·      Incorporate usability testing

o   *how is this incorporated, I don't see it yet?*

·      Current small SC (language in page) need to be balanced with current
large SC (text alternatives)

o   *What do you mean by this?*

·      Substantial conformance

o   *What do you mean by this?*
4. Declaring Scope

·      *Agree IF my conclusion is correct*

Conformance is defined for paths

·      “Path - A single view or the complete series of views needed to
complete a task from end-to-end.”

·      “View - All content visually and programmatically available without
an interaction equivalent to loading a new page”

o   *So, elements on a page are in scope although not related / needed for
the Task? In other words, we, kind of, replace 'Web Page' with 'view' as
we're *
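
To make the scope question concrete, here is a minimal sketch of the two quoted definitions as data structures. The class names follow the quoted terms; the per-element "needed for the task" flag, the sample task and the element ids are assumptions for illustration and do not come from the proposal.

```python
from dataclasses import dataclass, field


@dataclass
class View:
    """All content available without loading a new page (quoted definition)."""
    name: str
    # Hypothetical: element id -> whether the element is needed for the task.
    elements: dict[str, bool] = field(default_factory=dict)


@dataclass
class Path:
    """A single view, or the series of views needed to complete a task."""
    task: str
    views: list[View]

    def elements_in_scope(self) -> list[str]:
        # Reading 1: everything in every view is in scope, task-related or not.
        return [e for v in self.views for e in v.elements]

    def task_elements(self) -> list[str]:
        # Reading 2: only the elements needed for the task are in scope.
        return [e for v in self.views for e, needed in v.elements.items() if needed]


checkout = Path(
    task="buy a ticket",
    views=[
        View("search", {"search-form": True, "promo-banner": False}),
        View("payment", {"card-form": True, "footer-links": False}),
    ],
)
print(checkout.elements_in_scope())
print(checkout.task_elements())
```

If a view simply takes the place of a web page, the first reading applies and the banner and footer links are tested as well; the second reading would limit testing to what the task itself needs.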
5. Documentation Hierarchy

·      *Not agreed*

Functional Categories

·      High level grouping of functional needs

o   *Do we mean here the EN 301 549 kind of grouping?*

Guidelines (Functional Outcome)

·      Score each guideline based on tests between 1 and 5

o   *A guideline is not a functional outcome, there may be 2,3,4 or even
more Functional outcomes in a guideline, all with their own functional
needs, own testing and multiple methods closing gaps not always in line
with each other making scoring much, much more complex, see:*



·      Guidelines have a many:many relationship with functional categories

o   *Isn’t it a One-to-Many? One guideline can be the habitat of lots of
functional needs.*

o   *And IF there is more than one functional outcome, isn’t it One-to-Many
for the outcomes: One functional outcome can be the habitat of lots of
functional needs. (all under one guideline)*


·      3 types of tests

o   Level 1: Tests that would fit within the 2.x structure (automated,
manual, page based)

§  *We now replace page based with ‘view’? And when explaining this to an
external person, does it mean the same in our new approach?*

o   Level 2a: Tests that require context to evaluate or are harder to meet

§  *We have context in 2.x structure, what is the difference here?*

§  *Harder to meet, what does that mean here?*

§  *Why is this another type of test?*

o   Level 2b: Usability or AT testing

§  *A type of test Usability OR AT, aren’t those 2 completely different
kinds of tests?*

§  *Usability is not adjectival rating, what kind of usability tests are we
talking about here? Benchmarking?*

§  *AT Testing, that is not a goal but a means, it helps but is not needed*

Tests have a many:many relationship with Guidelines

·      *What do you mean by this?*
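
One possible reading of "many:many" between tests and guidelines, sketched as a plain mapping. The guideline names are taken from the slide titles; the test ids and the assignments are made up for illustration and are not part of the proposal.

```python
from collections import defaultdict

# Hypothetical test ids mapped to the guidelines they would count towards.
test_to_guidelines = {
    "heading-has-text": ["Structure", "Clear Written Content"],
    "image-has-alt": ["Alternative Text"],
    "contrast-ratio": ["Visual Contrast"],
    "label-describes-purpose": ["Structure", "Clear Written Content"],
}


def invert(mapping):
    """Build the reverse index: guideline -> list of tests."""
    reverse = defaultdict(list)
    for test, guidelines in mapping.items():
        for guideline in guidelines:
            reverse[guideline].append(test)
    return dict(reverse)


guideline_to_tests = invert(test_to_guidelines)

# Tests that feed more than one guideline are what make the relationship
# many:many rather than one-to-many.
shared = sorted(t for t, g in test_to_guidelines.items() if len(g) > 1)
print(guideline_to_tests["Structure"])
print(shared)
```

Under this reading a single failing test lowers the score of every guideline it is listed under, which is one reason the choice between many:many and one-to-many matters for scoring.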
6. Scoring Process

·      2. Run all level 1 tests for all views within a path

o   Automated AND manual (and not page but VIEW based…)

o   *I think we’re done here as manual contains all other suggested tests
IF we demand them*

o   *Run all tests, what does that mean? Where are the tests? In the
Methods? What IF the method is not present OR if multiple methods are
present to solve an issue? Which one to choose?*

·      4. Note the % tests passed for each view (total passed/total in view)

o   *This might be simple for some but hard for others, maybe impossible…*

o   *Think of a very large / dynamic page with lots of text in all kinds of
different places (maybe 100, 200, 300… text nodes). Do we want to count them
so we can say how many pass contrast?*

·      5. Note tests that are not applicable

o   *What do you mean here? Many methods, many functional outcomes, not
complete by definition, why mention all tests NOT applicable?*

·      Average all the tests for a guideline for an overall %

o   *This is exactly a BIG challenge, as we need normalization from
COMBINATIONS? BONUS METHODS? AND… personalization kind of methods… see also
again: *

·      9. If average score = 3, run level 2a and/or 2b tests

o   *First of all, I do not see why level 1 tests must come before level 2;
there is no clear need or rationale*

o   *I don’t see this working in practice yet, we need a large elaborated
example with lots of data and relationships, normalization etc. of scoring
before we can make such a call*

·      10, 11, 12

o   *Also for the next steps I don’t see this working in practice yet, we need
a large elaborated example with lots of data and relationships,
normalization etc. of scoring before we can make such a call*

o   *The example provided in the spreadsheet does not work in practice as
it is not granular and mature enough and cannot be used per element*

o   *The example:*
also an all-or-nothing statement and this is not how views/pages are
constructed, we need a more granular, realistic, approach and this shows
the difficulty of the scoring:*

§  *Here are two elaborated examples with challenges…*
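
Separately from the examples referred to above, the arithmetic in steps 4 to 9 can be sketched roughly as follows. Everything specific here is an assumption: the view names, the item counts, the rule that a test with zero counted items is "not applicable", the simplification that each test maps to a single guideline, and the cut-offs that turn a percentage into a 1 to 5 score (the proposal does not give them in this thread).

```python
from statistics import mean

# Hypothetical results: view -> test -> (items passed, items counted).
# A test with zero counted items is treated as not applicable.
results = {
    "search": {"contrast": (180, 200), "headings": (4, 4), "video-captions": (0, 0)},
    "payment": {"contrast": (45, 50), "headings": (1, 2), "video-captions": (0, 0)},
}


def percent_passed(passed, counted):
    """Return the pass percentage, or None when the test is not applicable."""
    if counted == 0:
        return None
    return 100.0 * passed / counted


def average_across_views(results, test):
    """Average one test's per-view percentages, skipping not-applicable views."""
    scores = [percent_passed(*tests[test]) for tests in results.values()]
    applicable = [s for s in scores if s is not None]
    return mean(applicable) if applicable else None


def to_rating(percent):
    """Map a percentage to a 1-5 score; these cut-offs are made up."""
    bands = [(95, 5), (85, 4), (70, 3), (50, 2)]
    for threshold, rating in bands:
        if percent >= threshold:
            return rating
    return 1


contrast = average_across_views(results, "contrast")
headings = average_across_views(results, "headings")
print(contrast, to_rating(contrast))
print(headings, to_rating(headings))
print(average_across_views(results, "video-captions"))
```

Even this toy version shows the normalization problem: averaging the headings test per view gives 75%, while pooling all counted items gives 5 out of 6, roughly 83%. The result also depends entirely on what is counted as an "item", for example every text node for contrast.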


7. Conformance

·      *We have a lot to discuss before I can get to this point*

·      *First solve scoring in more details*
8. Functional Categories

·      *We need the next level of Functional outcomes and the granularity
of how to apply methods to them*

·      *In other words, a clearer view on the === user need / functional
outcome / guideline === structure*

·      *See: *
9. Sample Page

·      Not sure what this slide explains, do we have an example of the page
AND scoring in detail?
10. History & Notes

11. Notes from Testing

In order to get a consistent % passed, the tests will need to be more
granular than current SC and clearly define what is counted as an “item”
against the %

·      ACT tests will be very important to this approach

o   *What do you mean here? 2.x tests? And are they not granular enough? Or
do you mean adding an adjectival rating score?*

o   *ACT tests ARE 2.x, so do you want them to contain adjectival rating?
OR more ACT tests? They are set-up to be most objective (but not always

o   *Would like to see examples of what you mean, hard to imagine from the
theoretical text*

Tests will also need to be outcome based but specific so they can
be mapped to guidelines

·      If we can organize the structure so that tests only map to a single
guideline (vs a many to many relationship), this will be simpler and easier

o   *Tests for guidelines or methods or functional outcomes???*

o   *Tests can be replacements for others / one test for a method may be
enough so other tests are not needed anymore…!*

Functional categories should be distinct and applicable

·      I’ve combined Mobility and Motor

·      Is Independence its own category?

·      Need to create the guidelines and tests and then can finalize the

o   *We don’t use functional categories for testing, they are present in
the background and you can / may filter on them, but do not play a part in
12. Data Model

·      *Can you explain?*
13. Adjectival Ratings – Example

·      *This example is an all-or-nothing example with lots of gaps to be
filled. I have worked on it a lot and can share results; in its current
form it can’t be used for testing*

·      *A more granular example is already available, and lots of issues pop up
once you start to test; it will be a starting point for discussion on how
to fill in the gaps…*


14. Documentation Hierarchy

·      Guideline/Functional Outcome (Scoring is handled at this level)

o   *Not sure after lots of tests… The methods contain the score and there
are flavors to them, making normalization needed.*

·      1. Tests that do not require a judgement call (Yes/No)

o   *IF we have / apply adjectival rating, will we / do we want a baseline?*

o   *If we want a baseline, we ALWAYS have a kind of yes/no to start with*

·      2. Tests with easy judgement call (A/B)

o   Example: Should the image be alt=“” or alt=“[some text]”?

o   The example given here is not easy… I clearly remember a discussion with
Jon Avila in which he said he wanted alt text where I had created the
images with empty alt: https://a11yportal.com/

·      5. Usability testing and testing with AT

o   Example: Do JAWS and NVDA users understand the alternative language
when completing tasks?

§  *Already mentioned this above:*

§  *A type of test Usability OR AT, aren’t those 2 completely different
kinds of tests?*

§  *Usability is not adjectival rating, what kind of usability tests are we
talking about here? Benchmarking?*

§  *AT Testing, that is not a goal but a means, it helps but is not needed*
15. Structure

·      *(this needs a clear elaborated example, not a theory of some
solutions we can think of, the devil is in the details…)*

·      *Is this an example of a guideline?*

·      *If so, there is no absolute relationship and order needed for the
1, 2, 3, 4, and 5s.*

·      *Visual and programmatic aspects need to be separated and tested, for
clear reasons; see my test work (can explain)*

·      *Headings are not necessary per se, what if you use LABELS or
Legends or other ways to structure…?*

·      *Landmarks are not needed, also not technology agnostic*

·      Nr 5.:

o   Headings help users with limited cognition quickly orient to content
and complete tasks

o   Headings help screen reader users quickly navigate content

§  *What does this mean in this slide? Seems like an explanation of
benefits, not test related…*
16. Alternative Text

·      *Have not worked on them*
17. Visual Contrast/Affordances

·      *This is not mature enough to make any judgement call yet*

·      *See my work at: *

·      *Have lots of findings for testing / scoring…*
18. Clear Written Content

·      *Have not worked on them*
19. 20. 21. 22.

·      *Lots of questions and gaps to solve before moving to this stage!*

On Fri, 10 Jul 2020 at 17:59, Jeanne Spellman wrote:

> agenda+ Amend Representative Sampling proposal with language for
> transparency
> agenda+ Survey results of Conformance Scope
> agenda+ Rachael's proposal on Scoring
> == Links ==
> Minutes from 7 July meeting
> <https://www.w3.org/2020/07/07-silver-minutes.html#item03>
> Results of Survey on Conformance Scope
> <https://www.w3.org/2002/09/wbs/94845/2020-06_Conformance_Scope/results>
> Slidedeck on Adjectival Rating Proposal
> <https://docs.google.com/presentation/d/1IceTYOyGitApczya4vat4gPk9_I-7hIwpqTtXGusEZk/edit#slide=id.g8208eb709f_2_92>
> and accompanying Spreadsheet
> <https://docs.google.com/spreadsheets/d/1Ctg489tMunn6Yfqc2x-S24WGBz6TDHuyGXBk7Y_PqJI/edit#gid=727657471>
> == Conference Call info ==
> https://www.w3.org/2017/08/telecon-info_silver-fri
> IRC for minutes and notes is at irc.w3.org on channel #silver.

Received on Tuesday, 14 July 2020 09:43:33 UTC
