Re: Auto-WCAG - Expert system approach?

Hi Alistair, John and all,

We faced the issue of needing human input for valid conformance
testing quite early:
RE: Human input in the Auto-WCAG test-cases 
<https://lists.w3.org/Archives/Public/public-auto-wcag/2014Oct/0028.html>

We decided to try to provide the full test chain while avoiding
"earl:cantTell" outcomes. This also led to the development of the
UTT bookmarklet.

The plan was to leave the choice to developers whether to implement
just the automatic parts or the full specification. We also specified
the knowledge an assertor must have to provide a reliable assessment:
Assertor requirements (optional) 
<https://www.w3.org/community/auto-wcag/wiki/Introduction_to_auto-wcag_test_design#Assertor_requirements_.28optional.29>

Besides, human input is also included in the test procedures of some
sufficient techniques:
/"Procedure//
////
//Examine each img element in the content//
//Check that each img element which conveys meaning contains an alt 
attribute.//
//If the image contains words that are important to understanding the 
content, the words are included in the text alternative." /H37: Using 
alt attributes on img elements 
<https://www.w3.org/WAI/GL/2016/WD-WCAG20-TECHS-20160105/H37>

Even if it is already clear, to underline the point take the 1.1.1
example: if all images on a page contain alt attributes with the value
"Lorem ipsum", this must fail. Without human input, a non-empty textual
alternative can only be rated "earl:cantTell". Many tools produce lots
of such outcomes, forcing an evaluator to look at each image, and this
is exactly the step a tool could and should support.
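
A minimal sketch of what such tool support could look like (purely
illustrative; the function name and the placeholder list are my own
invention, not taken from any Auto-WCAG rule):

    # Automate the clear-cut cases; queue everything else for a human
    # reviewer instead of emitting a bare "earl:cantTell".
    PLACEHOLDER_ALTS = {"lorem ipsum", "image", "photo"}

    def check_img_alt(alt_value):
        # Returns an EARL outcome plus a question for the human reviewer.
        if alt_value is None:
            # A missing alt attribute fails H37 if the image conveys
            # meaning, which itself needs human confirmation.
            return "earl:cantTell", "Is this image purely decorative?"
        if not alt_value.strip():
            # Empty alt marks the image as decorative; a human should
            # confirm it really is.
            return "earl:cantTell", "Is this image purely decorative?"
        if alt_value.strip().lower() in PLACEHOLDER_ALTS:
            # Known placeholder text can be failed automatically.
            return "earl:failed", None
        # Any other non-empty text: only a human can judge its adequacy.
        return "earl:cantTell", "Does this text describe the image?"

    print(check_img_alt("Lorem ipsum"))     # ('earl:failed', None)
    print(check_img_alt("Q3 sales chart"))  # ('earl:cantTell', ...)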

According to the Activity diagram of transitions between steps 1 to 18 
<https://www.w3.org/community/auto-wcag/wiki/SC1-1-1-text-alternative#Activity_diagram_of_transitions_between_steps_1_to_18>, 
fail1, pass3/fail4, pass4, fail6 and pass6/fail7 can be assessed fully 
automatically.

But as far as I have read, how user testing and automated testing come
together in ACT Rules will be a topic in the spec:
ACT Framework Spec 
<https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/ACT_Deliverables#1._ACT_Framework_Spec>

So it definitely will have to be discussed.

Best regards
Frank

On 26 October 2016 at 10:50, John Hicks wrote:
> Hello
> Apologies if I misconstrued the question!
>
> On a related issue, in the GitHub repo we often see
> "Ruletype : automatic" but then, in the steps, "Get user input".
> For me an automatic rule (as well as an "expert system") is always 
> without user input (once it's running, of course).
>
> Maybe some definitions would be useful; as far as I understand it, a 
> user-input question/answer system is not an "Expert System" in the 
> traditional AI sense.
>
> I second the call for clarification from the others.
>
> John
>
>
>
>
> On 26 October 2016 at 10:16, Alistair Garrison 
> <alistair.garrison@ssbbartgroup.com> wrote:
>
>     Hi John, All,
>
>     The fundamental shift over to expert-system type “rules” also came
>     with the development of
>     https://www.w3.org/community/auto-wcag/wiki/Template:UserInput.
>     This shift would almost certainly not have been made based on a
>     mere mention of a tool by a participant.  Certainly, for a
>     shift to take place, group consensus must have been sought – about
>     moving to expert-system type rules, over the more atomic rules
>     people had previously been working on (for example, in the EIII
>     project’s http://checkers.eiii.eu/).
>
>     Mikael mentioned the “EIII project’s User Testing Tool”, which
>     could well have been the expert system being referenced – but it
>     is very much in its infancy, and seemingly no longer in development.
>
>     So then, if no one in the group is currently developing an
>     expert system (is anyone?), and several in the group are
>     developing tools that run more atomic tests (like the EIII tests),
>     my question to the group is still: “are we developing Auto-WCAG
>     rules for an expert-system tool?” And, if yes – why exactly?
>
>     Especially when all my personal experience to date, and seemingly
>     John’s personal experience (with regard to the first bullet),
>     indicate that such expert-system, interview-based tools:
>
>     -become so very tedious so quickly to use;
>
>     -take ages to develop (as “rules” are so inter-dependent);
>
>     -will almost certainly not fit the broad-scale auto-monitoring
>     use case, as they require user interaction at many interim stages
>     in tests; and
>
>     -are only as accurate as the user’s judgement – which is surely
>     what we’re trying to avoid by having a base set of agreed, fully
>     automatic tests whose results we could compare.
>
>     Again, very interested to hear from Wilco / group members.
>
>     All the best
>
>     Alistair
>
>     Alistair Garrison
>
>     Senior Accessibility Engineer
>
>     SSB Bart Group
>
>     *From: *John Hicks <jwjhix@gmail.com>
>     *Date: *Tuesday, 25 October 2016 at 13:24
>     *To: *Alistair Garrison <alistair.garrison@ssbbartgroup.com>
>     *Cc: *public-auto-wcag@w3.org
>     *Subject: *Re: Auto-WCAG - Expert system approach?
>
>     Dear Alistair,
>
>     I am not sure exactly which meeting it was, or if it referred to
>     something I might have said:
>
>     Urbilog.fr has developed three automatic testing tools based on
>     an expert system in the AI sense.
>
>     The "Expert" part was not user input but the "clips" expert system
>     as created by nasa : https://en.wikipedia.org/wiki/CLIPS
>     <https://en.wikipedia.org/wiki/CLIPS>
>
>     The essence of the idea is to transform each HTML element into a
>     statement in a declarative language and then test the truth of
>     these statements w.r.t. the rule set in question (Section 508 and
>     WCAG 1 in the beginning, RGAA later).
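>
>     To give a flavour of that transformation, here is a hypothetical
>     Python sketch (the real tools used CLIPS; all names below are
>     invented for illustration):
>
>         # Each HTML element becomes a declarative statement (a
>         # "fact"); rules are then truth tests over those facts.
>         from dataclasses import dataclass
>         from html.parser import HTMLParser
>
>         @dataclass
>         class Fact:
>             tag: str
>             attrs: dict
>
>         class FactExtractor(HTMLParser):
>             # Collects one Fact per element encountered in the page.
>             def __init__(self):
>                 super().__init__()
>                 self.facts = []
>
>             def handle_starttag(self, tag, attrs):
>                 self.facts.append(Fact(tag, dict(attrs)))
>
>         def rule_img_has_alt(fact):
>             # One rule from the rule set: every img must carry alt.
>             if fact.tag != "img":
>                 return None  # rule does not apply to this fact
>             return "alt" in fact.attrs
>
>         parser = FactExtractor()
>         parser.feed('<img src="a.png" alt="Logo"><img src="b.png">')
>         print([rule_img_has_alt(f) for f in parser.facts])
>         # -> [True, False]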
>
>     One of these tools also had the question-and-answer part, which,
>     as you say, becomes so very tedious so quickly.
>
>     I am still hoping to get one of these applications – designed
>     initially for IBM, though the IP rights belong to the developer –
>     into open source.
>
>     John
>
>     On 25 October 2016 at 14:07, Alistair Garrison
>     <alistair.garrison@ssbbartgroup.com> wrote:
>
>         Hi Wilco, All,
>
>         In the July 2015 Auto-WCAG blog -
>         https://www.w3.org/community/auto-wcag/2015/07/24/introducing-the-auto-wcag-user-input-template/,
>         under “Next steps”, I read that:
>
>         “Some participants of the auto-wcag community group are
>         currently implementing the prototype of a User Testing Tool
>         based on the questions developed in the structured approach
>         described in this post. The tool runs in the user’s web
>         browser and connects to a database storing the user input.”
>
>         Out of interest, could I ask which participants are working on
>         this “expert-system” tool? And whether work is still under way?
>
>         I too developed an interview-based expert system ages ago –
>         for testing the accessibility of a web page (thankfully pages
>         were more static back then).
>
>         With all such systems you call your tests “rules”, and you
>         follow a very similar grammar to the one proposed in
>         Auto-WCAG.  I used Jess formatting initially
>         (http://herzberg.ca.sandia.gov/), then developed my own system…
>
>         I finalised my expert system some years ago – it looked at
>         WCAG 1.0 AA.  I demoed it to several organisations, and got
>         some good reviews!
>
>         The issue was that, although it was an interesting way to
>         proceed, only when you actually used it for commercial audits
>         did you realize how slow such a process is. The same questions
>         have to be asked again and again of the user – for example,
>         for each img node – which is overkill if you are only looking
>         to find enough faults to show something is an issue.
>
>         For example,
>         http://wilcofiers.github.io/auto-wcag/rules/SC1-1-1-text-alternative.html
>         contains questions you need to ask the user about each image –
>         “Is this element solely for decorative purposes?”
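>
>         As a hypothetical sketch (names invented for illustration),
>         the interview loop boils down to one prompt per img node, on
>         every page audited:
>
>             # Illustrative only: the same question, once per img node.
>             def interview(image_sources):
>                 answers = {}
>                 for src in image_sources:
>                     q = f"Is {src} solely for decorative purposes? [y/n] "
>                     answers[src] = input(q).strip().lower() == "y"
>                 return answers
>
>             # Twenty images means twenty prompts, repeated on every
>             # page audited.
>             # interview(["logo.png", "hero.jpg", "icon-search.svg"])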
>
>         With actual implementation knowledge, it is certainly not an
>         approach I would suggest for large-scale monitoring purposes,
>         as it simply takes too long to assess each page, and it
>         requires human judgement, which can vary wildly. Auto-WCAG
>         tests, being formatted in a very specific way, will also not
>         slip easily into other testing platforms.
>
>         My understanding was that we were concentrating on developing
>         fully automatic tests – which could be plugged into any
>         testing platform – the output from which could easily be compared.
>
>         With manual steps in a number of the current tests, which also
>         include design constraints such as “Presented item - Web page
>         (with title either highlighted or in a separate textbox)”, I
>         think we are making it hard for ourselves to achieve the
>         comparability goal, or even to create tests that achieve
>         Auto-WCAG’s desired aims.
>
>         It would only take a short amount of time to re-assemble the
>         current “rules” into sets of atomic, fully automated tests by
>         leaving the manual testing steps aside, and I wonder if this
>         isn’t the direction we should be moving in instead – it may
>         prove significantly quicker. This, I should also mention,
>         seems to have been the approach of the EIII project from which
>         Auto-WCAG was initially born
>         (http://checkers.eiii.eu/en/tests/).
>
>         My question to the group is: “are we developing Auto-WCAG
>         rules for an expert-system tool?” And, if yes – why exactly?
>
>         I’d be very interested to discuss the above, and hear comments
>         from the whole group.
>
>         All the best
>
>         Alistair
>
>         ---
>
>         Alistair Garrison
>
>         Senior Accessibility Engineer
>
>         SSB Bart Group
>
>


-- 
Frank Berker
fb@ftb-esv.de                       http://ftb-esv.de
FTB - Forschungsinstitut Technologie und Behinderung
Grundschötteler Strasse 40,             58300 Wetter
Telefon: 02335/9681-34        Telefax: 02335/9681-19

Received on Wednesday, 26 October 2016 11:01:13 UTC