Re: Auto-WCAG - Expert system approach?

Dear All,

The diagrams on https://www.w3.org/community/auto-wcag/wiki/SC1-1-1-text-alternative#Activity_diagram_of_transitions_between_steps_1_to_18 indicate how complex this way of creating rules has in fact become – interweaving many steps into giant “rules”. If one step changes, or if new steps need to be added, many things may break.

Personally, I’d be happiest if Auto-WCAG’s first step was to create all those separate atomic tests that:


1) Are quick to define; and
2) Are quick to get group agreement on.

There must be a reasonable number of these, for example, checking if “one or more iframe nodes do not have a title attribute”. [Not yet discussing the quality of the title text if provided, just whether it is present.]
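To illustrate how small and self-contained such a test could be (purely a sketch of my own, not an agreed rule, and the function name is made up), something along these lines would suffice:

// Hypothetical atomic check: fail if one or more iframe nodes lack a title attribute.
// Runs in the browser against the live DOM.
function iframesMissingTitle(doc: Document): HTMLIFrameElement[] {
  return Array.from(doc.querySelectorAll<HTMLIFrameElement>('iframe'))
    .filter((frame) => !frame.hasAttribute('title'));
}

// The atomic test passes when the list is empty, and fails otherwise.
const offenders = iframesMissingTitle(document);
console.log(offenders.length === 0 ? 'pass' : 'fail: ' + offenders.length + ' iframe(s) without a title attribute');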

To my mind, this would allow a quick win for Auto-WCAG, as we’d almost certainly be able to step quickly towards at least a low level of testing parity between tools – whilst providing fully automatic tests for monitoring like those in http://checkers.eiii.eu/en/tests/.


Then we could move on to more complicated (but still atomic) tests like checking if “One or more img nodes (excluding those that have an alt attribute set to a null value, or a role="presentation" attribute), available in the DOM, do not have a mechanism that allows an accessible name value to be calculated”.  The “available in the DOM” aspect means not hidden by a computed CSS “display:none” style.
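To make the shape of that test concrete, here is a rough sketch of my own (not an agreed rule; the attribute checks stand in for the full accessible name computation, and hidden ancestors are ignored for brevity):

// Hypothetical sketch: img nodes available in the DOM (not display:none) that are
// not marked decorative (alt="" or role="presentation") and offer no obvious
// source for an accessible name. A real rule would run the full accessible
// name computation and also consider hidden ancestors.
function imgsWithoutAccessibleName(doc: Document): HTMLImageElement[] {
  return Array.from(doc.querySelectorAll<HTMLImageElement>('img')).filter((img) => {
    const hidden = getComputedStyle(img).display === 'none';
    const decorative =
      img.getAttribute('alt') === '' || img.getAttribute('role') === 'presentation';
    const hasNameSource =
      img.hasAttribute('alt') ||
      img.hasAttribute('aria-label') ||
      img.hasAttribute('aria-labelledby') ||
      img.hasAttribute('title');
    return !hidden && !decorative && !hasNameSource;
  });
}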

These tests are where we could inject guesses about what constitutes a decorative image (e.g. height <= 5px or width <= 3px), or what constitutes suspicious text in an iframe title, etc. – all of which we could discuss.
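For example, the size heuristic could be injected as a small, replaceable predicate (the thresholds are just the example figures above, deliberately open to discussion):

// Hypothetical heuristic: treat very small rendered images as probably decorative.
// The default thresholds are only the example values mentioned above.
function looksDecorative(img: HTMLImageElement, maxHeight = 5, maxWidth = 3): boolean {
  const rect = img.getBoundingClientRect();
  return rect.height <= maxHeight || rect.width <= maxWidth;
}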

The quality of the text could be asked about in separate tests involving user judgement – really ones that simply collect content for the user to look at.

This more atomic implementation would, again to my mind, allow easier adoption of Auto-WCAG tests into more mainstream products – as some could be adopted into automatic testing engines, and others into semi-automatic content collection / visual testing tools.

All the best

Alistair

Alistair Garrison
Senior Accessibility Engineer
SSB Bart Group



From: Frank Berker <fb@ftb-esv.de>
Date: Wednesday, 26 October 2016 at 12:00
To: "public-auto-wcag@w3.org" <public-auto-wcag@w3.org>
Subject: Re: Auto-WCAG - Expert system approach?
Resent-From: <public-auto-wcag@w3.org>
Resent-Date: Wednesday, 26 October 2016 at 12:01

Hi Alistair, John and all,

we faced the issue of needing human input for valid conformance testing quite early:
RE: Human input in the Auto-WCAG test-cases<https://lists.w3.org/Archives/Public/public-auto-wcag/2014Oct/0028.html>

And we came to the decision that we should try to provide the full test chain, avoiding "earl:cantTell" outcomes. This also led to the development of the UTT bookmarklet.

The plan was to leave the choice to developers whether they implement just the automatic parts or the full specification. We also specified the knowledge an assertor must have to provide a reliable assessment:
Assertor requirements (optional)<https://www.w3.org/community/auto-wcag/wiki/Introduction_to_auto-wcag_test_design#Assertor_requirements_.28optional.29>

Besides, human input is also included in the test procedure of some sufficient techniques.
"Procedure

Examine each img element in the content
Check that each img element which conveys meaning contains an alt attribute.
If the image contains words that are important to understanding the content, the words are included in the text alternative." H37: Using alt attributes on img elements<https://www.w3.org/WAI/GL/2016/WD-WCAG20-TECHS-20160105/H37>

Even if it's clear, to underline the point, take the 1.1.1 example: if all images on a page contain alt attributes with the value "Lorem ipsum", this must fail. Without human input, a non-empty textual alternative can only be "earl:cantTell". And lots of tools produce lots of such outcomes, which force an evaluator to look at each image. This could and should be supported by a tool.
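For instance (a sketch of my own, with an illustrative word list, not part of any agreed rule), a tool could at least narrow the review down by flagging alt values that look like placeholder text and handing only those to the evaluator:

// Hypothetical sketch: flag alt values that look like placeholder text so a human
// only has to review the suspicious cases instead of every image on the page.
const PLACEHOLDER_PATTERNS = [/lorem ipsum/i, /^image$/i, /^untitled/i, /^img_?\d+$/i];

function suspiciousAltImages(doc: Document): HTMLImageElement[] {
  return Array.from(doc.querySelectorAll<HTMLImageElement>('img')).filter((img) => {
    const alt = img.getAttribute('alt');
    return alt !== null && alt !== '' &&
      PLACEHOLDER_PATTERNS.some((pattern) => pattern.test(alt));
  });
}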

According to the Activity diagram of transitions between steps 1 to 18<https://www.w3.org/community/auto-wcag/wiki/SC1-1-1-text-alternative#Activity_diagram_of_transitions_between_steps_1_to_18>, fail1, pass3/fail4, pass4, fail6 and pass6/fail7 can be assessed fully automatically.

But as far as I have read, how user testing and automated testing come together in ACT Rules will be a topic in the spec:
ACT Framework Spec<https://www.w3.org/WAI/GL/task-forces/conformance-testing/wiki/ACT_Deliverables#1._ACT_Framework_Spec>

So it definitely will have to be discussed.

Best regards
Frank

Am 26.10.2016 um 10:50 schrieb John Hicks:
Hello
Apologies if I misconstrued the question!

On a related issue, in the GitHub repository we often see "Ruletype: automatic" but then in the steps "Get user input".
For me an automatic rule (as well as an "expert system") is always without user input (once it's running, of course).

Maybe some definitions would be useful; as far as I understand it, a user-input question/answer system is not an "Expert System" in the traditional AI sense.

I second the call for clarification from the others.

John




On 26 October 2016 at 10:16, Alistair Garrison <alistair.garrison@ssbbartgroup.com<mailto:alistair.garrison@ssbbartgroup.com>> wrote:
Hi John, All,

The fundamental shift over to expert-system type “rules” also came with the development of https://www.w3.org/community/auto-wcag/wiki/Template:UserInput.  This shift would almost certainly not have been made based on a passing reference to a tool by a participant.  Certainly, for a shift to take place, group consensus must have been sought – about moving to expert-system type rules, over the more atomic rules people had previously been working on (for example, in the EIII project’s http://checkers.eiii.eu/).

Mikael mentioned the “EIII project’s User Testing Tool”, which could well have been the expert system being referenced – but this is very much in its infancy, and seemingly no longer in development.

So then, if no one in the group is currently developing an expert system (is anyone?), and we have several in the group developing tools that run more atomic tests (like the EIII tests), my question to the group is still: “are we developing Auto-WCAG rules for an expert system tool?” And, if yes – why exactly?

Especially when all my personal experience to date, and seemingly John’s personal experience (with regard to the first bullet), indicate that such expert-system interview-based tools:


- become so very tedious so quickly to use;
- take ages to develop (as “rules” are so inter-dependent);
- will almost certainly not fit the broad-scale auto-monitoring use case, as they require user interaction at many interim stages in tests; and
- are only as accurate as the user’s judgement – which is surely what we’re trying to avoid by having a base set of agreed fully automatic tests we could compare results from.

Again, very interested to hear from Wilco / group members.

All the best

Alistair

Alistair Garrison
Senior Accessibility Engineer
SSB Bart Group

From: John Hicks <jwjhix@gmail.com<mailto:jwjhix@gmail.com>>
Date: Tuesday, 25 October 2016 at 13:24
To: Alistair Garrison <alistair.garrison@ssbbartgroup.com<mailto:alistair.garrison@ssbbartgroup.com>>
Cc: "public-auto-wcag@w3.org<mailto:public-auto-wcag@w3.org>" <public-auto-wcag@w3.org<mailto:public-auto-wcag@w3.org>>
Subject: Re: Auto-WCAG - Expert system approach?

Dear Alistair,
I am not sure exactly which meeting it was, or if it referred to something I might have said:
Urbilog.fr has developed 3 automatic testing tools based on an Expert System in the AI sense.

The "Expert" part was not user input but the CLIPS expert system created by NASA:  https://en.wikipedia.org/wiki/CLIPS

The essence of the idea is to transform each HTML element into a statement in a declarative language and then test the truth of these statements w.r.t. the rule-set in question (Section 508 and WCAG 1 in the beginning, RGAA later).
One of these tools also had the question-and-answer part, which, as you say, becomes so very tedious so quickly.
I am still hoping to get one of these applications, initially designed for IBM but whose IP rights belong to the developer, into open source.
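For what it is worth, a rough illustration of that element-to-statement transformation (written in TypeScript rather than CLIPS, with made-up names, just to show the shape of the idea):

// Hypothetical illustration of the fact/rule split: each element becomes a
// declarative fact, and each rule is a predicate evaluated over the fact base.
interface ElementFact {
  tag: string;
  attributes: Record<string, string>;
}

function toFacts(doc: Document): ElementFact[] {
  return Array.from(doc.querySelectorAll('*')).map((el) => ({
    tag: el.tagName.toLowerCase(),
    attributes: Object.fromEntries(
      Array.from(el.attributes).map((attr) => [attr.name, attr.value])
    ),
  }));
}

// Example rule: every img fact must carry an alt attribute.
const imgHasAlt = (facts: ElementFact[]): boolean =>
  facts.filter((f) => f.tag === 'img').every((f) => 'alt' in f.attributes);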
John



On 25 October 2016 at 14:07, Alistair Garrison <alistair.garrison@ssbbartgroup.com<mailto:alistair.garrison@ssbbartgroup.com>> wrote:
Hi Wilco, All,

In the July 2015 Auto-WCAG blog - https://www.w3.org/community/auto-wcag/2015/07/24/introducing-the-auto-wcag-user-input-template/, under Next steps I was reading that:

“Some participants of the auto-wcag community group are currently implementing the prototype of a User Testing Tool based on the questions developed in the structured approach described in this post. The tool runs in the user’s web browser and connects to a database storing the user input.”

Out of interest, could I ask which participants are working on this “expert-system” tool? And whether work is still under way?
I too developed an interview-based expert system ages ago – for testing the accessibility of a web page (thankfully pages were more static back then).

With all such systems you call your tests “rules”, and you follow a very similar grammar to the one proposed in Auto-WCAG.  I used Jess formatting initially (http://herzberg.ca.sandia.gov/), then developed my own system…
I finalised my expert system some years ago – it looked at WCAG 1.0 AA.  I demoed it to several organisations, and got some good reviews!
The issue was that, although it was an interesting way to proceed, only when you actually used it for commercial audits did you realize how slow such a process is.  The same questions have to be asked again and again of the user – for example, for each img node – which is overkill if you are only looking to find enough faults to show something is an issue.

For example, http://wilcofiers.github.io/auto-wcag/rules/SC1-1-1-text-alternative.html contains questions you need to ask the user about each image – “Is this element solely for decorative purposes?”

With actual implementation knowledge, it is certainly not an approach I would suggest for large-scale monitoring purposes, as it simply takes too long to assess each page, and it requires human judgement, which can vary wildly.   Auto-WCAG tests, being formatted in a very specific way, will also not slip easily into other testing platforms.
My understanding was that we were concentrating on developing fully automatic tests – which could be plugged into any testing platform, and the output from which could easily be compared.

With manual steps in a number of the current tests, which also include design constraints such as “Presented item - Web page (with title either highlighted or in a seperate textbox)”, I think we are making it hard for ourselves to achieve the comparability goal, or even to create tests that achieve Auto-WCAG’s desired aims.

It would only take a short amount of time to re-assemble the current “rules” into sets of atomic, fully automated tests by leaving the manual testing steps aside; I wonder if this isn’t the direction we should be moving in instead, and it may prove significantly quicker.  I should also mention that this seems to have been the approach of the EIII project from which Auto-WCAG was initially born (http://checkers.eiii.eu/en/tests/).

My question to the group is: “are we developing Auto-WCAG rules for an expert system tool?” And, if yes – why exactly?

I’d be very interested to discuss the above, and hear comments from the whole group.
All the best
Alistair
---
Alistair Garrison
Senior Accessibility Engineer
SSB Bart Group






--

Frank Berker

fb@ftb-esv.de<mailto:fb@ftb-esv.de>                      http://ftb-esv.de


FTB - Forschungsinstitut Technologie und Behinderung

Grundschötteler Strasse 40,             58300 Wetter

Telefon: 02335/9681-34        Telefax: 02335/9681-19

Received on Wednesday, 26 October 2016 12:33:19 UTC