This is a submission for the RDWG Symposium on Website Accessibility Metrics. It has not yet been reviewed or accepted for publication. Please refer to the RDWG Frequently Asked Questions (FAQ) for more information about RDWG symposia and publications.
Templates are widely used in Web development. Estimates indicate that 40-50% of Web content uses templates[1]. Automatic accessibility tools report errors and warnings on pages built from those templates, i.e. pages where the original templates have already been merged and filled with the specific Web content. Consequently, if a template presents an accessibility problem, that problem is reported as often as the template is used within the page or, at the site level, within the site.
From the perspective of end users, or of a page/site accessibility evaluation, that number of problems corresponds to the page/site quality. From the developers’ perspective, however, standard accessibility evaluation tools provide obfuscating results. The same error is repeated constantly, producing unnecessarily long reports that confuse developers[2] and conceal the fundamental repair issues.
Metrics for accessibility evaluation have the same issue. A bad result, deriving from a large number of errors in a page or site assessment, may actually be the consequence of a small number of problems in a frequently used template that can be rapidly corrected. Therefore, from a developer's perspective, common accessibility metrics may be misleading.
We can decompose our research question into:
Template detection is often used in the fields of Information Retrieval (IR) and page clustering. In IR, template detection and removal increase precision[3] and can positively impact performance and resource usage when analysing HTML pages[4].
Although most work on templates ignores accessibility, their use has already been proposed as a means to improve it[5]. In fact, if templates are made accessible and widely used, there is a lower probability of having inaccessible pages. Otherwise, the possible errors will propagate, causing the previously mentioned issues.
Metrics, such as UWEM[6] and WAQM[7], are invaluable to assess page accessibility. These provide different perspectives of the accessibility quality. However, none directly addresses the developers' efforts in relation to the common development process. Templates are fundamental for this process and must be considered.
To identify templates, we propose the use of a simple algorithm that identifies common elements amongst the HTML DOM trees, the Fast Match algorithm[8]. Although common elements may not coincide with the template, they offer a reasonable estimate for an initial assessment. We then modified QualWeb, an automatic accessibility evaluator[9], to optionally apply the algorithm. With the template mode on, the tool accepts a set of pages, identifies common nodes and evaluates the nodes using WCAG 2.0 techniques.
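The idea of treating nodes shared by two DOM trees as a template estimate can be sketched as follows. This is not the Fast Match algorithm and not QualWeb code: it is a simplified stand-in that assumes two nodes match when they have the same path from the root and the same attributes, and it ignores text content. The names `SignatureCollector`, `signatures` and `split_nodes` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not stay on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class SignatureCollector(HTMLParser):
    """Counts one signature per element: (path from root, sorted attributes)."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.signatures = Counter()

    def handle_starttag(self, tag, attrs):
        path = "/".join(self.stack + [tag])
        attributes = tuple(sorted((k, v or "") for k, v in attrs))
        self.signatures[(path, attributes)] += 1
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def signatures(html):
    """Return the multiset of element signatures found in an HTML string."""
    collector = SignatureCollector()
    collector.feed(html)
    collector.close()
    return collector.signatures


def split_nodes(page, reference):
    """Split the nodes of `page` into (template, specific) multisets.

    Assumption: a node belongs to the template estimate when the same
    signature also occurs in `reference` (for example, the home page).
    """
    page_sigs, ref_sigs = signatures(page), signatures(reference)
    template = page_sigs & ref_sigs   # nodes present in both pages
    specific = page_sigs - template   # nodes found only in `page`
    return template, specific


home = ('<html><body><div id="nav"><a href="/">Home</a></div>'
        '<p class="intro">Welcome</p></body></html>')
news = ('<html><body><div id="nav"><a href="/">Home</a></div>'
        '<img src="x.png"><h1>News</h1></body></html>')
template, specific = split_nodes(news, home)
```

In this toy input, the shared `html`, `body`, `div` and `a` elements land in the template estimate, while the `img` and `h1` elements are specific to the second page. A real tree-matching algorithm would also compare subtree content and tolerate small differences between nodes.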
To address the first part of the research question, we prepared a simple study comparing each Web page with its home page, to identify common elements. The results provide an indication of the percentage of accessibility issues that are detected on templates shared between those two pages. For a deeper analysis one should consider within-page templates and templates across several other pages of the same site.
To address reporting, we modified QualWeb to make it template-aware. Reporting is aggregated in two sets: template (common nodes) and specific (nodes unique to a page). This way, problems (errors and warnings) are reported only once if they occur in similar nodes. In the template set, each reported problem indicates the number of occurrences (of the common node/template).
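A minimal sketch of this two-set aggregation is given below. It does not reproduce QualWeb's report format: the input shape (a dictionary of per-page problem tuples), the node identifiers and the function name `aggregate` are assumptions made for illustration, and the technique identifiers are used only as example labels.

```python
from collections import Counter


def aggregate(pages, template_nodes):
    """Group evaluation results into a template set and a specific set.

    pages: {page_url: [(node, technique, outcome), ...]} (hypothetical shape)
    template_nodes: identifiers of the nodes shared across pages
    """
    template = Counter()   # each template problem once, with its occurrences
    specific = {}          # problems on nodes unique to each page
    for url, problems in pages.items():
        specific[url] = []
        for node, technique, outcome in problems:
            if node in template_nodes:
                template[(node, technique, outcome)] += 1
            else:
                specific[url].append((node, technique, outcome))
    return template, specific


pages = {
    "/": [("header/img", "H37", "fail"), ("main/form/input", "H44", "warn")],
    "/news": [("header/img", "H37", "fail")],
}
template, specific = aggregate(pages, template_nodes={"header/img"})
```

Here the failure on the shared header image appears as a single template entry with an occurrence count of 2, and only the form warning remains in the specific set of the first page.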
To address the accessibility quality of a page/site, for a developer, we combine these sets’ assessment as follows:
$$\mathrm{A}(p_i) = \alpha_t + \alpha_s(p_i)$$
$$\mathrm{A}(S) = \alpha_t + \sum_{i=0}^{n} \alpha_s(p_i)$$
The first equation gives the accessibility problems of a page, Α(pi), combining the number of problems on the template set, αt, with those on the specific part of the page, αs(pi). The second equation applies to a site and thus sums the specific part for each page.
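The two scores can be sketched in a few lines, assuming that "combining" means addition, so that template problems are counted once and specific problems are added per page. The function names and the toy numbers are hypothetical; `naive_score` further assumes that every template problem is reported on every page.

```python
def page_score(alpha_t, alpha_s_page):
    """Problems a developer must address for one page: template + specific."""
    return alpha_t + alpha_s_page


def site_score(alpha_t, alpha_s_pages):
    """Problems for a whole site: template counted once, specifics summed."""
    return alpha_t + sum(alpha_s_pages)


def naive_score(alpha_t, alpha_s_pages):
    """Template-unaware count, assuming the template recurs on every page."""
    return sum(alpha_t + alpha_s for alpha_s in alpha_s_pages)


alpha_t = 3               # problems found on the template set
alpha_s_pages = [2, 0, 5]  # specific problems on three pages

aware = site_score(alpha_t, alpha_s_pages)
naive = naive_score(alpha_t, alpha_s_pages)
```

With these toy values the template-aware site score is 10, whereas the naive count is 16 because the three template problems are counted once per page.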
The study for assessing the impact of templates was also used to produce an initial quantitative assessment of reporting and metrics. We selected seven sites: four representative Web sites (Google, Wikipedia, Facebook and Amazon), two Portuguese newspapers (DN and Publico), and WordPress.
Major limitations relate to template detection:
The results show the percentage of template vs. specific outcomes from WCAG 2.0 techniques (i.e., pass, warn or fail), considering all seven sites. The average for the template set is 38.85% (σ = 7.48). Of those, 34.5% (σ = 7.0) were warnings and 0.8% (σ = 1.0) were fails. Therefore, about 35% of the issues would be addressed twice during repair if templates were not considered.
Site | w/o templates | αt | ∑i=0..n αs(pi) | Α(S) |
---|---|---|---|---|
DN | 291476 | 2589 | 211276 | 213865 |
Wordtaps | 320890 | 3245 | 207380 | 210625 |
The results show a significant decrease in the scores. In column 2 (w/o templates) we simply add the reported accessibility problems of each page of the site. Column 5 (Α(S)) reflects the scores of the proposed metric, i.e., the problems that should be addressed by the developer. Clearly, developers would be misled if standard metrics were considered.
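As a worked check of the table, the short sketch below recomputes Α(S) from its two components and derives the relative decrease against the template-unaware count. The figures are copied from the table rows; the row labels are kept as they appear there, and the variable names and the `reduction` measure are added here for illustration only.

```python
# (w/o templates, alpha_t, sum of alpha_s over pages), copied from the table
rows = {
    "DN": (291476, 2589, 211276),
    "Wordtaps": (320890, 3245, 207380),
}

results = {}
for site, (without_templates, alpha_t, alpha_s_total) in rows.items():
    score = alpha_t + alpha_s_total               # template-aware site score
    reduction = 1 - score / without_templates     # relative decrease
    results[site] = (score, reduction)
    print(f"{site}: A(S) = {score}, decrease = {reduction:.1%}")
```

The recomputed scores match column 5 of the table, and the decrease works out to roughly 27% for the first row and 34% for the second.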
Open research avenues are many, but we can identify three main directions: more accurate template detection (and backtracking); intra-page and extra-page templates; and developers’ assessment.
This work was funded by Fundação para a Ciência e Tecnologia (FCT) through the QualWeb national research project PTDC/EIA-EIA/105079/2008, the Multiannual Funding Programme, and POSC/EU.