Validation error frequencies

From: Henri Sivonen <hsivonen@iki.fi>
Date: Thu, 31 Jan 2008 14:26:43 +0200
Message-Id: <89C8826C-CEEA-4DEC-8462-98BAE62FF804@iki.fi>
To: HTML Issue Tracking WG <public-html@w3.org>

I ran an analysis on recent error messages from Validator.nu.

The first number is the number of occurrences. The second number is
the total number of distinct URIs that were analyzed.

The analyzed pages were those that users of Validator.nu chose to  
validate. Only errors for public Web pages were logged. Content POSTed  
to Validator.nu is not covered. Pages whose URI contained "/test/" or  
"/tests/" were excluded. URIs and IDs were replaced with "(redacted)"  
before tallying the results. A given error was counted at most once  
per URI, so duplicate errors on one page count only once. Other than  
the "(redacted)" bits, messages are not intelligently consolidated.  
Only (X)HTML5 errors were logged and analyzed. This does not
include data from the XHTML 1.0 / HTML 4.01 features of Validator.nu.
The messages are not exactly in the decorated form of the UI: even
messages pertaining to text/html have the XHTML cruft in them.
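
The counting rules above can be sketched roughly as follows. This is
only an illustration, not the script that was actually run: the
(uri, message) log format, the function names and both redaction
patterns are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical redaction patterns; the real rules are not given in this message.
URI_PATTERN = re.compile(r"https?://[^\s\"'<>]+")
ID_PATTERN = re.compile(r'\bID "[^"]*"')


def is_excluded(uri):
    """Skip pages whose URI contains "/test/" or "/tests/"."""
    return "/test/" in uri or "/tests/" in uri


def redact(message):
    """Replace URIs and quoted IDs in a message with "(redacted)"."""
    message = URI_PATTERN.sub("(redacted)", message)
    return ID_PATTERN.sub("ID (redacted)", message)


def tally(log_entries):
    """Count each redacted message at most once per URI.

    log_entries is an iterable of (uri, message) pairs (assumed format).
    Returns (Counter of message -> occurrences, number of distinct URIs).
    """
    seen = set()            # (uri, redacted message) pairs already counted
    analyzed_uris = set()   # distinct URIs that passed the exclusion filter
    counts = Counter()
    for uri, message in log_entries:
        if is_excluded(uri):
            continue
        analyzed_uris.add(uri)
        key = (uri, redact(message))
        if key not in seen:
            seen.add(key)
            counts[key[1]] += 1
    return counts, len(analyzed_uris)
```

With this sketch, two duplicate-ID errors on the same page collapse
into one occurrence once the IDs are redacted, and a page under
"/test/" contributes nothing to either number.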

Currently people are mainly using the HTML5 features of Validator.nu  
to validate pre-HTML5 content as HTML5. Validator.nu doesn't support  
<font> but supports style='' on every element.

After the December content model change, element containment errors  
are no longer an issue for updating legacy templates. Now the most  
common errors pertain to attributes obsoleted by HTML5 and to spaces  
in IRIs (and to legacy doctypes, of course).

I hope the WG finds this data useful for spec development.

Henri Sivonen
Received on Thursday, 31 January 2008 12:26:57 UTC