Re: Bug 85/4494 (keeping track of validation statistics for various purposes)

From: Brian Wilson <bloo@blooberry.com>
Date: Wed, 6 Feb 2008 09:17:34 -0800 (PST)
To: olivier Thereaux <ot@w3.org>
cc: www-validator@w3.org
Message-ID: <Pine.SUN.4.58.0802060903050.24533@eskimo.com>

On Wed, 6 Feb 2008, olivier Thereaux wrote:

> * stats on the documents themselves. Doctype, mime type, charset.
> Ideally, whether charset is in HTTP, XML decl, meta. There are
> existing studies about these, but another study made on a different
> sample would bring more perspective.

That should be doable.
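
A rough sketch of how the "where is the charset declared" part might be
collected per document. The function name, the 4096-byte window, and the
regular expressions are assumptions for illustration, not anything taken
from the validator's code:

```python
import re

def charset_sources(content_type, body):
    """Report the charset declared in the HTTP header, XML declaration, and meta element.

    content_type: value of the HTTP Content-Type header (or None)
    body: raw document bytes
    Hypothetical helper; a real crawl would need more careful sniffing.
    """
    found = {"http": None, "xml_decl": None, "meta": None}

    # HTTP header, e.g. "text/html; charset=utf-8"
    m = re.search(r'charset\s*=\s*"?([^\s;"]+)', content_type or "", re.I)
    if m:
        found["http"] = m.group(1).lower()

    # Declarations normally sit near the top, so only look at the first
    # 4096 bytes (an arbitrary cutoff). latin-1 never fails to decode.
    head = body[:4096].decode("latin-1")

    # XML declaration, e.g. <?xml version="1.0" encoding="utf-8"?>
    m = re.match(r'\s*<\?xml[^>]*?encoding\s*=\s*["\']([^"\']+)["\']', head)
    if m:
        found["xml_decl"] = m.group(1).lower()

    # meta element, either the http-equiv form or <meta charset=...>
    m = re.search(r'<meta[^>]+charset\s*=\s*["\']?([^\s"\';>]+)', head, re.I)
    if m:
        found["meta"] = m.group(1).lower()

    return found
```

Recording all three values separately, rather than only the one that wins,
also makes it possible to count documents whose declarations disagree.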

> * precise values for the error messages. Knowing which type of error
> is "popular" will be very useful, but so would knowing what the
> offending attributes/element/construct. In other words, knowing that
> "unknown attribute" is the #1 error will be great; knowing that the
> top unknown attributes are frameborder or whatnot will be awesome.

I wanted to do this too, early on. I tried to customize the templating
system myself and intercept the arguments that were being passed via
error_messages.cfg, but I just did not understand the way things were
working. Specifically, in
http://dev.w3.org/cvsweb/validator/share/templates/en_US/error_messages.cfg?rev=1.32&content-type=text/x-cvsweb-markup
I wanted to preserve all the %1, %2, ... arguments (it looks like error #136
has the most arguments, at 6). While it may seem esoteric and pointless for
just about *everyone* else's needs, adding some sort of abbreviated message
of this type to the SOAP output might be interesting:

            <m:error>
                <m:line>596</m:line>
                <m:col>1169</m:col>
                <m:message>end tag for &quot;UL&quot; which is not finished</m:message>
                <m:messageid>73</m:messageid>
                <m:messagearg>&quot;UL&quot;</m:messagearg>
                <m:explanation>[stuff deleted]</m:explanation>
                <m:source>[stuff deleted]</m:source>
            </m:error>

where each successive m:messagearg element captures one of the variable
arguments (%1, %2, ...) used in the error message. Storing the message ID
plus its arguments is much more compact than storing the entire error message.
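
To illustrate what a crawler could do with such output, here is a minimal
sketch that reduces each error to a (messageid, arguments) pair and counts
them. It assumes the proposed m:messagearg element exists, which it does not
today; the namespace URI, the m:errorlist wrapper, and the function names are
also assumptions, and every sample error after the first is made up:

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed namespace for the "m:" prefix; adjust to whatever the service emits.
NS = {"m": "http://www.w3.org/2005/10/markup-validator"}

# Sample response fragment. The first error mirrors the example above;
# the other two are invented purely to give the tally something to count.
SAMPLE = """\
<m:errorlist xmlns:m="http://www.w3.org/2005/10/markup-validator">
  <m:error>
    <m:line>596</m:line>
    <m:col>1169</m:col>
    <m:message>end tag for &quot;UL&quot; which is not finished</m:message>
    <m:messageid>73</m:messageid>
    <m:messagearg>&quot;UL&quot;</m:messagearg>
  </m:error>
  <m:error>
    <m:messageid>73</m:messageid>
    <m:messagearg>&quot;LI&quot;</m:messagearg>
  </m:error>
  <m:error>
    <m:messageid>73</m:messageid>
    <m:messagearg>&quot;UL&quot;</m:messagearg>
  </m:error>
</m:errorlist>
"""

def compact_errors(xml_text):
    """Return one (messageid, (arg1, arg2, ...)) tuple per m:error element."""
    root = ET.fromstring(xml_text)
    records = []
    for err in root.iter("{%s}error" % NS["m"]):
        msgid = err.findtext("m:messageid", namespaces=NS)
        # Successive m:messagearg elements correspond to %1, %2, ... in order.
        args = tuple(a.text or "" for a in err.findall("m:messagearg", NS))
        records.append((msgid, args))
    return records

def tally(records):
    """Count how often each (messageid, args) combination occurs."""
    return Counter(records)

if __name__ == "__main__":
    for (msgid, args), count in tally(compact_errors(SAMPLE)).most_common():
        print(msgid, args, count)
```

Keeping the arguments as a tuple next to the ID means the "top unknown
attributes" kind of question becomes a simple count, without parsing the
human-readable message text.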

Ignoring (for the moment) whether such a feature would be useful to anyone
else, would it be hard to add? If it could be added, I could grab that
information in a future crawl.

-Brian

Brian Wilson --------------------------"Those aren't Sex muffins!   -Coach
bloo@blooberry.com ---------------------Those aren't Love muffins!
http://www.blooberry.com ---------------Those are just BLOOberry muffins!"
Creator of Index DOT Html/Css: http://www.blooberry.com/indexdot/
Received on Wednesday, 6 February 2008 17:17:46 GMT
