Re: Bug 85/4494 (keeping track of validation statistics for various purposes)

From: Brian Wilson <bloo@blooberry.com>
Date: Wed, 6 Feb 2008 09:17:34 -0800 (PST)
To: olivier Thereaux <ot@w3.org>
cc: www-validator@w3.org
Message-ID: <Pine.SUN.4.58.0802060903050.24533@eskimo.com>

On Wed, 6 Feb 2008, olivier Thereaux wrote:

> * stats on the documents themselves. Doctype, mime type, charset.
> Ideally, whether charset is in HTTP, XML decl, meta. There are
> existing studies about these, but another study made on a different
> sample would bring more perspective.

That should be doable.

> * precise values for the error messages. Knowing which type of error
> is "popular" will be very useful, but so would knowing what the
> offending attributes/element/construct. In other words, knowing that
> "unknown attribute" is the #1 error will be great; knowing that the
> top unknown attributes are frameborder or whatnot will be awesome.

I wanted to do this too, early on. I tried to customize the templating
system myself and intercept the arguments that were being passed via
error_messages.cfg, but I just did not understand the way things were
working. Specifically, in
I wanted to preserve all the %1, %2, ... arguments (it looks like err #136
has the most arguments, at 6). While it probably seems esoteric and totally
pointless for *everyone* else's needs, adding some sort of abbreviated
message of this type to the SOAP output might be interesting:

                <m:message>end tag for &quot;UL&quot; which is not
                <m:explanation>[stuff deleted]</m:explanation>
                <m:source>[stuff deleted]</m:source>

where each successive m:messagearg element captures the variable arguments
used in the error message. This is much more compact than storing the
entire error message.
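
To make the idea concrete, here is a rough sketch in Python (not the
validator's own code, and not how its templating actually works). It
assumes a message template with %1, %2, ... placeholders and a list of
arguments. The template text, the function names, and the exact output
layout are all made up for illustration; m:messagearg is only the element
name proposed above.

```python
import re
from xml.sax.saxutils import escape

# Escape double quotes as well, to match the &quot; style used in the SOAP output.
QUOTE_ENTITIES = {'"': "&quot;"}


def fill_template(template, args):
    """Replace %1, %2, ... in template with the matching entry of args."""
    def substitute(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched if no argument was supplied for it.
        return args[index] if 0 <= index < len(args) else match.group(0)
    return re.sub(r"%(\d+)", substitute, template)


def to_soap_fragment(template, args):
    """Build the full message element plus one m:messagearg per argument."""
    message = escape(fill_template(template, args), QUOTE_ENTITIES)
    lines = ["<m:message>%s</m:message>" % message]
    for arg in args:
        lines.append("<m:messagearg>%s</m:messagearg>" % escape(arg, QUOTE_ENTITIES))
    return "\n".join(lines)


# Hypothetical template and arguments, for illustration only.
TEMPLATE = 'unknown attribute "%1" on element "%2"'
print(to_soap_fragment(TEMPLATE, ["frameborder", "IFRAME"]))
```

A crawler could then store just the error number and the m:messagearg
values, and rebuild the full text from the template later if needed.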

Ignoring (for the moment) whether such a feature addition would be
useful to anyone else's needs, would that be hard to do? If that could be
added, I could grab that information in a future crawl.


Brian Wilson --------------------------"Those aren't Sex muffins!   -Coach
bloo@blooberry.com ---------------------Those aren't Love muffins!
http://www.blooberry.com ---------------Those are just BLOOberry muffins!"
Creator of Index DOT Html/Css: http://www.blooberry.com/indexdot/
Received on Wednesday, 6 February 2008 17:17:46 UTC
