- From: Brian Wilson <bloo@blooberry.com>
- Date: Wed, 6 Feb 2008 09:17:34 -0800 (PST)
- To: olivier Thereaux <ot@w3.org>
- cc: www-validator@w3.org
On Wed, 6 Feb 2008, olivier Thereaux wrote:
> * stats on the documents themselves. Doctype, mime type, charset.
> Ideally, whether charset is in HTTP, XML decl, meta. There are
> existing studies about these, but another study made on a different
> sample would bring more perspective.
That should be doable.
> * precise values for the error messages. Knowing which type of error
> is "popular" will be very useful, but so would knowing what the
> offending attributes/element/construct. In other words, knowing that
> "unknown attribute" is the #1 error will be great; knowing that the
> top unknown attributes are frameborder or whatnot will be awesome.
I wanted to do this too, early on. I tried to customize the templating
system myself and intercept the arguments that were being passed via
error_messages.cfg, but I just did not understand the way things were
working. Specifically, in
http://dev.w3.org/cvsweb/validator/share/templates/en_US/error_messages.cfg?rev=1.32&content-type=text/x-cvsweb-markup
I wanted to preserve all the %1, %2, ... arguments (err #136 appears to
have the most, at 6). While it may seem esoteric and pointless for
probably *everyone* else's needs, adding an abbreviated message of this
type to the SOAP output might be interesting:
<m:error>
  <m:line>596</m:line>
  <m:col>1169</m:col>
  <m:message>end tag for "UL" which is not finished</m:message>
  <m:messageid>73</m:messageid>
  <m:messagearg>"UL"</m:messagearg>
  <m:explanation>[stuff deleted]</m:explanation>
  <m:source>[stuff deleted]</m:source>
</m:error>
where each successive m:messagearg element captures the variable arguments
used in the error message. This is much more compact than storing the
entire error message.
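
To make the idea concrete, here is a rough sketch of how a crawler might
consume such output, assuming the proposed m:messagearg elements existed.
Several things here are assumptions and not part of the validator: the
namespace URI is a placeholder (the real URI bound to the "m" prefix is
not shown above), the m:errorlist wrapper is only there to hold two
sample errors, and the template string for message 73 is guessed from
the sample message rather than copied from error_messages.cfg.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# Placeholder namespace (assumption): substitute the real URI for the "m" prefix.
NS = {"m": "urn:example:validator"}

# Hypothetical response fragment using the proposed m:messagearg element.
SAMPLE = """<m:errorlist xmlns:m="urn:example:validator">
  <m:error>
    <m:line>596</m:line>
    <m:col>1169</m:col>
    <m:messageid>73</m:messageid>
    <m:messagearg>"UL"</m:messagearg>
  </m:error>
  <m:error>
    <m:line>610</m:line>
    <m:col>12</m:col>
    <m:messageid>73</m:messageid>
    <m:messagearg>"LI"</m:messagearg>
  </m:error>
</m:errorlist>"""


def tally_errors(xml_text):
    """Count occurrences of each (message id, argument tuple) pair."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for err in root.findall("m:error", NS):
        msg_id = err.findtext("m:messageid", default="", namespaces=NS)
        # Successive m:messagearg elements correspond to %1, %2, ... in order.
        args = tuple(a.text or "" for a in err.findall("m:messagearg", NS))
        counts[(msg_id, args)] += 1
    return counts


def expand(template, args):
    """Rebuild the full message by substituting %1, %2, ... with args."""
    def sub(match):
        idx = int(match.group(1)) - 1
        # Leave the placeholder untouched if no matching argument was stored.
        return args[idx] if 0 <= idx < len(args) else match.group(0)
    return re.sub(r"%(\d+)", sub, template)


if __name__ == "__main__":
    for (msg_id, args), n in tally_errors(SAMPLE).most_common():
        print(msg_id, args, n)
    # Template text below is an assumption, not taken from error_messages.cfg.
    print(expand("end tag for %1 which is not finished", ('"UL"',)))
```

Storing only the id and the arguments keeps each record small, and the
full message can still be rebuilt later from the template if needed.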
Ignoring (for the moment) whether such a feature would be useful to
anyone else, would that be hard to do? If it could be added, I could
grab that information in a future crawl.
-Brian
Brian Wilson --------------------------"Those aren't Sex muffins! -Coach
bloo@blooberry.com ---------------------Those aren't Love muffins!
http://www.blooberry.com ---------------Those are just BLOOberry muffins!"
Creator of Index DOT Html/Css: http://www.blooberry.com/indexdot/
Received on Wednesday, 6 February 2008 17:17:46 UTC