- From: Ville Skyttä <ville.skytta@iki.fi>
- Date: Fri, 17 Oct 2008 20:38:13 +0300
- To: www-validator@w3.org
- Cc: Michael Ernst <mernst@alum.mit.edu>
On Friday 17 October 2008, Michael Ernst wrote:
> Sometimes, a user expects that checklink will produce certain warnings.
> Some reasons include robot exclusion rules, password-protected content, and
> errors in automatically-generated content.
>
> A user would prefer checklink to show only the unexpected warnings, rather
> than hiding them in an avalanche of uninteresting output.
>
> This patch adds flags that suppress certain warnings. These flags
> complement the existing --exclude and --exclude-docs flags. (The patch
> also permits --exclude-docs to be supplied multiple times instead of just
> once.)

Thanks for the patch! Some comments follow. (I don't mind discussing these things here on the www-validator mailing list, but I think a better suited place would be either the public-qa-dev mailing list or W3C Bugzilla.)

Because the patch contains two different things (a modification of the existing exclude-docs functionality, and the addition of new options), could you split it into two patches? I hope that's also how it will eventually be committed to CVS, since changes are easier to track that way. We can e.g. first get the exclude-docs change in, then the rest.

The patch appears to drop precompilation and error reporting of the exclude-docs regexp. I don't think that's a good idea, for two reasons. First, by compiling right at the beginning we get the regexp's syntax checked there and can abort immediately with a descriptive message, instead of running into the problem later during the check (when the user might no longer be actively watching the link check progress) and barfing with a more obscure error message. Second, precompiling only once at the beginning is good for performance.

The same considerations seem to apply to the exclude-redirect-prefix regexps.
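To illustrate the precompilation point, here is a minimal sketch. The helper names (compile_patterns, is_excluded) and the example patterns are hypothetical and not taken from checklink; the sketch only assumes that patterns arrive as plain strings from the command line.

```perl
use strict;
use warnings;

# Hypothetical helper, not checklink's actual code: compile each
# user-supplied pattern once, dying with a descriptive message if the
# pattern has a syntax error.
sub compile_patterns {
    my ($option_name, @patterns) = @_;
    my @compiled;
    for my $pattern (@patterns) {
        my $regexp = eval { qr/$pattern/ };
        die "Invalid $option_name regexp '$pattern': $@"
            unless defined $regexp;
        push @compiled, $regexp;
    }
    return \@compiled;
}

# Compile once at startup (example patterns are made up)...
my $exclude_docs = compile_patterns(
    '--exclude-docs',
    '^http://example\.org/private/',
    '\.pdf$',
);

# ...then reuse the compiled regexps for every URI checked later.
sub is_excluded {
    my ($uri) = @_;
    return scalar grep { $uri =~ $_ } @$exclude_docs;
}
```

With this arrangement a malformed pattern stops the run before any link is fetched, and the per-URI check only matches against already compiled regexps.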
I think options that can be specified multiple times should be initialized to an empty array ([]) instead of undef, both for cleanliness and because that way there's no need to check whether they are defined later on.

I don't like the wildly varying separator characters in option values (->, :, #). Consistency would be better, and since the space character is already used in --masquerade, I suggest using space for the new options as well.

In addition to the --help output, bin/checklink.pod in CVS needs to be updated too.
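The two option-handling suggestions above could look roughly like the following sketch. It uses Getopt::Long; the %opts layout, the parse_args helper, and the --example-pair option are hypothetical stand-ins for illustration and are not options from the patch or from checklink.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical option storage, for illustration only. Repeatable options
# start out as empty array refs, so later code can loop over them without
# any defined() checks.
my %opts = (
    exclude_docs => [],
    example_pair => [],    # hypothetical "FROM TO" option, space-separated
);

sub parse_args {
    my (@args) = @_;
    GetOptionsFromArray(
        \@args,
        'exclude-docs=s@' => $opts{exclude_docs},
        'example-pair=s@' => $opts{example_pair},
    ) or die "Bad command line\n";

    # Split each value on whitespace, using the same separator for every
    # two-part option instead of a mix of "->", ":" and "#".
    my @pairs;
    for my $value (@{ $opts{example_pair} }) {
        my ($from, $to) = split ' ', $value, 2;
        die "--example-pair needs two space-separated values\n"
            unless defined $to;
        push @pairs, [ $from, $to ];
    }
    return \@pairs;
}
```

A call such as parse_args('--example-pair', 'a b', '--exclude-docs', 'x') would leave one value in $opts{exclude_docs} and return a single from/to pair; when an option is never given, its array simply stays empty.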
Received on Friday, 17 October 2008 17:38:54 UTC