- From: Michael Ernst <mernst@csail.mit.edu>
- Date: Sun, 8 Feb 2004 14:05:08 -0500
- To: www-validator@w3.org
checklink.pl lacks the ability to ignore parts of a web hierarchy.
Ignoring everything under a certain URL can be desirable when it
contains a large or infinite number of pages.  (As an example of the
latter, consider dynamically generated pages that link to other
dynamically generated pages.)  Using the --depth argument is a partial
workaround, but sometimes I wish to check every link under a hierarchy,
without seeing any reports for a certain portion of it.

The below patch adds this functionality via an --omit option to
checklink.pl.

                    -Michael Ernst
                     mernst@csail.mit.edu

cd ~/bin/share/
diff -u -b -r /g2/users/mernst/bin/share/checklink.pl-orig /g2/users/mernst/bin/share/checklink.pl
--- /g2/users/mernst/bin/share/checklink.pl-orig	Fri Feb 6 11:54:10 2004
+++ /g2/users/mernst/bin/share/checklink.pl	Sun Feb 8 08:57:59 2004
@@ -165,6 +165,7 @@
     User          => undef,
     Password      => undef,
     Base_Location => '.',
+    Omit_Location => undef,
     Masquerade    => 0,
     Masquerade_From => '',
     Masquerade_To => '',
@@ -356,6 +357,7 @@
              'r|recursive' => sub { $Opts{Depth} = -1
                                       if $Opts{Depth} == 0; },
              'l|location=s' => \$Opts{Base_Location},
+             'o|omit=s' => \$Opts{Omit_Location},
              'u|user=s' => \$Opts{User},
              'p|password=s' => \$Opts{Password},
              't|timeout=i' => \$Opts{Timeout},
@@ -414,6 +416,8 @@
                     By default, for example for
                     http://www.w3.org/TR/html4/Overview.html
                     it would be http://www.w3.org/TR/html4/
+ -o/--omit regexp   Do not check pages whose url matches the perl
+                    regexp.
  -n/--noacclanguage Do not send an Accept-Language header.
  -L/--languages     Languages accepted$langs.
  -q/--quiet         No output if no errors are found. Implies -s.
@@ -792,6 +796,8 @@
   return undef if ($current eq $rel); # Relative path not possible?
   return undef if ($rel =~ m|^(\.\.)?/|); # Relative path starts with ../ or /?
+  return undef if (defined($Opts{Omit_Location})
+                   && ($current =~ m/$Opts{Omit_Location}/));
   return 1;
 }
@@ -2165,6 +2171,11 @@
 L<http://www.w3.org/TR/html4/Overview.html> for example, it would be
 L<http://www.w3.org/TR/html4/>.
 
+=item B<-o, --omit regexp>
+
+Perl regexp for URLs of documents that should not be checked, even
+if they would otherwise be within scope.
+
 =item B<-n, --noacclanguage>
 
 Do not send an Accept-Language header.

Diff finished at Sun Feb 8 09:10:12
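
To illustrate the scope test the patch adds, here is a minimal standalone
sketch of the same check. The helper name is_omitted and the example.org
URLs are made up for illustration and are not part of checklink.pl; the
extra guard against an empty pattern is an addition that the patch itself
does not have.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in checklink.pl): true when $url matches the
# user-supplied omit pattern and should therefore be left unchecked.
sub is_omitted {
    my ($url, $omit) = @_;
    # No pattern given: nothing is omitted. The length test is an extra
    # guard, since Perl treats an empty pattern specially (it reuses the
    # last successfully matched regexp).
    return 0 unless defined $omit && length $omit;
    # Unanchored match against the whole URL, as in the patch.
    return $url =~ m/$omit/ ? 1 : 0;
}

# Illustrative pattern and URLs only.
my $omit = '^http://www\.example\.org/cgi-bin/';
for my $url ('http://www.example.org/docs/index.html',
             'http://www.example.org/cgi-bin/calendar?month=2') {
    printf "%-50s %s\n", $url, is_omitted($url, $omit) ? 'omitted' : 'checked';
}
```

Because the match is unanchored, a pattern such as cgi-bin would omit any
URL containing that string; anchoring with ^ restricts it to one subtree.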
Received on Sunday, 8 February 2004 15:13:52 UTC