During recursive checking, the current checklink.pl fails to check the
links of pages that were previously parsed for anchors only.

To reproduce the problem, create the following two files:

test.html
<html><head><title>Test</title></head>
<body><p>Referring to <a href="test1.html#A">item A</a>
on <a href="test1.html">test1</a> page.</body></html>

test1.html
<html><head><title>Test1</title></head>
<body><p><a name="A">Item A</a>:
<a href="NonExisting.html">invalid link</a>.</body></html>

Then run "checklink -r http://.../test.html" and note that the broken
link to the NonExisting.html document is not reported.

By the way, I am using checklink.pl,v 2.89.2.1 2002/07/07 21:54:55.

I can work around the problem with the hack given below, but I do not
think it qualifies as a proper patch.

Best regards,
Luc Van Eycken

--- checklink.pl.orig   2002-08-14 10:47:54.000000000 +0200
+++ checklink.pl        2002-08-19 17:38:55.000000000 +0200
@@ -846,7 +846,9 @@
 
     my $p;
 
-    if (defined($results{$uri}{parsing})) {
+    if (defined($results{$uri}{parsing})
+        && (defined($results{$uri}{parsing}{Links})
+            || !($links || $rec_needs_links))) {
         # We have already done the job. Woohoo!
         $p->{base} = $results{$uri}{parsing}{base};
         $p->{Anchors} = $results{$uri}{parsing}{Anchors};
@@ -879,6 +881,13 @@
 
     $p->parse($document);
 
+    # Make sure we know that links are searched for:
+    # create an empty Links hash
+    if (!$p->{only_anchors} && !defined($p->{Links})) {
+        $p->{Links} = {'1' => 2};
+        delete $p->{Links}{'1'};
+    }
+
     if (! $_summary) {
         my $stop = &get_timestamp();
         if ($_progress) {

Received on Monday, 19 August 2002 11:55:50 UTC