- From: Luc Van Eycken <Luc.VanEycken@esat.kuleuven.ac.be>
- Date: Mon, 19 Aug 2002 11:52:15 -0400 (EDT)
- To: www-validator@w3.org
The current checklink.pl forgets, during recursive checking, to check the
links of pages that were previously parsed for anchors only.

To reproduce the problem, create the following two files:

test.html:

<html><head><title>Test</title></head>
<body><p>Referring to <a href="test1.html#A">item A</a> on
<a href="test1.html">test1</a> page.</body></html>

test1.html:

<html><head><title>Test1</title></head>
<body><p><a name="A">Item A</a>:
<a href="NonExisting.html">invalid link</a>.</body></html>

Then run "checklink -r http://.../test.html" and note that the broken link
to the NonExisting.html document is not reported.

By the way, I am using checklink.pl,v 2.89.2.1 2002/07/07 21:54:55.

I can solve the problem with the hack given below, but I do not think it
qualifies as a proper patch.

Best regards,
Luc Van Eycken

--- checklink.pl.orig	2002-08-14 10:47:54.000000000 +0200
+++ checklink.pl	2002-08-19 17:38:55.000000000 +0200
@@ -846,7 +846,9 @@
 
     my $p;
 
-    if (defined($results{$uri}{parsing})) {
+    if (defined($results{$uri}{parsing})
+        && (defined($results{$uri}{parsing}{Links})
+            || !($links || $rec_needs_links))) {
         # We have already done the job. Woohoo!
         $p->{base} = $results{$uri}{parsing}{base};
         $p->{Anchors} = $results{$uri}{parsing}{Anchors};
@@ -879,6 +881,13 @@
 
     $p->parse($document);
 
+    # Make sure we know that links are searched for:
+    # create an empty Links hash
+    if (!$p->{only_anchors} && !defined($p->{Links})) {
+        $p->{Links} = {'1' => 2};
+        delete $p->{Links}{'1'};
+    }
+
     if (! $_summary) {
         my $stop = &get_timestamp();
         if ($_progress) {
Received on Monday, 19 August 2002 11:55:50 UTC