- From: Joel Schroeder <jschroeder@freehandsystems.com>
- Date: Thu, 20 Feb 2003 13:13:34 -0800
- To: www-validator@w3.org
I am using checklink.pl from the command line and I've found that whether it finds bad links can depend on the order the HTML files are listed as arguments. (It may be indicative of a more significant shortcoming.) Although I haven't looked at the code, I'm guessing the behavior has to do with the way checklink seems to avoid re-parsing files unnecessarily.

I've made a toy example to demonstrate the behavior. Notice below that lines [3] and [4] give identical results, as I would expect. However, lines [5] and [6] give different results, although they differ only in the order in which the HTML files are passed to checklink.

I hope this is worth your time,
Joel Schroeder

============BEGIN EXAMPLE============
[0]$ ls
anchor.htm  checklink.pl  link_name.htm  link_no_name.htm
[1]$ diff link_no_name.htm link_name.htm
5c5
< <A href="anchor.htm"></A>
---
> <A href="anchor.htm#AAA"></A>
[2]$ cat anchor.htm
<HTML>
<HEAD></HEAD>
<BODY>
<A name="AAA"></A>
<IMG src="a.png">
</BODY>
</HTML>
[3]$ ./checklink.pl link_no_name.htm anchor.htm | grep "Fix"
To do: The link is broken. Fix it NOW!
[4]$ ./checklink.pl anchor.htm link_no_name.htm | grep "Fix"
To do: The link is broken. Fix it NOW!
[5]$ ./checklink.pl link_name.htm anchor.htm | grep "Fix"
[6]$ ./checklink.pl anchor.htm link_name.htm | grep "Fix"
To do: The link is broken. Fix it NOW!
[7]$ cat link_no_name.htm
<HTML>
<HEAD></HEAD>
<BODY>
<A href="anchor.htm"></A>
</BODY>
</HTML>
[8]$ cat link_name.htm
<HTML>
<HEAD></HEAD>
<BODY>
<A href="anchor.htm#AAA"></A>
</BODY>
</HTML>
[9]$
============END EXAMPLE============
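
The manual comparison in steps [5] and [6] could be automated with a small shell helper. This is only a sketch: order_check is a hypothetical function, not part of checklink, and it assumes (as in the transcript) that each reported broken link produces one output line containing "Fix".

```shell
#!/bin/sh
# Hypothetical helper: run a checker on two files in both argument orders
# and compare how many output lines contain "Fix".
# Example call (assumes the files from the transcript are present):
#   order_check ./checklink.pl link_name.htm anchor.htm
order_check() {
    checker=$1
    first=$2
    second=$3
    # grep -c prints the count but exits non-zero when the count is 0,
    # so "|| true" keeps the assignment from failing under "set -e".
    forward=$("$checker" "$first" "$second" | grep -c "Fix" || true)
    reverse=$("$checker" "$second" "$first" | grep -c "Fix" || true)
    if [ "$forward" -eq "$reverse" ]; then
        echo "consistent: $forward broken link(s) in either order"
    else
        echo "order-dependent: $forward vs $reverse broken link(s)"
        return 1
    fi
}
```

If the behavior shown in the transcript is reproducible, calling the helper on link_name.htm and anchor.htm should take the order-dependent branch and return a non-zero status, which makes it easy to use in a regression script.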
Received on Thursday, 20 February 2003 16:14:01 UTC