- From: Joel Schroeder <jschroeder@freehandsystems.com>
- Date: Thu, 20 Feb 2003 13:13:34 -0800
- To: www-validator@w3.org
I am using checklink.pl from the command line and I've found that
whether it finds bad links can depend on the order in which the HTML files are
listed as arguments. (It may be indicative of a more significant
shortcoming.) Although I haven't looked at the code, I'm guessing the
behavior has to do with the way checklink seems to avoid re-parsing
files unnecessarily.

I've made a toy example to demonstrate the behavior. Notice below that
lines [3] and [4] give identical results, as I would expect. However,
lines [5] and [6] give different results ([5] reports nothing, while [6]
reports a broken link), although they differ only in the order in which
the HTML files are passed to checklink.

I hope this is worth your time,
Joel Schroeder
============BEGIN EXAMPLE============
[0]$ ls
anchor.htm checklink.pl link_name.htm link_no_name.htm
[1]$ diff link_no_name.htm link_name.htm
5c5
< <A href="anchor.htm"></A>
---
> <A href="anchor.htm#AAA"></A>
[2]$ cat anchor.htm
<HTML>
<HEAD></HEAD>
<BODY>
<A name="AAA"></A>
<IMG src="a.png">
</BODY>
</HTML>
[3]$ ./checklink.pl link_no_name.htm anchor.htm | grep "Fix"
To do: The link is broken. Fix it NOW!
[4]$ ./checklink.pl anchor.htm link_no_name.htm | grep "Fix"
To do: The link is broken. Fix it NOW!
[5]$ ./checklink.pl link_name.htm anchor.htm | grep "Fix"
[6]$ ./checklink.pl anchor.htm link_name.htm | grep "Fix"
To do: The link is broken. Fix it NOW!
[7]$ cat link_no_name.htm
<HTML>
<HEAD></HEAD>
<BODY>
<A href="anchor.htm"></A>
</BODY>
</HTML>
[8]$ cat link_name.htm
<HTML>
<HEAD></HEAD>
<BODY>
<A href="anchor.htm#AAA"></A>
</BODY>
</HTML>
[9]$
============END EXAMPLE============
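For anyone who wants to repeat the comparison, here is a minimal sketch of a shell helper that recreates the three toy files as printed above and runs a checker over two files in both argument orders. The function names (make_fixtures, count_fix, compare_orders) are made up for illustration and are not part of checklink. The sketch assumes only what the transcript shows: that the checker accepts file names as arguments and prints a line containing "Fix" for each broken link.

```shell
#!/bin/sh
# Hypothetical helpers for reproducing the order-dependence shown above.
# None of these names come from checklink itself.

# Recreate the three toy files, as printed in the transcript, in directory $1.
make_fixtures() {
    dir=$1
    cat > "$dir/anchor.htm" <<'EOF'
<HTML>
<HEAD></HEAD>
<BODY>
<A name="AAA"></A>
<IMG src="a.png">
</BODY>
</HTML>
EOF
    cat > "$dir/link_no_name.htm" <<'EOF'
<HTML>
<HEAD></HEAD>
<BODY>
<A href="anchor.htm"></A>
</BODY>
</HTML>
EOF
    cat > "$dir/link_name.htm" <<'EOF'
<HTML>
<HEAD></HEAD>
<BODY>
<A href="anchor.htm#AAA"></A>
</BODY>
</HTML>
EOF
}

# Print how many output lines contain "Fix".
# $1 is the checker command; the remaining arguments are passed to it in order.
count_fix() {
    checker=$1
    shift
    "$checker" "$@" 2>/dev/null | grep -c "Fix"
}

# Run the checker on files $2 and $3 in both orders and compare the counts.
# Returns 0 if the counts match, 1 if they differ.
compare_orders() {
    checker=$1
    a=$2
    b=$3
    forward=$(count_fix "$checker" "$a" "$b")
    reverse=$(count_fix "$checker" "$b" "$a")
    if [ "$forward" = "$reverse" ]; then
        echo "same: $forward"
    else
        echo "differ: $a $b -> $forward, $b $a -> $reverse"
        return 1
    fi
}
```

To use it, call make_fixtures on an empty directory, change into that directory so the relative links resolve, and run compare_orders with the path to checklink.pl followed by link_name.htm and anchor.htm. Comparing counts rather than full output is a deliberate simplification: it matches the grep "Fix" filter used in the transcript and ignores any differences in the order of report sections.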
Received on Thursday, 20 February 2003 16:14:01 UTC