Re: checklink: from Ville Skyttä on 2006-06-28 (www-validator@w3.org from June 2006)

From: Ville Skyttä <ville.skytta@iki.fi>
Date: Wed, 28 Jun 2006 19:44:20 +0300
To: "Browning, Glen J ERDC-ITL-MS Contractor" <Glen.J.Browning@erdc.usace.army.mil>
Cc: www-validator@w3.org
Message-Id: <1151513060.6570.77.camel@localhost.localdomain>

On Wed, 2006-06-28 at 10:49 -0500, Browning, Glen J ERDC-ITL-MS
Contractor wrote:
> I noticed recently while running a link check on my page, that it
> would frequently loop through sections of my site that it had already
> checked simply because it encountered another link to that section on
> a subsequent page.  As a result I received many pages of redundant
> output.  Things might run much more quickly if it kept track of those
> links it had already followed/checked and didn’t do them again.

It does exactly that: links it has already visited are not checked
again [0]. As you noticed, though, it does not filter multiple
instances of the same link from the output.

[0] Except when it has previously checked the link using the HEAD
    method and later encounters a link to the same document that
    contains a fragment identifier. In that case it checks the
    document again, this time retrieving it with the GET method so
    that it can verify that the fragment identifiers point to
    existing "anchors" in the doc.
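
To illustrate the behaviour described above, here is a minimal sketch in
Python. It is not checklink's actual code (checklink is written in Perl);
the class name LinkCache and the injected fetch(method, url) callable are
hypothetical, chosen only so the example runs without network access. It
assumes fetch returns the set of anchor names for a GET and None for a
HEAD.

```python
from urllib.parse import urldefrag


class LinkCache:
    """Hypothetical sketch: remember checked documents and upgrade HEAD to GET."""

    def __init__(self, fetch):
        # fetch(method, url) is an assumed callable: it returns a set of
        # anchor names for "GET" and None for "HEAD".
        self.fetch = fetch
        # Maps a URL (without fragment) to its anchors, or to None if the
        # document has only been checked with HEAD so far.
        self.anchors = {}

    def check(self, link):
        url, fragment = urldefrag(link)
        if url not in self.anchors:
            # First visit: HEAD is enough unless a fragment must be verified.
            method = "GET" if fragment else "HEAD"
            self.anchors[url] = self.fetch(method, url)
        elif fragment and self.anchors[url] is None:
            # Seen before via HEAD only; fetch the body once to learn anchors.
            self.anchors[url] = self.fetch("GET", url)
        if not fragment:
            return True
        return fragment in self.anchors[url]
```

With this structure, repeated links to the same document cost at most one
HEAD and one GET request, no matter how many pages refer to it. Reporting
each link only once in the output would be a separate step that this
sketch does not attempt.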

Received on Wednesday, 28 June 2006 16:44:28 UTC