Re: checklink: [SEC=UNOFFICIAL]

On 12 Oct 2011, at 01:56, Chia, Dave wrote:
> The link checker tends to use and check search functions, and comment functions when they are available on a website.

"Use"? Has the link checker acquired the ability to fill out forms while I wasn't looking? Or do you just mean that it follows links that happen to go to search result pages?

> This adds to a great deal of unnecessary checks on irrelevant pages. Shouldn’t the link checker identify the ‘real’ pages and just check the links on those pages?


Determining which links are "relevant" is a very difficult problem. First you would have to decide what constitutes a relevant link (opinions WILL differ), then come up with some kind of heuristic algorithm to determine which links go somewhere relevant and which do not.

The program does have the --exclude-docs switch, which lets you specify a regular expression that matches URLs you don't want to check. Authors testing their sites can use it to exclude comment and search pages, so long as the site has a semi-sane URI structure.
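
As a rough illustration of what such a regular expression might look like, here is a small Python sketch that tries a candidate pattern against a few sample URLs before handing it to the checker. The URL layout is purely an assumption (search under /search, comments under /comment/ or /comments/, or a replytocom query parameter), and the example.org addresses and the should_check helper are made up for the example; adjust the pattern to your own site.

```python
import re

# Hypothetical exclusion pattern. It assumes search pages live at /search
# and comment pages live under /comment/ or /comments/, or are reached via
# a "replytocom" query parameter. None of this comes from a real site.
EXCLUDE = re.compile(r"/search(\?|/|$)|/comments?/|[?&]replytocom=")


def should_check(url: str) -> bool:
    """Return True if the URL does NOT match the exclusion pattern."""
    return EXCLUDE.search(url) is None


if __name__ == "__main__":
    samples = [
        "http://example.org/about",
        "http://example.org/search?q=foo",
        "http://example.org/blog/post-1/comments/3",
        "http://example.org/blog/post-1?replytocom=5",
        "http://example.org/research/",
    ]
    for url in samples:
        verdict = "check" if should_check(url) else "skip"
        print(f"{verdict:5}  {url}")
```

The pattern avoids Python-specific syntax, so the same expression should be usable as the argument to --exclude-docs, but it is worth confirming the exact matching behaviour against the checklink documentation for your version.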

-- 
David Dorward
http://dorward.me.uk

Received on Wednesday, 12 October 2011 13:56:40 UTC