From: Ville Skyttä <ville.skytta@iki.fi>
Date: Tue, 24 Jul 2007 18:42:29 +0300
To: www-validator@w3.org
On Tuesday 24 July 2007, CLOSE Dave wrote:

> I'm trying to run checklink against an internal web site with several
> thousand pages. Every page includes a pair of links to outside URLs that
> cannot be accessed except through the company proxy. However, if the
> proxy is enabled, the internal pages cannot be accessed. So, when I run
> checklink with the proxy disabled, I get two errors for every page (and
> the process takes much longer than it should).

Have you tried setting the no_proxy environment variable? It takes a
comma-separated list of domains for which the proxy should not be used.
Something like this in the environment could work:

http_proxy=http://your.proxy.server/
https_proxy=http://your.proxy.server/
ftp_proxy=http://your.proxy.server/
no_proxy=your.intranet.domain

See the LWP::UserAgent documentation for env_proxy() for more information:
http://search.cpan.org/dist/libwww-perl/lib/LWP/UserAgent.pm#%24ua-%3Eenv_proxy

> I'm looking for a way to specify that some links should not be checked.

That has been implemented in the CVS version of the link checker and will
be in the next release. The option is called --exclude. CVS is available at
http://dev.w3.org/cvsweb/perl/modules/W3C/LinkChecker/

> A second issue arises if I try to parse the output of the validator with
> all these extraneous errors. The errors themselves are reported on
> separate lines from the link and page which caused them.

There's a related RFE (which would require that too) filed as
http://www.w3.org/Bugs/Public/show_bug.cgi?id=382 ; there is no ETA for its
implementation at the moment.

> Using grep to find the errors doesn't reveal the source of the problems.

grep's -A and -B options could help a bit with that.
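
To wrap that up with a concrete sketch: in a Bourne-style shell, the proxy
setup and a checklink run could look roughly like the lines below (the proxy
host and intranet domain are placeholders for your own setup, and the whole
thing is untested here):

  # Route external requests through the proxy, but not intranet ones.
  export http_proxy=http://your.proxy.server/
  export https_proxy=http://your.proxy.server/
  export ftp_proxy=http://your.proxy.server/
  export no_proxy=your.intranet.domain

  # Then run the link checker against the internal site as usual.
  checklink http://your.intranet.domain/ > report.txt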
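
And once you have a checkout of the CVS version, skipping those two external
links could look something like this; --exclude takes a regular expression
matched against the link URIs, but since the code is unreleased the details
may still change (example.com here just stands in for the external hosts):

  # Don't check links pointing at the external hosts behind the proxy.
  checklink --exclude 'http://(www\.)?example\.com/' \
      http://your.intranet.domain/ > report.txt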
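
Finally, as a quick illustration of the grep suggestion just above: asking
for a few lines of context around each match in a saved report could look
like this (the "Error" pattern is just a guess, adjust it to whatever the
relevant lines in your output actually say):

  # Show 2 lines before and 1 line after each matching line.
  grep -B 2 -A 1 'Error' report.txt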
Received on Tuesday, 24 July 2007 15:45:00 UTC