[Bug 4985] Link checker dies on links to particular url

http://www.w3.org/Bugs/Public/show_bug.cgi?id=4985


ville.skytta@iki.fi changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ot@w3.org




------- Comment #1 from ville.skytta@iki.fi  2007-09-02 08:49 -------
I can reproduce this locally - the error I get in the Apache error log is:

[Sun Sep 02 10:56:13 2007] [warn] [client 127.0.0.1] Timeout waiting for output
from CGI script /home/scop/cvs/w3c/LinkChecker/bin/checklink, referer:
http://localhost/checklink

However, I'm not sure what to do about this. It is Apache that gives up
waiting for output from checklink, which in my opinion is not entirely
checklink's fault or under its control.

I've moved the output of the initial HTTP headers so that they are written
before the first document is fetched. But if the timeout happens and Apache
gives up on our CGI, we'll still get missing results and no sane error message
(or incomplete results, if it kicks in later during the check), along with
invalid markup.
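
To illustrate the ordering described above, here is a minimal sketch in Python.
It is not the checklink code (which is Perl). The names run_check and
failing_check are made up for this example. It writes and flushes the headers
before any fetch starts, and closes the markup for failures the script itself
can catch. It cannot help once Apache has already given up on the CGI.

```python
import html
import io


def run_check(out, check):
    """Write headers first, then stream results, always closing the markup.

    `out` is any text stream (sys.stdout in a real CGI); `check` is a
    hypothetical callable that yields one result line per checked link.
    """
    # Headers and opening markup go out before any slow network work.
    out.write("Content-Type: text/html; charset=utf-8\r\n\r\n")
    out.write("<html><body>\n")
    out.flush()
    try:
        for line in check():
            out.write("<p>%s</p>\n" % html.escape(line))
            out.flush()  # push partial results to the server as they arrive
    except Exception as exc:
        # Only covers errors raised inside this process, not a kill by Apache.
        out.write("<p>Error: %s</p>\n" % html.escape(str(exc)))
    finally:
        out.write("</body></html>\n")
        out.flush()


def failing_check():
    """Stand-in for a check that reports one link and then times out."""
    yield "ok link"
    raise TimeoutError("connect: timeout")


if __name__ == "__main__":
    buf = io.StringIO()
    run_check(buf, failing_check)
    print(buf.getvalue())
```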

One thing worth looking into would be lowering the link checker's timeout.
Apache defaults to 300 seconds (although the default in my Fedora setup's
httpd.conf is 120 seconds) and the link checker uses 60 seconds by default.
That doesn't explain why I see the timeout, but on qa-dev.w3.org, for example,
the link checker reports a better error message, which means Apache didn't
kill it:

http://qa-dev.w3.org/wlc/checklink?uri=http%3A%2F%2Ffastcounter.bcentral.com%2Ffc-join&hide_type=all&depth=&check=Check
Error: 500 Can't connect to fastcounter.bcentral.com:80 (connect: timeout)
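
For reference, the timeout arithmetic can be sketched in Python as below. The
three constants are the values quoted above. The helper names and the idea of
several fetches timing out back to back with no output in between are
assumptions for illustration only, not a diagnosis of this bug. Exact boundary
behaviour on Apache's side is not modelled.

```python
APACHE_DEFAULT_TIMEOUT = 300  # seconds, Apache default quoted above
FEDORA_HTTPD_TIMEOUT = 120    # seconds, Fedora httpd.conf value quoted above
CHECKLINK_TIMEOUT = 60        # seconds, link checker default quoted above


def silent_gap(per_request_timeout, silent_requests):
    """Worst-case seconds without CGI output if that many fetches in a row
    each run into the per-request timeout (hypothetical scenario)."""
    return per_request_timeout * silent_requests


def server_gives_up(per_request_timeout, silent_requests, server_timeout):
    """True if the silent gap is longer than the server is willing to wait."""
    return silent_gap(per_request_timeout, silent_requests) > server_timeout


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, server_gives_up(CHECKLINK_TIMEOUT, n, FEDORA_HTTPD_TIMEOUT))
```

With these numbers, a single 60 second timeout stays under both server limits,
so a lower link checker timeout would mainly leave more headroom.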

Olivier, could you check what the httpd timeout is set to on the
validator.w3.org Apaches?  Any other ideas?

Received on Sunday, 2 September 2007 08:49:52 UTC