- From: <bugzilla@wiggum.w3.org>
- Date: Mon, 11 Oct 2004 22:13:49 +0000
- To: www-validator-cvs@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=893

ville.skytta@iki.fi changed:

           What      |Removed |Added
----------------------------------------------------------------------------
           Status    |NEW     |ASSIGNED

------- Additional Comments From ville.skytta@iki.fi 2004-10-11 22:13 -------

Right, the /robots.txt fetches should be cached, and as far as the low-level implementation (LWP::RobotUA) is concerned, they _are_ cached. But the current link checker codebase instantiates several W3C::UserAgent (a subclass of LWP::RobotUA) objects per link checker run, and the /robots.txt information cache is not shared between these instances by default; instead, each of them maintains its own small cache, effectively resulting in very little caching, if any :(

The real fix would be to instantiate exactly one W3C::UserAgent per link checker run and use it for fetching all links (unless we want to do parallel fetching sometime), but that is a very intrusive change and will most likely have to wait until the next major link checker version. However, I believe an interim solution is possible: manage a "global" WWW::RobotRules object ourselves and pass it to all instantiated UserAgents. I'll look into it.

------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.
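The interim fix described above amounts to one shared robots-rules cache consulted by every agent instance, so each host's /robots.txt is fetched and parsed at most once per run. A minimal Python sketch of that pattern, using the stdlib urllib.robotparser as a stand-in for WWW::RobotRules; the names SharedRobotsCache and fake_fetch are hypothetical and not part of the link checker:

```python
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

class SharedRobotsCache:
    """One cache per run; every fetcher instance shares it, mirroring
    the proposed single "global" rules object."""

    def __init__(self):
        self._rules = {}       # host -> parsed robots rules
        self.parse_count = 0   # how many times robots.txt was actually parsed

    def allowed(self, agent, url, fetch_robots):
        # fetch_robots(host) stands in for the real /robots.txt fetch.
        host = urlsplit(url).netloc
        if host not in self._rules:
            rp = RobotFileParser()
            rp.parse(fetch_robots(host).splitlines())
            self.parse_count += 1
            self._rules[host] = rp
        return self._rules[host].can_fetch(agent, url)

# Usage: three separate "agents" share one cache instead of each
# keeping its own, which is the situation the bug describes.
cache = SharedRobotsCache()

def fake_fetch(host):
    return "User-agent: *\nDisallow: /private/\n"

for _ in range(3):
    assert cache.allowed("checklink", "http://example.org/ok", fake_fetch)
    assert not cache.allowed("checklink", "http://example.org/private/x", fake_fetch)

print(cache.parse_count)   # -> 1: six permission checks, one robots.txt parse
```

Without the shared cache, each of the three agents would have parsed robots.txt itself, which is exactly the redundant fetching the comment complains about.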
Received on Monday, 11 October 2004 22:13:51 UTC