- From: <bugzilla@wiggum.w3.org>
- Date: Mon, 11 Oct 2004 22:13:49 +0000
- To: www-validator-cvs@w3.org
http://www.w3.org/Bugs/Public/show_bug.cgi?id=893
ville.skytta@iki.fi changed:
           What           |Removed      |Added
----------------------------------------------------------------------------
           Status         |NEW          |ASSIGNED
------- Additional Comments From ville.skytta@iki.fi 2004-10-11 22:13 -------
Right, the /robots.txt fetches should be cached, and as far as the low-level
implementation (LWP::RobotUA) is concerned, they _are_ cached.
But in the current link checker codebase we instantiate several
W3C::UserAgent (a subclass of LWP::RobotUA) objects per link checker run, and
the /robots.txt information cache is not shared between these instances by
default; instead, each of them maintains its own small cache, which in
practice results in very little caching, if any :(
The real fix would be to instantiate exactly one W3C::UserAgent per link checker
run and use that for fetching all links (unless we want to do parallel fetching
sometime), but that is a very intrusive change and will most likely have to wait
until the next major link checker version.
However, I believe it is possible to come up with an interim solution by
managing a "global" WWW::RobotRules object ourselves and passing that to all
instantiated UserAgents. I'll look into it.
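For illustration only, here is a minimal sketch (not the actual checklink code)
of what sharing a single WWW::RobotRules object between LWP::RobotUA instances
could look like; the helper name, agent string, and contact address below are
made up for the example:

    use strict;
    use warnings;
    use LWP::RobotUA;
    use WWW::RobotRules;

    # One rules object shared by every user agent created during the run,
    # so /robots.txt is fetched and parsed at most once per host.
    my $shared_rules = WWW::RobotRules->new('W3C-checklink (example)');

    # Hypothetical helper; in checklink this would wrap W3C::UserAgent.
    sub new_checker_ua {
        return LWP::RobotUA->new(
            agent => 'W3C-checklink (example)',
            from  => 'link-checker@example.org',  # placeholder contact
            rules => $shared_rules,               # the shared robots.txt cache
        );
    }

    my $ua1 = new_checker_ua();
    my $ua2 = new_checker_ua();
    # Both $ua1 and $ua2 now consult the same WWW::RobotRules instance.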
------- You are receiving this mail because: -------
You are the QA contact for the bug, or are watching the QA contact.