[Bug 893] cache (non) existence

http://www.w3.org/Bugs/Public/show_bug.cgi?id=893

ville.skytta@iki.fi changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED



------- Additional Comments From ville.skytta@iki.fi  2004-10-11 22:13 -------
Right, the /robots.txt fetches should be cached, and as far as the low-level
implementation (LWP::RobotUA) is concerned, they _are_ cached.

But in the current link checker codebase, we're instantiating several
W3C::UserAgent (a subclass of LWP::RobotUA) objects per link checker run, and
the /robots.txt information cache is not shared between these instances by
default; instead, every one of them maintains its own small cache, which in
practice results in very little caching, if any :(

The real fix would be to instantiate exactly one W3C::UserAgent per link checker
run and use that for fetching all links (unless we want to do parallel fetching
sometime), but that is a very intrusive change and will most likely have to wait
until the next major link checker version.

However, I believe it is possible to come up with an interim solution by
managing a "global" WWW::RobotRules object ourselves and passing that to all
instantiated UserAgents.  I'll look into it.
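
A minimal sketch of that interim approach, assuming a shared WWW::RobotRules
object handed to each agent at construction time; the agent name, contact
address, and helper below are placeholders, and plain LWP::RobotUA stands in
for W3C::UserAgent:

  #!/usr/bin/perl
  use strict;
  use warnings;
  use LWP::RobotUA;
  use WWW::RobotRules;

  # One "global" rules object for the whole link checker run; it holds the
  # parsed /robots.txt results for every host visited so far.
  my $rules = WWW::RobotRules->new('W3C-checklink');

  # Hypothetical helper: each new agent is constructed with the shared
  # rules object instead of letting LWP::RobotUA create a private one.
  sub new_agent {
      # Positional form: agent name, contact address, shared rules object.
      return LWP::RobotUA->new('W3C-checklink', 'webmaster@example.org', $rules);
  }

  my $ua1 = new_agent();
  my $ua2 = new_agent();
  # A /robots.txt fetched through $ua1 is now remembered by $ua2 as well,
  # because both consult and populate the same $rules cache.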




Received on Monday, 11 October 2004 22:13:51 UTC