
Re: Making Linkchecker working parallel

From: olivier Thereaux <ot@w3.org>
Date: Wed, 31 Aug 2005 19:06:41 +0900
Message-Id: <31C9CA1F-885D-4928-A788-4FB08056BE95@w3.org>
Cc: Dominique Hazaël-Massieux <dom@w3.org>
To: QA Dev <public-qa-dev@w3.org>


On 20 Jul 2005, at 21:18, olivier Thereaux wrote:
>
> The documentation for LWP::Parallel::* is a bit scattered, so after  
> giving it a further look, I found that there is a "middle ground"  
> between the rather awkward chunk-based callback subroutine, and the  
> slow-return batch wait().
>
>  http://search.cpan.org/~marclang/ParallelUserAgent-2.57/lib/LWP/Parallel.pm has:
>   # on_return gets called whenever a connection (or its callback)

What we have in CVS now is an (almost) functional solution based on  
wait(). Almost, because:

   $ua->register($request);
   my $entries = $ua->wait();

Works well as long as you don't get a 401. You get a bunch of  
HTTP::Response objects as a result, from which you can get your  
original request back (and therefore the URL you tried to check in  
the first place) with something like:

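   foreach (keys %$entries) {
       # each entry wraps the response for one registered request
       my $response = $entries->{$_}->response;
       my $uri = $response->request->url();
   }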

Except that this doesn't work in the case of a 401. Not sure why (a  
bug in LWP::Parallel::??) but $response->request is undefined in that  
case. So we're pretty much stuck with using on_return anyway.

...Which leads me to wonder where we're to store the info from  
on_return. At the moment the realms and results are passed around as  
hashes and arrays, but it would probably be better (and that's one goal  
of m12n, I suppose) to have them stored in an object. And my question  
is, are we going to use another class for that, or is it reasonable  
to stuff this into W3C::UserAgent?
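
To make the "another class" option concrete, here is a minimal sketch of a small container object for results and realms. The class name CheckResults and all of its methods are hypothetical, invented for illustration; nothing here exists in the checker or in LWP::Parallel. The assumption is that an on_return override would call record() with the URL of the request it is handed, which would sidestep the undefined $response->request seen on a 401.

```perl
use strict;
use warnings;

# Hypothetical container class: name and methods are illustrative only,
# not part of the link checker or of LWP::Parallel.
package CheckResults;

sub new {
    my ($class) = @_;
    return bless { results => {}, realms => {} }, $class;
}

# Record the outcome for one URI. Assumption: the caller (for example an
# on_return override) passes the URI taken from the request it was given.
sub record {
    my ($self, $uri, $code, $message) = @_;
    $self->{results}{$uri} = { code => $code, message => $message };
    return $self;
}

# Remember which authentication realm a URI asked for (e.g. after a 401).
sub add_realm {
    my ($self, $uri, $realm) = @_;
    $self->{realms}{$uri} = $realm;
    return $self;
}

sub result { my ($self, $uri) = @_; return $self->{results}{$uri}; }
sub realm  { my ($self, $uri) = @_; return $self->{realms}{$uri}; }
sub uris   { my ($self) = @_; return sort keys %{ $self->{results} }; }

package main;

# Usage with made-up example data.
my $store = CheckResults->new;
$store->record('http://www.example.org/', 200, 'OK');
$store->record('http://www.example.org/private', 401, 'Unauthorized');
$store->add_realm('http://www.example.org/private', 'Team space');

for my $uri ($store->uris) {
    my $r = $store->result($uri);
    print "$uri: $r->{code} $r->{message}\n";
}
```

Keeping this in its own class would leave the user agent responsible only for fetching. Folding the same two hashes into W3C::UserAgent would save one object at the cost of mixing transport and bookkeeping.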

Ville, any idea?
-- 
olivier
Received on Wednesday, 31 August 2005 10:06:52 GMT
