checklink: ideas for next version

Hi Ville, Michael, all.

Somewhere on my pile of “things I want to get done” there is a release  
of the link checker. Quite a few bugs have been fixed, and there was  
work on the UI, but I have been a little frustrated because some of
the complaints about the link checker remain unaddressed.

I think the #1 complaint is "it's slow". We know this comes from the
library we use to make the link checker a good web citizen (respecting
robots.txt and all): it waits at least 1 second between requests, and
we haven't managed to make it interact with the parallel agent module
yet. But those are all wonkish excuses, and they don't solve our
problem: the (perceived) slowness.

I wanted to get back to an experiment which Ville had worked on a  
while ago:
http://qa-dev.w3.org/~ville/ack/?url=http%3A%2F%2Fqa-dev.w3.org

I liked it quite a lot, but stumbled on the fact that it may not be  
very accessible, and that it forfeited a lot of the features of the  
existing link checker. A couple of days ago, I started looking at  
whether/how we could have the best of both worlds:

* keep most of the current checklink architecture, including the
ability to use it as a standalone, command-line tool
* during the link checking process, use DOM scripting to populate a
table listing the links being checked, and their results, in real time
* keep the summary output in "plain" HTML, accessible and all.
* the real time display of links being checked should help fight the  
perceived slowness (stuff happens on the page)


How to do that:

* code a js function to create an HTML id based on two URIs (the URI  
of the document checked and the URI of the link checked). That will be  
useful to identify table rows in the DOM

* add a js function to add a row to the DOM with a given ID

* add a js function to display the link in a given table cell. This
function will be called via a <script> element that the checker
outputs while a document is being parsed and links-to-be-checked are
discovered

* add a js function to display the result in a table cell. This
function will be called via a <script> element that the checker
outputs in real time as links get checked. The function will also
apply styles on the fly to the table row or cell
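
To make the four steps above more concrete, here is a rough sketch of
what the functions could look like. Nothing here exists in checklink
today: the names (encodeForId, makeRowId, addRow, showLink, showResult),
the "link-" prefix, the two-cell row layout and the CSS class argument
are all assumptions made for illustration. The id scheme simply
hex-escapes every character that is not a letter or digit, so that two
different pairs of URIs cannot end up with the same id.

```javascript
// Hypothetical helper: escape a URI so it only contains [A-Za-z0-9_].
// Every other character becomes "_" plus its 4-digit hex code unit.
function encodeForId(uri) {
  return uri.replace(/[^A-Za-z0-9]/g, function (ch) {
    return "_" + ("0000" + ch.charCodeAt(0).toString(16)).slice(-4);
  });
}

// Step 1: build an HTML id from the document URI and the link URI.
// The "link-" prefix (an assumption) makes the id start with a letter;
// "-" never appears in the escaped parts, so it is a safe separator.
function makeRowId(docUri, linkUri) {
  return "link-" + encodeForId(docUri) + "-" + encodeForId(linkUri);
}

// Step 2: append a row with the given id to the table.
// Assumed layout: cell 0 holds the link, cell 1 holds the result.
function addRow(table, id) {
  var row = table.insertRow(-1); // -1 appends at the end
  row.id = id;
  row.insertCell(-1);
  row.insertCell(-1);
  return row;
}

// Step 3: display a link as soon as it is discovered.
// Reuses the row if it already exists, otherwise creates it.
function showLink(table, docUri, linkUri) {
  var id = makeRowId(docUri, linkUri);
  var row = table.ownerDocument.getElementById(id) || addRow(table, id);
  row.cells[0].textContent = linkUri;
  return row;
}

// Step 4: display the result once the link has been checked,
// and style the row through a CSS class chosen by the caller.
function showResult(table, docUri, linkUri, statusText, cssClass) {
  var row = showLink(table, docUri, linkUri);
  row.cells[1].textContent = statusText;
  row.className = cssClass;
}
```

With something like this in place, the checker would only need to
output small <script> elements calling showLink(...) and
showResult(...) as it goes, and the summary could stay in plain HTML
as before.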

Sounds easy enough. Worth a try? Anyone interested in working on this  
with me?

olivier
-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier
W3C Open Source Software : http://www.w3.org/Status

Received on Tuesday, 16 December 2008 17:14:56 UTC