- From: olivier Thereaux <ot@w3.org>
- Date: Fri, 2 Jan 2009 16:28:11 -0500
- To: Ville Skyttä <ville.skytta@iki.fi>
- Cc: www-validator@w3.org
Hi Ville, Hi all.

On 1-Jan-09, at 4:53 PM, Ville Skyttä wrote:
> Anyway, I'm cutting a lot of your plan here with just saying it sounds fine to
> me. However, I think the overdue version 4.4 should really be released
> before going wild with implementing the new plans in CVS (or alternatively,
> do the new work on a branch). I'm not aware of any release blockers at the
> moment, do you remember any?

No big blocker as far as I can remember, and seeing as we don't yet have anything in CVS for the new ideas, now seems like a good time for a release. I'd like to take a couple of hours next week to go through some testing and a run through Bugzilla, and perhaps make a couple of minor UI tweaks (plus adding a mention of the validators sponsorship/donation program), and we're good to go.

http://rt.cpan.org/Public/Dist/Display.html?Name=W3C-LinkChecker
http://www.w3.org/Bugs/Public/buglist.cgi?query_format=advanced&product=LinkChecker&order=bugs.bug_status

> Regarding link checker future, I'd personally actually prefer
> redesigning/rewriting much of the current code rather than continuing too
> long with the current implementation. The script is quite a monster already
> and requires quite intimate knowledge to maintain/contribute to - cleaner
> codebase and proper separation of concerns would make many things much easier
> and could attract more contributors. And perhaps while at it, consider
> changing the implementation to e.g. Java or Python.

Refactoring would obviously be good, and we have indeed talked before of making better use of modular code (which would result in refactoring).

Changing language is an interesting question. On the one hand, Perl is not necessarily the most popular language on the block today, and switching to PHP or Python or Ruby would perhaps be a better incentive for today's web developers to participate in the project.
On the other hand, the link checker does rely a fair bit on a number of Perl libraries, so any change should make sure we wouldn't have to reimplement all those. I know Python would fare decently there, with urllib(2) for fetching, BeautifulSoup or html5lib for parsing, and robotparser (or Philip's rewrite - http://nikitathespider.com/python/rerp/ ) for the robots.txt part.

On the other-other hand (if you are an octopus) we are limited in some ways by those Perl libraries: the RobotUA module is why we can't have a wait time under 1s between links, integration of parallelUA has been a hurdle no one has cleared, etc. Switching to another language might get us out of these issues (and create others).

-- 
olivier
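
As a rough illustration of the Python route discussed above, here is a minimal sketch that uses only today's Python 3 standard library: `html.parser` stands in for BeautifulSoup/html5lib, and `urllib.robotparser` is the Python 3 name of the `robotparser` module. The names `LinkExtractor`, `extract_links`, `allowed_links` and the `ExampleLinkChecker` user agent are hypothetical, chosen for this example only; it is not how the link checker is implemented, and it leaves fetching (which `urllib.request` could handle) out entirely.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.robotparser import RobotFileParser


class LinkExtractor(HTMLParser):
    """Collect absolute URLs found in href and src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative references against the document URL.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the links of an HTML document, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def allowed_links(links, robots_lines, user_agent="ExampleLinkChecker"):
    """Keep only the links that the given robots.txt lines allow.

    Assumption: one set of rules is applied to every link. A real
    checker would load a separate robots.txt for each host.
    """
    rules = RobotFileParser()
    rules.parse(robots_lines)
    return [url for url in links if rules.can_fetch(user_agent, url)]
```

For example, calling `extract_links(page_source, page_url)` and passing the result to `allowed_links(...)` together with the lines of the site's robots.txt gives the list of URLs a polite checker could go on to request. Any delay between requests would be up to the caller, so a sub-second wait is a design choice here rather than a library limit.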
Received on Friday, 2 January 2009 21:28:21 UTC