- From: Mark Stanton <mark@gruden.com>
- Date: Fri, 20 Jun 2003 10:03:37 +1000
- To: "Html-Tidy@W3. Org" <html-tidy@w3.org>
- Cc: "Ian Stacey" <ian.stacey@orange.net>
Hey Ian

Yeah, I've put together a little spider + validator (using lynx, libwww & tidy) for testing new sites. It's pretty rough at the moment but does the job for our in-house stuff (picks up dead links, dodgy mark-up, etc.).

I'm spidering the site for links with libwww and building a site map, then grabbing the actual HTML with lynx before running the files through tidy. No files are modified; I just print a tidy report to the screen.

I'm calling tidy with a straight cfexecute:

<cfexecute name="#tidyExePath##tidyExeFile# -access 0 -f #tidyCachePath#\#name#.log -config #tidyExePath##tidyCfgFile# #lynxCachePath#\#name#" timeout="5"></cfexecute>

cfexecute is kind of dodgy in CFMX (timeouts don't work properly), so I'm going to be looking at doing it at a Java level shortly (with JTidy). I know of another company who has this set up and it seems to work for them.

If you like I can shoot through the spider code, but be warned it's fairly rough round the edges.

Cheers

Mark

______________
Mark Stanton
Web Production
Gruden Pty Ltd
Tel: 9956 6388
Mob: 0410 458 201
Fax: 9956 8433
www.gruden.com
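
For anyone wanting to try the "Java level" route with a working timeout, here is a minimal sketch. It is not the spider code from this message and it does not use JTidy; it simply launches the same external tidy executable with the same flags as the cfexecute call (-access 0, -f, -config) and enforces the timeout from Java. The class and method names (TidyRunner, buildTidyCommand, runWithTimeout) and the file paths are hypothetical, and Java 9 or later is assumed.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of tidy, JTidy, or ColdFusion.
class TidyRunner {

    // Builds the same argument list as the cfexecute call:
    // tidy -access 0 -f <log> -config <cfg> <html>
    static List<String> buildTidyCommand(Path tidyExe, Path config, Path log, Path html) {
        return List.of(
            tidyExe.toString(),
            "-access", "0",
            "-f", log.toString(),
            "-config", config.toString(),
            html.toString());
    }

    // Runs the command and returns true if it finished within the timeout.
    // On timeout the process is killed and false is returned.
    static boolean runWithTimeout(List<String> command, long timeoutSeconds)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
            .redirectErrorStream(true)
            // tidy writes the cleaned markup to stdout; discard it so the
            // pipe buffer cannot fill up and stall the child process.
            .redirectOutput(ProcessBuilder.Redirect.DISCARD)
            .start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            return false;
        }
        return true;
    }
}
```

Usage would be along the lines of runWithTimeout(buildTidyCommand(exe, cfg, log, html), 5), then reading the log file written by -f to print the report. Files are not modified as long as the config does not enable in-place writing.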
Received on Thursday, 19 June 2003 20:09:13 UTC