W3C home > Mailing lists > Public > html-tidy@w3.org > July to September 2000

Re: Is there any way to get TIDY to recursively tidy up a tree of fil es?

From: Sebastian Lange <lange@cyperfection.de>
Date: Thu, 13 Jul 2000 08:50:02 +0200
Message-Id: <4.3.2.7.2.20000713083605.0219a4e0@pop3.cyperfection.de>
To: html-tidy@w3.org
At 20:14 12.07.2000 -0700, RickR@biztro.com wrote:
>

... well, he wrote nothing... ;-) but his question was: Is there any way to 
get TIDY to recursively tidy up a tree of files?

I am currently working on a perl script to convert certain HTML files into 
XML files, it basically is called like "./html2xml.pl 
source_path/to/html_files target_path/to/xml_files". The path to the HTML 
sources is processed recursively, the XML files are written to the target 
directory (without recursivle re-creating the directory structure, but that 
should be a minor modification only).

If I find some time next week, I'll have a go at it... for instance, you 
can take this piece of perl code and modify it to your needs....

push (@DirStack, $ARGV[0]);
$XMLDir = $ARGV[1];


while ($DirCursor = pop @DirStack) {
         opendir TheDir, $DirCursor;
                 @DirContent = readdir TheDir;
                 foreach $Filename (@DirContent) {
                         if (($Filename eq '.') || ($Filename eq '..')) {
                                 next;
                         }
                         $FilePath = $DirCursor.'/'.$Filename;
                         if (-d $FilePath) {
                                 push @DirStack, $FilePath;
                                 next;
                         }
                         if ($FilePath =~ /.html?$/i) {
                                 open (INFILE, "< $FilePath") or 
die("ERROR: \'$FilePath\' not openable for reading!\n");
                                         $FILE = join("", <INFILE>);
                                 close (INFILE);

                                 $FILE = trim(retrieveVariables($FILE));

                                 #printVariablesToScreen();
                                 printXMLFile();
                                 print "Parsed file: " . $FilePath . " -> " 
. $XMLFilename . "\n";
                         }
                 }
         closedir TheDir;



>Rick Roth
>Biztro, Inc.
><mailto:rickr@biztro.com>rickr@biztro.com
>

--
Sebastian Lange
http://www.sl-chat.de/
Maybe the first chat site that validates as HTML
4.0 even though user input may contain HTML codes.

Courtesy to Dave Raggett's HTML Tidy:
http://www.w3.org/People/Raggett/tidy/
Received on Thursday, 13 July 2000 02:53:28 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Tuesday, 3 April 2012 06:13:44 GMT