- From: Charles Cave <cmc@sydney.unidata.oz.au>
- Date: Tue, 30 May 95 18:08:04 +1000
- To: www-html@www10.w3.org
I wanted to write a perl script to show all the HREF references in a document, mainly to assist me with some web design. I have only tested it on my *.htm files, so it is far from rigourously tested! Also, I solved the problem of ignoring the case of the search string...just add the letter i to the command. #!/usr/local/bin/perl foreach $file (<*.htm*>) { print "$file "; open(HTFILE, $file); $notitle = 1; while (<HTFILE>) { chop; if (/^\<TITLE\>/i) { s/<TITLE>//; s/<\/TITLE>//; print "($_)\n"; $notitle = 0; } if (/HREF=.*>/i) { $line = $_; $line =~ s/href/HREF/; $where = index($line, "HREF"); $part1 = substr($line, ($where+4), 99); $rapos = index($part1,">"); $part2 = substr($part1, 1, ($rapos-1)); print " --> $part2\n"; } } if ($notitle == 1) { print "\n" }; close(HTFILE); } Sample output (heavily edited) creative.htm (<title>Creativity Home Page</title>) --> "charter.htm" --> "whatsnew.htm" --> "cgroup1.htm" --> "cgroup2.htm" --> "cgroup3.htm" --> "suncrfaq.htm" --> "crquote.htm" --> "craffirm.htm" --> "famous.htm" --> "crbrain.htm" --> "cgroup8.htm" --> "crcredit.htm" --> "mailto:charles@jolt.mpx.com.au" crebooks.htm (Books related to Creativity) --> "mailto:charles@jolt.mpx.com.au" --> "AAdams.htm" --> B13159.htm crsware.htm (Creativity Software) --> #brainstorm --> #ckm --> #cm1 --> #cork --> #gcope --> #gsys --> #ide --> #ideplus --> #ifisher sylvan.htm (CREATIVITY</HEAD>) --> "http://www.shore.net/~quantum/" --> "#8" --> "#15" --> "#18" --> "#7" --> "#10" --> "#9" --> "#12" --> "#14" --> "#16" --> "#20" --> "#10" --> "#6" --> "#7" --> "#2" --> "#5" --> "#15" --> "#6" --> "#3" --> "#16" --> "#17" --> "mailto:psylvan@world.std.com" whatsnew.htm (What's new on the Creativity Web Pages) --> "crbrain.htm" --> "crebooks.htm" --> "creative.htm" --------------------------------------------------------------------------- Charles Cave ~ .-_|\ Phone +61 2 416 6877 Customer Services Manager / \ ~ Fax +61 2 416 2086 Unidata Australasia \.--._*<--- Level 2, 280 Pacific Hwy cmc@sydney.unidata.oz.au ~ v Lindfield, NSW, 2070 AUSTRALIA ---------------------------------------------------------------------------
Received on Tuesday, 30 May 1995 05:01:37 UTC