Perl script to print out TITLE lines

          

Problem:

   I wanted to produce a listing of TITLE lines for all
   HTML files in my current directory. 

Solution:

   A perl script to do the job, called htmrpt


#!/usr/local/bin/perl
#
# give a report of the TITLE lines from all *.htm files

foreach $file (<*.htm*>) {
  printf "%-20s ",  $file;
  open(HTFILE, $file);
  while (<HTFILE>) {
    chop;
    $notitle = 1;
    if (/^\<TITLE\>/) {
      s/<TITLE>//;
      s/<\/TITLE>//;
      print;
      print "\n";
      $notitle = 0;
      last;    # break out of this loop (file)
    }
    if (/^\<title\>/) {          
      s/<title>//;       # There must be an easier way to
      s/<\/title>//;    # solve this almost duplicate block of code!
      print;            # I am just a perl newbie
      print "\n";
      $notitle = 0;
      last;    # break out of this loop (file)
    }
  }
  if ($notitle == 1) { print "*** No title found ***\n" };
  close(HTFILE);
}


The following script htmrpt2.pl prints the Title Line first
so the script should be piped into sort for an alphabetical
listing by Title


#!/usr/local/bin/perl

foreach $file (<*.htm*>) {
  open(HTFILE, $file);
  while (<HTFILE>) {
    chop;
    $notitle = 1;
    if (/^\<TITLE\>/) {
      s/<TITLE>//;
      s/<\/TITLE>//;
      print;
      $notitle = 0;
      last;    # break out of this loop (file)
    }
    if (/^\<title\>/) {
      s/<title>//;
      s/<\/title>//;
      print;
      $notitle = 0;
      last;    # break out of this loop (file)
    }
  }
  if ($notitle == 1) { print "*** No title found ***  " };
  close(HTFILE);
  print "  ($file)\n";
}




---------------------------------------------------------------------------
    Charles Cave             ~  .-_|\        Phone +61 2 416 6877
Customer Services Manager      /     \ ~     Fax   +61 2 416 2086
   Unidata Australasia         \.--._*<---   Level 2, 280 Pacific Hwy
cmc@sydney.unidata.oz.au      ~     v        Lindfield, NSW, 2070 AUSTRALIA
---------------------------------------------------------------------------

Received on Tuesday, 30 May 1995 04:02:01 UTC