forwarded message from emv@cic.net

Jean-Francois Groff (jfg@bernd.cern.ch)
Tue, 4 Feb 92 10:07:25 -2300


Date: Tue, 4 Feb 92 10:07:25 -2300
From: jfg@bernd.cern.ch (Jean-Francois Groff)
Message-Id: <9202050907.AA12890@bernd.cern.ch>
To: www-talk@nxoc01.cern.ch
Subject: forwarded message from emv@cic.net

------- Start of forwarded message -------
Received: from cernvax.cern.ch by bernd.cern.ch (AIX 3.1/UCB 5.61/4.03)
          id AA13075; Tue, 4 Feb 92 00:59:29 -2300
Received: by cernvax.cern.ch (5.57/Ultrix2.0-B)
	id AA02837; Tue, 4 Feb 92 00:59:58 +0100
Received: by dxmint.cern.ch (cernvax) (5.57/3.14)
	id AA24681; Tue, 4 Feb 92 00:59:56 +0100
Received: from nic.cic.net by quake.think.com (4.1/SMI-4.0)
	id AA02414; Mon, 3 Feb 92 15:42:59 PST
Received: by nic.cic.net (4.1/SMI-4.1)
	id AA09725; Mon, 3 Feb 92 18:41:34 EST
Message-Id: <9202032341.AA09725@nic.cic.net>
In-Reply-To: Your message of "Mon, 20 Jan 92 10:19:51 +0100."
             <9201200919.AA05201@nxoc01.cern.ch>
From: emv@cic.net
To: timbl@nxoc01.cern.ch (Tim Berners-Lee)
Cc: jfg@cernvax.cern.ch, gopher@boombox.micro.umn.edu,
        wais-talk@quake.think.com
Subject: Re: using WWW to follow gopher links 
Date: Mon, 03 Feb 92 18:41:32 -0500

Tim,

Some more results of wais/www/gopher collaboration.

I have a new WAIS server running at wais.cic.net, called
"midwest-weather".  It's fed by loading in a bunch of weather reports
from a gopher at Minnesota every hour.  That system gets them from the
"weather underground" at Michigan using some hairy expect scripts; I
figured it'd be easier to get things out of gopher instead.

The script looks like:

WEATHER=gopher://mermaid.micro.umn.edu:150/00/Weather
www -n -np ${WEATHER}/Indiana/Fort%20Wayne | sed -e 's/.$//' > fort-wayne.in
www -n -np ${WEATHER}/Indiana/Indianapolis | sed -e 's/.$//' > indianapolis.in
www -n -np ${WEATHER}/Indiana/South%20Bend | sed -e 's/.$//' > south-bend.in
[...]

For some reason the gopher files are coming out of www with extra ^M's
on the end, as if they were DOS files; so the sed thing gets rid of them.
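
The sed expression in the script above drops the last character of
every line, whether or not that character is a ^M.  A minimal sketch of
a narrower filter, assuming a POSIX shell and sed; the function name
strip_cr is made up for illustration:

```shell
# Strip one trailing carriage return (^M) from each line on stdin.
# Lines that do not end in CR pass through unchanged.
strip_cr() {
    cr=$(printf '\r')          # a literal CR character
    sed -e "s/${cr}\$//"       # delete CR only when it ends the line
}
```

It would slot into the same pipeline in place of the sed call, e.g.
www -n -np ${WEATHER}/Indiana/Fort%20Wayne | strip_cr > fort-wayne.in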

I don't see a way to do this with just one invocation of www, so
instead it runs once for each file.
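
One way to keep the per-file invocations manageable is a loop over a
list of paths.  This is only a sketch: the helper names outfile_for and
fetch_all are hypothetical, and the output naming rule (lower case,
%20 turned into a hyphen, ".in" appended) is inferred from the three
file names in the script above.

```shell
WEATHER=gopher://mermaid.micro.umn.edu:150/00/Weather

# Map a path such as "Indiana/Fort%20Wayne" to "fort-wayne.in".
outfile_for() {
    basename "$1" | sed -e 's/%20/-/g' -e 's/$/.in/' \
        | tr '[:upper:]' '[:lower:]'
}

# Read one path per line on stdin; run www once for each of them.
fetch_all() {
    while IFS= read -r path; do
        # </dev/null keeps www from eating the rest of the path list
        www -n -np "${WEATHER}/${path}" </dev/null \
            | sed -e 's/.$//' > "$(outfile_for "$path")"
    done
}
```

Usage would be something like
printf '%s\n' Indiana/Fort%20Wayne Indiana/Indianapolis | fetch_all
which still starts www once per file, but keeps the list in one place.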

Neither gopher nor WWW has the notion of a "recursive directory
listing", either some complete overview of the structure of the system
or some skeleton outline.  (I realize it's arbitrarily hard to do so
since any link could point off anywhere else.)  That makes it tougher
to do an archie-style catalog.  I think it wouldn't be that hard to
build a tree-walker for gopher that prints out a list of the
directories on every system that it can find and also the text of all
of the stuff that's in the ".about" directories.  At the very least
I'm doing some of that by hand now (just a script like the one above)
& waising it so I have some clue what all is out there.  It's *not* a
replacement for the per-site indexes, but a cross-section.
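
A rough sketch of what such a tree-walker might look like, under
several assumptions.  It assumes gopher menus are tab-separated lines
(type character plus title, selector, host, port) ending in CRLF, with
type "1" marking a directory.  fetch_menu is a hypothetical helper that
must print the raw menu for a URL; how to obtain that is left open.
SEEN names a scratch file used to avoid revisiting links, and the depth
limit guards against links that point off anywhere else.  It lists
directories only and does not fetch any ".about" text.

```shell
# Read a gopher menu on stdin; print one URL per directory entry.
# Selectors are not URL-escaped here, so spaces would need more work.
menu_dirs() {
    awk -F '\t' '
        { sub(/\r$/, "") }                  # drop the CR of CRLF
        substr($1, 1, 1) == "1" {           # type "1" is a directory
            printf "gopher://%s:%s/1%s\n", $3, $4, $2
        }'
}

# Print every directory URL reachable from $1, at most $2 levels down.
# The body is a subshell, so url and depth are local to each call.
walk() (
    url=$1
    depth=$2
    grep -qxF "$url" "$SEEN" && exit 0      # already visited
    echo "$url" >> "$SEEN"
    echo "$url"
    [ "$depth" -gt 0 ] || exit 0            # depth limit reached
    fetch_menu "$url" </dev/null | menu_dirs |
    while IFS= read -r child; do
        walk "$child" $((depth - 1))
    done
)
```

With SEEN=$(mktemp) set and a real fetch_menu defined, a call such as
walk gopher://mermaid.micro.umn.edu:150/1 3 would print the directory
URLs it finds, ready to be loaded into a WAIS index.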

- --Ed



------- End of forwarded message -------