
webbot broken pipe

From: Q. Alex Zhao <azhao@cc.gatech.edu>
Date: Tue Jan 4 23:38:12 2000
Message-Id: <200001050438.XAA03187@gvu2.cc.gatech.edu>
To: www-lib@w3.org
Hi,

I'm trying to generate a list of all images on a web site using webbot, but
the program received a broken pipe signal and exited with code 141. The shell
script I was using looked like the following:

    listraw=/tmp/.makecollage.raw
    listerrs=/tmp/.makecollage.err
    listfile=/tmp/.makecollage.lst
    liststamp=/tmp/.makecollage.stp
    URL='http://www.cc.gatech.edu/gvu/'

    webbot -q -n -ss -cache -cache_size 64	\
    -exclude '^ftp:|/ai\.old/|/cogsci\.old/|/classes/|/data_files/|/faculty/|/ftp/|/general_images/|/gvu-old/|/home/|/linux/|/newhome/|/newvis/|/shortcut/|/space/|/tech_reports/|/ugrads/|/user_surveys/|\.gz$|\.tar$|\.tgz$|\.Z$|\.zip$|\.ZIP$|\.exe$|\.EXE$|\.ps$|\.PS$|\.doc$|\.DOC$|\.pdf$|\.PDF$|\.xplot$|\.java$|\.c$|\.h$|\.txt$|\.ppt$|\.PPT$'	\
    -check '\.gif$|\.GIF$|\.png$|\.PNG$|\.jpeg$|\.JPEG$|\.jpg$|\.JPG$' \
    -prefix "$URL" -depth 256	\
    -imgprefix "$URL" -img	\
    -redir -l "$listraw" "$URL" > "$listerrs" 2>&1
    result=$?
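
For reference, a shell reports a status above 128 when the command was
terminated by a signal, and the signal number is the status minus 128. That
makes 141 signal 13, which is SIGPIPE on most Unix systems. A minimal sketch of
decoding the value saved in $result, assuming a POSIX shell; describe_status is
a hypothetical helper name, not part of webbot or libwww:

```shell
# Hypothetical helper: explain an exit status as a POSIX shell reports it.
describe_status() {
    status=$1
    if [ "$status" -gt 128 ]; then
        # Statuses above 128 mean "terminated by signal (status - 128)".
        sig=$((status - 128))
        echo "killed by signal $sig"
    elif [ "$status" -eq 0 ]; then
        echo "success"
    else
        echo "exited with error $status"
    fi
}

# 141 - 128 = 13, the usual number for SIGPIPE.
describe_status 141
```

In the script above this would be called as describe_status "$result".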

I'm using the Jan 4 libwww snapshot. How do I prevent this problem? Has anybody
else seen it? Has anybody tried anything similar to what I'm doing on
a large web site?

Thanks.

Cheers.
= Q. Alex Zhao
  http://www.cc.gatech.edu/~qiang.a.zhao/
  mailto:azhao@cc.gatech.edu voiceto:404-894-9390 faxto:404-385-1253
  Graphics, Visualization & Usability Center, Georgia Inst. of Tech.
Received on Tuesday, 4 January 2000 23:38:12 GMT
