- From: Filippo Menczer <fil@cs.ucsd.edu>
- Date: Sat, 30 May 1998 20:47:34 -0400 (EDT)
- To: www-lib@w3.org
- Cc: fil@cs.ucsd.edu
Could someone help me figure out why this simple example is crashing? Basically I am calling the code in the "chunk" example multiple times by wrapping a loop around it. The first time it goes fine; the second causes a segmentation fault.

Configuration: Linux (on a Pentium II box), w3c-libwww-5.1m (statically linked).

Here is the code:

    #include "WWWLib.h"
    #include "WWWHTTP.h"
    #include "WWWInit.h"

    void mygetchunk (char * url)
    {
        HTRequest * request = HTRequest_new();
        HTChunk * chunk = NULL;

        HTProfile_newPreemptiveClient("TestApp", "1.0");
        WWWTRACE = SHOW_CORE_TRACE + SHOW_STREAM_TRACE + SHOW_PROTOCOL_TRACE;
        HTRequest_setOutputFormat(request, WWW_SOURCE);

        if (url) {
            char * cwd = HTGetCurrentDirectoryURL();
            char * absolute_url = HTParse(url, cwd, PARSE_ALL);
            chunk = HTLoadToChunk(absolute_url, request);
            HT_FREE(absolute_url);
            HT_FREE(cwd);
            printf("%s\n", chunk ? "OK-FIRST-TIME" : "NO DATA");
        }

        HTRequest_delete(request);
        HTProfile_delete();
    }

    int main (void)
    {
        mygetchunk("http://www.cs.ucsd.edu/~fil/agents");
        mygetchunk("http://www.cs.ucsd.edu/~fil/agents");  /* any URL here */
        return 0;
    }

Here are the output and the tail of the trace:

    % mychunk 2> mychunk.trace
    OK-FIRST-TIME
    30900 Segmentation fault (core dumped)
    %
    % tail mychunk.trace
    Net Object.. 0x80e0af8 created with hash 2
    Net Object.. starting request 0x80cf050 (retry=1) with net object 0x80e0af8
    HTTP........ Looking for `http://www.cs.ucsd.edu/~fil/agents'
    HTDoConnect. Looking up `www.cs.ucsd.edu'
    Host info... REUSING CHANNEL 0x80cf330
    Host info... Add Net 0x80e0af8 (request 0x80cf050) to pipe, 2 requests made, 1 requests in pipe, 0 pending
    HTHost...... No ActivateRequest callback handler registered
    Channel..... Semaphore increased to 1 for channel 0x80cf330
    HTTP........ Force flush on preemptive load
    StreamStack. Constructing stream stack for text/x-http to */*
    %

And here is the execution stack from gdb:

    Core was generated by `mychunk'.
    Program terminated with signal 11, Segmentation fault.
    540     while ((pres = (HTPresentation *) HTList_nextObject(cur))) {
    (gdb) where
    #0  0x80511bd in HTStreamStack (rep_in=0x80df588, rep_out=0x80c96f0,
        output_stream=0x80dfca8, request=0x80cf050, guess=1) at HTFormat.c:540
    #1  0x806cb0a in HTTPEvent (soc=-1, pVoid=0x80e0b68, type=HTEvent_BEGIN)
        at HTTP.c:1026
    #2  0x806c7c0 in HTLoadHTTP (soc=-1, request=0x80cf050) at HTTP.c:916
    #3  0x8056360 in HTNet_newClient (request=0x80cf050) at HTNet.c:732
    #4  0x804cc14 in HTLoad (me=0x80cf050, recursive=0 '\000') at HTReqMan.c:1575
    #5  0x8048ee7 in launch_request (request=0x80cf050, recursive=0 '\000')
        at HTAccess.c:75
    #6  0x804910b in HTLoadToChunk (
        url=0x80dfc80 "http://www.cs.ucsd.edu/~fil/agents", request=0x80cf050)
        at HTAccess.c:183
    #7  0x80481c3 in mygetchunk (
        url=0x80ae197 "http://www.cs.ucsd.edu/~fil/agents") at mychunk.c:19
    #8  0x804824a in main () at mychunk.c:32
    #9  0x80480ee in _start ()
    (gdb) print *0x80cf050
    $1 = 0

Is this the wrong way to go about it? All I really need from libwww is a simple way to GET the contents of a long sequence of URLs, each into memory, sequentially, and without any parsing. My own code will process the contents between calls to the library. The code in the command-line tool and the documentation are beyond my comprehension and the limited scope of my needs. (Ideally I would want to get the contents only if the pages are text/html or text/plain, but that and other things are the next steps.)

Any assistance would be greatly appreciated! Please cc: my email address, as I am not a subscriber of the list.

Thanks,
-Fil

====================================================
Filippo Menczer        http://www.cs.ucsd.edu/~fil/
fil@cs.ucsd.edu        CSE Dept., 0114
Lab: (619) 453-4364    U. C. San Diego
Fax: (619) 534-7029    La Jolla, CA 92093-0114, USA
====================================================
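P.S. In case it makes my intent clearer, here is a sketch of the structure I was hoping would work: set up the profile once for the whole process, fetch each URL into its own chunk inside the loop, and tear the profile down only at the end. This is only a guess at the intended usage of `HTProfile_newPreemptiveClient'/`HTProfile_delete' (I have not verified that they are meant to be called once per process rather than once per request):

```c
#include "WWWLib.h"
#include "WWWHTTP.h"
#include "WWWInit.h"

/* Fetch one URL into memory. Returns a chunk the caller must free
 * with HTChunk_delete(), or NULL on failure. */
static HTChunk * fetch_one (char * url)
{
    HTRequest * request = HTRequest_new();
    HTChunk * chunk = NULL;

    HTRequest_setOutputFormat(request, WWW_SOURCE);  /* raw source, no parsing */
    if (url) {
        char * cwd = HTGetCurrentDirectoryURL();
        char * absolute_url = HTParse(url, cwd, PARSE_ALL);
        chunk = HTLoadToChunk(absolute_url, request);
        HT_FREE(absolute_url);
        HT_FREE(cwd);
    }
    HTRequest_delete(request);       /* per-request object: delete per call */
    return chunk;
}

int main (void)
{
    char * urls[] = {
        "http://www.cs.ucsd.edu/~fil/agents",
        "http://www.cs.ucsd.edu/~fil/agents"   /* any URL here */
    };
    int i;

    /* Assumption: profile setup happens ONCE, outside the loop. */
    HTProfile_newPreemptiveClient("TestApp", "1.0");

    for (i = 0; i < 2; i++) {
        HTChunk * chunk = fetch_one(urls[i]);
        printf("%s\n", chunk ? "OK" : "NO DATA");
        /* ... process HTChunk_data(chunk) here ... */
        if (chunk) HTChunk_delete(chunk);
    }

    HTProfile_delete();              /* ONCE, at the very end */
    return 0;
}
```

If the profile really is per-process state, then deleting and re-creating it around every request (as in my code above) would be the wrong pattern; but I would appreciate confirmation from someone who knows the library.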
Received on Saturday, 30 May 1998 23:58:57 UTC