- From: Joel Young <jdy@godel.cs.brown.edu>
- Date: Fri, 15 Jun 2001 19:52:38 -0400
- To: www-lib@w3.org
- cc: jdy@cs.brown.edu
Since I seem to be talking to myself on this list, let me continue the conversation by myself.

Yes, Joel, that is a pretty clever adaptation of the other guy's work, but you need to be careful with what you are doing. Are you sure you want to add that conversion every time you make a request? Did you remember to remove it when it isn't needed? Probably not.

Well, if you want to do it for just one particular request, you could simply do:

  HTList* mylist = HTList_new();
  HTConversion_add(
      mylist, "text/html", "www/present",
      (HTConverter*) &HTMLPresentAndChunk,
      1.0, 0.0, 0.0);
  HTRequest_setConversion(request, mylist, YES);

instead of adding the conversion directly to the HTFormat_conversion list. Seems to work a little better.

****

It still gets -902 interrupted errors on long pages. The page seems to parse just fine and be complete in the chunk, but even so the error arises. Note that it doesn't arise when just using the HText callbacks without the above conversion.

As libwww doesn't seem to be actively maintained, nor do questions get answered much on this list, maybe it is time to give libcurl/libghttp and libxml2 a try. They seem quite functional and not quite so overly recherche (1970 I. Murdoch sense).

Joel

--------
From: Joel Young <jdy@godel.cs.brown.edu>
Date: Fri, 25 May 2001 17:14:36 -0400
To: www-lib@w3.org
Cc: jdy@cs.brown.edu
Subj: Re: Getting both a chunk and HText callbacks

I figured out a hack using a hint from the archive:

  http://lists.w3.org/Archives/Public/www-lib/msg00377.html

By adding a pointer to "chunk" as an element of the request context, the following "converter tee" (Maciej Puzio) does the trick.
//////
static HTStream* HTMLPresentAndChunk (
    HTRequest* request,
    void* param,
    HTFormat input_format,
    HTFormat output_format,
    HTStream* output_stream)
{
  requestcontext_t* context =
      reinterpret_cast<requestcontext_t*>(HTRequest_context(request));

  return HTTee(
      HTMLPresent(request,param,input_format,output_format,output_stream),
      HTStreamToChunk(request,&context->chunk,-1),0);
}
//////

When combined with

//////
HTRequest* request = HTRequest_new();
requestcontext_t* context = new requestcontext_t(this);
HTRequest_setContext(request,context);

HTNet_addAfter(&term_handler, 0, 0, HT_ALL, HT_FILTER_LAST);

HText_registerCDCallback(&RHText_new,&RHText_delete);
HText_registerTextCallback(&add_text);
HText_registerLinkCallback(&found_link);

HTAlert_setInteractive(NO);
HTHost_setEventTimeout(15000); // if can't load 15 secs, abort

HTAnchor* anchor = HTAnchor_findAddress(url);

HTConversion_add(
    HTFormat_conversion(), "text/html", "www/present",
    (HTConverter*)&HTMLPresentAndChunk,
    1.0, 0.0, 0.0);

HTLoadAnchor(anchor, request);
HTEventList_newLoop();
//////

and now context->chunk contains the page.

Does this make sense?

Why doesn't HTRequest_conversion(request) work instead of the HTFormat_conversion()?

Why doesn't RHText_delete ever get called? Since the "system" doesn't call it, where should I call it from?

How does one know that that is the correct place to do the HTConversion_add? Is there a better place?

I am still missing the "big picture" with this library.

Thanks,
Joel
jdy@cs.brown.edu

--------
From: Joel Young <jdy@godel.cs.brown.edu>
Date: Fri, 25 May 2001 15:30:24 -0400
To: www-lib@w3.org
Cc: jdy@cs.brown.edu
Subj: Getting both a chunk and HText callbacks

I am trying to use HTTee so that a single HTLoad both loads the web page into a chunk and calls my HText callbacks.
Here is the code I am using:

  HTRequest* request = HTRequest_new();

  HTNet_addAfter(&term_handler, 0, 0, HT_ALL, HT_FILTER_LAST);

  HTAlert_setInteractive(NO);
  HTHost_setEventTimeout(15000); // if can't load 15 secs, abort

  HTAnchor* anchor = HTAnchor_findAddress(url);
  HTRequest_setAnchor(request, anchor);

  HTChunk* chunkchunk = 0;
  // HTRequest_setOutputFormat(request,WWW_SOURCE);
  HTStream* chunkstream = HTStreamToChunk(request,&chunkchunk,-1);

  HText_registerCDCallback(&RHText_new,&RHText_delete);
  HText_registerTextCallback(&add_text);
  HText_registerLinkCallback(&found_link);

  HTStream* target = HTTee(chunkstream, HTRequest_outputStream(request),0);
  HTRequest_setOutputStream(request, target);

  HTLoad(request, NO);
  HTEventList_newLoop();

  std::cerr << HTChunk_size(chunkchunk) << std::endl;
  char* strchunk = HTChunk_toCString(chunkchunk);
  std::cerr << strchunk << std::endl;

Depending on whether the HTRequest_setOutputFormat line is commented out or not, I either get the HText callbacks or I get the chunk, but I can't seem to get both.

Any suggestions?

Joel
jdy@cs.brown.edu
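
For anyone who wants the tee idea on its own, here is a minimal sketch that does not use libwww at all. Sink, ChunkSink, CountingSink, TeeSink and feed are hypothetical names made up for illustration; they are not part of the library and do not reproduce its HTStream interface. The only point is that a tee hands every block of data to both of its children, so one child can accumulate the raw page while the other does whatever parsing it likes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for an output stream; NOT the real libwww HTStream.
struct Sink {
    virtual ~Sink() = default;
    virtual void put(const char* data, std::size_t len) = 0;
};

// Accumulates every byte it receives, playing the role of the chunk.
struct ChunkSink : Sink {
    std::string buffer;
    void put(const char* data, std::size_t len) override {
        buffer.append(data, len);
    }
};

// Counts bytes; stands in for a parser that would fire callbacks.
struct CountingSink : Sink {
    std::size_t bytes = 0;
    void put(const char*, std::size_t len) override {
        bytes += len;
    }
};

// Forwards each block to both children, in order.
struct TeeSink : Sink {
    Sink& first;
    Sink& second;
    TeeSink(Sink& a, Sink& b) : first(a), second(b) {}
    void put(const char* data, std::size_t len) override {
        first.put(data, len);
        second.put(data, len);
    }
};

// Convenience helper: push a whole string through a sink.
inline void feed(Sink& sink, const std::string& text) {
    sink.put(text.data(), text.size());
}
```

To try it, construct a ChunkSink and a CountingSink, wrap both in a TeeSink, and call feed on the tee; afterwards the ChunkSink holds the concatenated input and the CountingSink holds its length. Whether libwww's own tee behaves this simply in every error path is a separate question that this sketch does not answer.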
Received on Friday, 15 June 2001 19:52:39 UTC