RE: Request for thoughts on libwww

Kerry -

We used libwww for quite a while.  I finally had to remove it, for a variety
of reasons.

These are:

1) Difficult to use.  I found little to no real documentation on how to use
it, mostly just examples that sometimes worked but usually fell just short
of what I needed.  Every time I had to add functionality it was a struggle
to figure out which calls to make and which objects to use, and I usually
ended up searching deep in the source tree.

2) Pipelining requests.  No one on the list seems to agree with me (well,
maybe they do, but no one responded to my posts), but pipelining requests to
a load-balanced web server farm is a *BAD* idea.  Libwww does not honor a
max pipelined requests setting of 1 (in fact the code outright rejects it),
which did not make me feel good right from the start.  Since we had invested
so much time in libwww, we decided we could just "fix" this issue ourselves.
In the short run we did (by hacking the code), but in the long run we
didn't: there seems to be no way to say "ok, to this host, I want to
pipeline.  To this other host, I don't".  The host structures and setup are
all pretty murky.

3) Managing and setting up timeouts was a pain.  Based on the code I was
pretty sure I could call the global api to set timeouts at the right time to
make it do the right thing.  But this is pretty weird.  Say you are talking
to 10 hosts.  Doesn't it seem to make sense that you might want a global
default timeout, but then have the ability to modify that on an individual
basis?  Well, the only way to do it is to modify the global timeout at *just*
the right time, when the host object copies the global var to a local var.
(Hint: you have to do this from a callback.)

4) More on the timeouts:  the timeout code for posting data is broken.
Another tweak I had to add was to reset the timeout every time a chunk of
data was transferred.  Otherwise the timeout applies to the overall transfer
time, and this is no good if your sets of data are large (well, frankly it's
just plain incorrect.  You don't want to time out a server after 10 seconds
if every second you have successfully transferred, say, 10 KB).

5) If you use libwww for any great length of time you will notice
that it wants to control your program, not the other way around.  It should
not be called libwww, it should be called Appwww.  It is quite literally
its own application, and it considers the code/logic that you are writing
more or less a plugin.  Fighting this proved to be exhausting, and I finally
gave up.

6) We were running in a threaded environment.  That made it even more
difficult to manage libwww just right, though I do think I did it correctly.
It took a good solid week to get all the synchronization issues worked out.
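
To make the per-host pipelining and timeout points above concrete, here is
a rough C sketch of the kind of interface they are asking for: global
defaults, optional per-host overrides, and an idle timer that is reset
whenever a chunk of data moves.  None of this is libwww API.  Every name
(HostPolicy, PolicyTable, policy_for, can_send, IdleTimer and its helpers)
is hypothetical, and times are plain seconds passed in by the caller to keep
the sketch self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-host settings; none of these names are libwww API. */
typedef struct {
    const char *host;      /* hostname this entry applies to */
    int max_pipelined;     /* 1 = no pipelining; 0 = use the default */
    int idle_timeout_sec;  /* 0 = use the default */
} HostPolicy;

typedef struct {
    HostPolicy defaults;          /* global defaults (host field unused) */
    const HostPolicy *overrides;  /* per-host overrides */
    size_t n_overrides;
} PolicyTable;

/* Resolve the effective policy for a host: start from the defaults, then
   apply any non-zero fields of the first matching override.  Matching is
   an exact strcmp for simplicity; real hostnames are case-insensitive. */
HostPolicy policy_for(const PolicyTable *t, const char *host)
{
    assert(t != NULL && host != NULL);
    HostPolicy p = t->defaults;
    p.host = host;
    for (size_t i = 0; i < t->n_overrides; i++) {
        const HostPolicy *o = &t->overrides[i];
        if (strcmp(o->host, host) == 0) {
            if (o->max_pipelined > 0)
                p.max_pipelined = o->max_pipelined;
            if (o->idle_timeout_sec > 0)
                p.idle_timeout_sec = o->idle_timeout_sec;
            break;
        }
    }
    return p;
}

/* May another request be put on the wire to this host right now? */
int can_send(const HostPolicy *p, int in_flight)
{
    return in_flight < p->max_pipelined;
}

/* Idle timer: the deadline moves forward whenever data is transferred,
   so the timeout measures inactivity rather than total transfer time. */
typedef struct {
    long deadline;    /* absolute time in seconds after which we give up */
    int timeout_sec;  /* allowed gap between two chunks */
} IdleTimer;

void idle_start(IdleTimer *t, long now, int timeout_sec)
{
    t->timeout_sec = timeout_sec;
    t->deadline = now + timeout_sec;
}

/* Call after each chunk is successfully sent or received. */
void idle_progress(IdleTimer *t, long now)
{
    t->deadline = now + t->timeout_sec;
}

int idle_expired(const IdleTimer *t, long now)
{
    return now >= t->deadline;
}
```

With defaults of 4 pipelined requests and a 10 second idle timeout, an
override of max_pipelined = 1 for a placeholder host such as
farm.example.com turns pipelining off for that host only, and a large POST
that moves a chunk every second never trips the timer because
idle_progress() keeps pushing the deadline out.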


On the plus side, libwww does seem to support HTTP/1.1 (byte-range GETs),
has a very nice debug/tracing utility and good error messages, and overall
seemed to work pretty well once you put in a heap of work to tweak it just
right.

In the end, it seemed, it was not a good fit for us.

(FYI, I never used the XML parsing from libwww, I always just took the data
and ran it through libxerces)

Taylor Gautier
Senior Software Engineer
-----Original Message-----
From: Kerry DeZell [mailto:kdezell@bluemountain.com]
Sent: Tuesday, August 01, 2000 8:34 PM
To: www-lib@w3.org
Subject: Request for thoughts on libwww


Hello,

My name is Kerry and I work for Blue Mountain Arts (the electronic
greeting card website). We are doing more and more programmatic website
to website communication as we integrate new attachments and
functionality into our cards. Unfortunately, to date this has been done
in a piecemeal fashion but I would like to consolidate using a single
library (either public domain such as libwww, custom built, or purchased
if necessary). Since the majority of our CGIs are written in C and I am
not big on reinventing the wheel, I am interested in the C version of
libwww.

As we begin to add more attachments to our cards that require purchases,
i.e. chocolates, our need for SSL support is also growing.
I have obtained the latest versions of libwww and OpenSSL and have been
playing with some simple applications. So far the non-SSL applications
have been easy to use and have behaved as expected. I was also able to
create a simple SSL application as well. However, when I started doing
multiple https requests, the application core dumps deep in the SSL
write functionality (in perusing mail lists this morning it looks like
this may have been found by others as well and fixed already).

This made me wonder how stable the SSL baseline is and whether anyone is
using it and libwww extensively (e-groups mail list shows only 22
members). I would also be interested in knowing if anyone has any
benchmarks on the speed and stability of w3c as a whole. (i.e. has it
been checked for memory leaks, profiled for inefficient code, how often
are common interfaces changed/deprecated etc.)

Lib w3c seems like a pretty good framework but it is unclear from the
website how many users are actually using it, whether any of these uses
are for other than experimental purposes or in house tools, or how
actively the baseline is being maintained (most of the CVS activity
seems to be months old and the last news item on the website is a year
old though the SSL fix seemed to happen fairly quickly once noticed).

Before I recommend that Blue Mountain use this technology I'd like to
get a better feel for how the library is being used, if the SSL portion
is being used by anyone for actual e-commerce implementations, and what
people think of C libwww and its SSL interface. I would also be
interested in knowing if anyone is actively using the XML parsing
portion of the library as more of our interfaces to other websites seem
to be headed in that direction.

Any information that you could provide would be greatly appreciated. If
we do decide to use the technology we may be able to assist in
furthering the product.

Thanks

Kerry DeZell
Blue Mountain Arts

Received on Wednesday, 2 August 2000 12:06:15 UTC