- From: Eric W. Sink <eric@spyglass.com>
- Date: Wed, 13 Jul 1994 10:57:33 -0600
- To: timbl@www0.cern.ch
- Cc: www-lib@www0.cern.ch
>Need this be such a large task as it seems?

Maybe...

>These differences between browsers may look big, but in fact it
>only takes a few lines of code to adapt the libwww machinery to
>fit into them.
>
>If there are other problems then please list them here.

Perhaps I should be specific.  Here is an incomplete list of the kinds
of things we have changed in our 2.15-derived library:

----------

Replace all memory allocation functions with macros which can expand
to something other than malloc on systems where malloc is not usable
(like the Mac).  This means using W3_MALLOC, W3_FREE, W3_CALLOC and so
on.

Use the same idea for all the sockets calls, which also vary on
non-UNIX platforms.

Change all the source file names to be legal with MSDOS 8.3 filename
limitations.

Remove all the calls to fprintf(stderr, ...), which are not usable
for error reporting on Mac or Windows.

Add support for a different error reporting API which will work on
all platforms.  Something like ERR_ReportError(...) where the ERR_ API
is implemented outside the library, probably in platform specific
code.

Remove all uses of the outofmem() macro, which basically calls
fprintf and then exit()!  Commercial software requires much cleaner
error handling than this, particularly on Mac/Windows platforms.

Define an API for progress indicators, adding calls throughout the
library back into a platform-specific library of routines which keep
the user informed of when things are happening.  Our API includes
support for thermometers which show percentage completion of a task,
as well as a spinning NCSA-like globe which simply shows that
something is happening.  The actual presentation of the information
could vary.  That's simply the way we implemented the API.  This same
API polls for user aborts, so all operations are abortable and the
termination of the network transfer (or whatever is happening) is
handled cleanly.

Remove all calls to getenv(), which is not portable to anything but
UNIX.
Our library references an externally defined structure which
corresponds to user "Preferences".

The same goes for system().  The implementation of "external viewers"
is totally system dependent.

Remove/fix all the places where a static local variable is malloc-ed
but not freed until the next time the function is executed.  For
example:

    int foo(void)
    {
        static char * mem;

        if (mem)
        {
            free(mem);
            mem = NULL;
        }
        mem = malloc(100);
        /* use mem for something, but don't free it */
    }

On Windows, this causes a memory leak, since Windows platforms do not
release a process' memory when the process is killed.

Fix the other memory leaks, including the free-ing of the anchors,
atoms, suffix structures, and so on, which are allocated but never
released.

Toss out HTHistory or rewrite it so it can be used with a multi-window
browser.

If HTML.c is to be shared (and it could be), remove its assumptions
about styles.  In fact, none of the styles stuff in the library was
useful for us.  HTML.c now references styles by integer index, not by
pointer, so the index can be used to find a style within a current
style sheet, and the style sheet can be swapped with another easily.

Don't define HTStream differently in multiple files.  Most debuggers
can't cope.

Rearrange the include files so they don't #include each other so
often.

Add support for redirection and forms post.

Make MIME type matching case-insensitive, as per RFC 1521.

Cache the last call to gethostbyname().

In SGML.c, add support for capturing the HTML source as it comes
through, for supporting dialogs which allow the user to see the
underlying HTML behind a page.

#ifdef all the code which assumes all filenames are UNIX format.
These sections have to be rewritten for Mac and Windows.

----------

Phew, that's an ugly list.  As I look back on it, I notice that some
of those things are not totally done yet.  Some of them are simply
bugs in the library which have been fixed in CERN's current releases.
Some of them are rather nitpicky things that we did just because one
day we got religious about some particular issue, like include files.
Nonetheless, this is the scope of the changes we've made, and most of
those changes were necessary.

Feeding those "changes" back to CERN is certainly an option.  (In
fact, some code has already been sent back to CERN, so it is only a
matter of time before those things are integrated into the official
CERN releases.)  But, right now, the diffs from 2.15 would be larger
than the library itself.

Another issue looming on the horizon is SECURITY.  If we have to
integrate S-HTTP into our libwww, the code will diverge even more.
Will CERN want those kinds of changes too?

>Have we got contacts for EINET people?  On the list?

John Hardin is involved with MacWeb, and he posts on the newsgroups a
lot.  I don't know if he's on the list or not.

>There is no need to repeat effort *if the changes are folded in*.

Agreed, but now that I've revealed the scale of the changes, I suspect
that you may not WANT all the changes we've made.  This is not a
matter of our reluctance to release the code.  The problem is that our
version of libwww really improves *our* situation, but it may not
improve CERN's.

>The problem with the NCSA Mosaic libwwws is that there was no folding
>in, and little effort to make the hooks needed fit in with a common
>library -- witness the lack of commonality even within NCSA.
>If CERN had had the manpower to go mine for the diffs and put them in
>retrospectively then things might have been different, but it doesn't
>work unless there is some two-way communication: there are constraints
>from both the app side and the lib side, and these have to be discussed.

Also agreed.  But hindsight is 20-20, and looking back, this did not
happen like I hoped it would.  When we started work on Mosaic, we
abandoned NCSA's libwww and started with CERN's then-current 2.15.
We really wanted more commonality with CERN and we wanted all our
internal versions to be the same.  I tried, from the beginning, to
minimize the changes to libwww, and tried to make the hooks fit in
with a common library.  I resolved to submit the diffs to CERN, but
also to wait until I understood more of the library before doing so.
I didn't want to burden the CERN staff with my own lack of knowledge
of the code.

The situation snuck up on me, and it got out of hand.  Before I knew
it, our library had so many changes that submitting the diffs would
really be asking a lot of the CERN staff.

Also, we had to add a number of "portable" calls to "non-portable"
code.  For example, our implementation of HTCopy() has a call to
WAIT_ComeUpForAir() for user progress indication and abort polling.
I can't very well ask CERN to put stuff like that in the library
unless we're all going to agree that our WAIT_ API is the way to go,
and provide a sample implementation of it.  It would be arrogant to
assume that all libwww users will be so thrilled with our particular
strategy that it could be integrated into the library without
discussion.

>CERN had manpower problems, but with W3O that will
>be relieved.  And our attitude has always been to fold in anything
>which people need (unless it is really dirty!) so that anyone who
>has helped us fold in things can take future versions with zero changes.

I hope that the above disclosure has been useful.  I remain motivated
to pursue collaboration on this library, but I think that simply
mailing an enormous diff to CERN would be rather unfair.  I believe
that for our participation in W3O's development of this code to be
most beneficial for us and for others, we need a more proactive
strategy, involving the kind of two-way communication you speak of.

Feel free to correct me where my assessment of the situation is
inaccurate.

Eric W. Sink, Software Engineer -- eric@spyglass.com
217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.
"Only academic people put cheese in their pocket." -SW, 24 May 1994
Received on Wednesday, 13 July 1994 17:57:04 UTC
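
Editorial illustration (not part of the original message): the list
above asks for allocation macros, an error-reporting hook in place of
fprintf/exit, and a progress call that also polls for user aborts.  A
minimal sketch of how those three pieces might fit together in C
follows.  The names W3_MALLOC, W3_CALLOC, W3_FREE, ERR_ReportError and
WAIT_ComeUpForAir come from the message; their parameters, the status
codes, copy_with_hooks() and the g_* variables are assumptions invented
for this sketch, since the message does not give the real signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Allocation macros named in the message.  Mapping them straight onto
   the C library is an assumption that suits a UNIX build; a Mac or
   Windows port would define them differently before this point. */
#ifndef W3_MALLOC
#define W3_MALLOC(n)    malloc(n)
#define W3_CALLOC(n, s) calloc((n), (s))
#define W3_FREE(p)      free(p)
#endif

/* Hypothetical status codes, invented for this sketch. */
enum { W3_OK = 0, W3_ERR_NOMEM = 1, W3_ERR_ABORTED = 2 };

/* Hooks implemented outside the library, in platform-specific code.
   The message names them but gives no parameters, so these signatures
   are assumptions. */
void ERR_ReportError(int code, const char *msg);
int  WAIT_ComeUpForAir(long done, long total);  /* nonzero: user aborted */

/* Library side (hypothetical): copy len bytes in chunks, reporting
   progress and polling for an abort after each chunk.  Failures are
   reported through the hook and returned to the caller; nothing here
   calls exit(). */
int copy_with_hooks(const char *src, size_t len, char *dst, size_t chunk)
{
    char *buf;
    size_t done = 0;

    assert(src != NULL && dst != NULL && chunk > 0);

    buf = (char *)W3_MALLOC(chunk);
    if (buf == NULL) {
        ERR_ReportError(W3_ERR_NOMEM, "out of memory");
        return W3_ERR_NOMEM;
    }

    while (done < len) {
        size_t n = (len - done < chunk) ? (len - done) : chunk;

        memcpy(buf, src + done, n);   /* stands in for a network read */
        memcpy(dst + done, buf, n);
        done += n;

        if (WAIT_ComeUpForAir((long)done, (long)len)) {
            W3_FREE(buf);             /* clean up before bailing out */
            ERR_ReportError(W3_ERR_ABORTED, "aborted by user");
            return W3_ERR_ABORTED;
        }
    }

    W3_FREE(buf);
    return W3_OK;
}

/* Platform side: a console stand-in.  A GUI port would raise a dialog
   and draw a thermometer instead of writing to stderr. */
static int g_last_error  = W3_OK;
static int g_polls       = 0;
static int g_abort_after = -1;   /* demo knob: abort once polls exceed this */

void ERR_ReportError(int code, const char *msg)
{
    g_last_error = code;
    fprintf(stderr, "error %d: %s\n", code, msg);
}

int WAIT_ComeUpForAir(long done, long total)
{
    g_polls++;
    fprintf(stderr, "progress: %ld of %ld bytes\n", done, total);
    return g_abort_after >= 0 && g_polls > g_abort_after;
}
```

The library code only ever sees the macros and the two hook
prototypes, so each platform can supply its own allocator, error
dialog and progress display without touching the copy loop.  Whether
the real WAIT_ API took byte counts, a percentage, or no arguments at
all is not stated in the message.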