Re: libwww

From: Henrik Frystyk Nielsen <frystyk@ptsun00.cern.ch>
Date: Tue, 12 Jul 94 15:11:49 +0200
Message-Id: <9407121311.AA17251@ptsun03.cern.ch>
To: hallam@dxal18.cern.ch, eric@spyglass.com
Cc: timbl@www0.cern.ch, timk@hook.spyglass.com, www-lib@www0.cern.ch
Hi

Thanks for joining me into the discussion! My comments (and some from
Tim) are merged into the text below:

> >        I am nearly finished with the integration of the Shen stuff into
> >libwww. The hooks are very general so that it should be possible to integrate
> >SHTTP into the same scheme. The initial release will be a medium security,
> >standalone system not needing any other encryption product (eg SecuDE or
> >an X500 directory) and with minimal complexity. The idea is to leverage
> >Certificate handling etc off this system.
>
> Sounds like a good strategy.  Our attitude toward all this security stuff is
> to try and keep it simple.
> 

We are preparing a release of the World-Wide Web Common Code Library at
the end of this week. The version number will be 3.0pre1 as this
release contains two new important features. First, as Phillip has
indicated, encryption/decryption functionality has been added, and
secondly the HTTP client has been rewritten so that it now has an
interruptible, multithreaded I/O interface. Even though FTP and Gopher
are fully prepared for multithreaded functionality, HTTP is the most
important and therefore has highest priority.

At the same time the current version 2.16pre2 will be upgraded to
version 2.17. The reason for this is that this version is quite stable
and is the basis of the CERN Proxy Server. However, the plan is to
merge the multithreaded HTTP (3.0pre1) client and the single threaded
HTTP Client (2.17) into one version so that the next major release
contains the possibility of both a single threaded and a multi threaded
HTTP client.

> >        The big question then is how hard would it be to rebong Mosaic
> >onto the current release of CERN libwww? This would give a number of advantages
> >such as improved ftp support, gopher support that is transparent so you
> >don't really notice that you are gophering and lots more. I'm not sure how
> >much hacking Marc did to libwww2 and what variety it is. If it was bug
> >patches and such we are probably past the point where the differences need
> >keeping.
> 
> Please bear with me as I digress a bit:
> 
> The main problem is that libwww is not the same across all Mosaics.
> NCSA's three versions of Mosaic use 3 different libwww implementations,
> all loosely based on your version 2.09, but with very different mods made.
> The Mac and UNIX versions differ the most, since neither of those platforms
> are actually using libwww for HTML parsing.  Upgrading either the Mac
> or UNIX version of NCSA Mosaic to any current CERN version will be a
> large task.

The parts of the NCSA libwww2 library that I have seen contain some
very specific Mosaic parts that in my opinion make it difficult to
combine Mosaic with the Library 2.16pre2 version. Furthermore, a lot of
the code has been pretty-printed so diffs are basically impossible
between the two libraries. Version 2.15 introduced some new fundamental
data structures into the library and the NCSA version that I have seen
doesn't use these at all.

On the protocol side both libraries provide basically the same support
for WAIS, gopher, and local file access to the client. The FTP client
in the CERN library is more general and now the HTTP module is
multi-threaded.

> Spyglass' versions of Mosaic use yet another libwww.  When we started,
> we threw out NCSA's libwww code and started with version 2.15 from CERN.
> We have made a large number of modifications, mostly for portability.  We
> are using the exact same libwww code for our Mac, Windows, and UNIX versions.
> Our version of libwww is still structured very much like your 2.15.

Good - we had lost your mail message in a big pile, but now it has turned
up and I will look at it. A big `bug-fix difference' between the CERN
2.15 and 2.16 releases is that we started using Purify to clean up the
memory handling, and it has taken out most leaks, uninitialised
reads/writes, etc.

More information on the current state of the library can be found at

	http://info.cern.ch/hypertext/WWW/Library/Status.html

> My understanding is that Lynx and DOSLynx provide 2 more totally different
> versions of libwww.
> 

The Lynx version of the library is very close to the CERN version. The
basic structure is the same, but Lou Montulli has added a lot of fancy
features, not all of which are implemented in the CERN code. DOSLynx uses
a completely different library - I think it is written in C++...

> We believe it is important to get at least those 8 browsers using
> a common, shared code base for the non-GUI layers.  The 8 browsers I'm
> referring to are NCSA's 3 versions of Mosaic, Spyglass' 3 versions
> of Mosaic, and the 2 versions of Lynx.  We can certainly include others who want
> to cooperate, but I have contacted all the developers of these 8, and they are
> all interested in moving toward a common libwww code base.  Reaching that
> end is probably
> not likely to be as simple as just asking everyone to upgrade to 2.16.
>
> At the minimum, I think we can all agree that the state of affairs
> could be improved.  That state of affairs, if I may summarize
> it, is that CERN's libwww code has been hacked to create at least 9
> different versions, probably more.  Although all of these implementations
> share a common ancestry, they have all diverged to the extent that
> actually sharing the code, and integrating new releases will be non-trivial.

I agree! That's why I hope that our new mailing list
www-lib@info.cern.ch, set up as a result of the WWW Conference at CERN,
will improve communication between the developers so that the
different versions eventually converge. One of the reasons for it
being so quiet until now is that Ari Luotonen has left CERN so the
developer group is now very small...

> Another variable to throw into the mix is W30.  TimBL is going to MIT.  Are
> you staying at CERN?  What about Henrik?  Who will be doing the primary
> development of libwww?  And where will that take place?

Tim says:  "I will be coordinating both sites in principle, with a lot of email
and flying.  There will be a local team leader here of course. Initial
funding should allow us to build up a team of around 7 people at CERN.
One should assume Henrik will be at CERN until one hears otherwise.
Others will join him.  Hakon Lie has joined us from Norwegian Telecom for
a year."

As for myself - I now have less than three weeks to write my master's thesis,
so I guess that development will be quite dead until mid-August.

> 
> We have made no firm decisions about any of this yet.  We're still wrestling
> with the issues, but the goals are clear:
> 
>   1.  We need a robust, maintainable body of code we can use, which
>         addresses our needs as a commercial provider of Mosaic.
> 
>   2.  We would like that same body of code to be useful and used by
>         the other providers of WWW software, including NCSA, CERN,
>         TeamLynx, and whoever else wants to play.
> 
>   3.  We want to collaborate with the other players, maintaining good
>         working relationships with NCSA, CERN, TeamLynx, W30, TBL,
>         and whoever else wants to play. :-)

Good. I think there is a strong feeling along those lines in most places,
which is largely why we started W3O  (and largely *how* we started W3!) - Tim.

-- cheers --

Henrik Frystyk


Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA13154; Tue, 12 Jul 1994 17:42:36 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA19078; Tue, 12 Jul 1994 17:42:39 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEM6RMMO288WX6OZ@KUHUB.CC.UKANS.EDU>; Tue, 12 Jul 1994 10:42:01 CDT
Received: by falcon.cc.ukans.edu; id AA26051; Tue, 12 Jul 1994 10:42:04 -0500
Date: Tue, 12 Jul 1994 10:42:04 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Re: libwww
In-Reply-To: <9407121311.AA17251@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Message-Id: <Pine.3.89.9407121006.E22352-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 2996

On Tue, 12 Jul 1994, Henrik Frystyk Nielsen wrote:
> Thanks for joining me into the discussion! My comments (and some from
> Tim) are merged into the text below:
> 
>	...
> 
> We are preparing a release of the World-Wide Web Common Code Library at
> the end of this week. The version number will be 3.0pre1 as this
> release contains two new important features. First, as Phillip has
> indicated, encryption/decryption functionality has been added, and
> secondly the HTTP client has been rewritten so that it now has an
> interruptible, multithreaded I/O interface. Even though FTP and Gopher
> are fully prepared for multithreaded functionality, HTTP is the most
> important and therefore has highest priority.

Timely....  I'm preparing for the merger and rewrite of the Lynxs.  Tell 
me, if I don't want the multi-threaded stuff in the library, is it easily 
turned off?  I'd look myself but I'm busy fixing Lynx 2-3.  Lynx and 
DosLynx each use their own versions of 2.14.

> At the same time the current version 2.16pre2 will be upgraded to
> version 2.17. The reason for this is that this version is quite stable
> and is the basis of the CERN Proxy Server. However, the plan is to
> merge the multithreaded HTTP (3.0pre1) client and the single threaded
> HTTP Client (2.17) into one version so that the next major release
> contains the possibility of both a single threaded and a multi threaded
> HTTP client.

Are you saying that if I don't want multi-threaded, I'll have to opt to 
use 2.17 for the time being?  I don't want multi-threaded since I am 
supporting DOS and VMS users.  How easy is the consolidation of the 
libraries going to be?  Am I going to have a lot of mods if I use 2.17 
and eventually want to use 3.* since it will then be the library you will 
be continually updating?

> More information on the current state of the library can be found at
> 
> 	http://info.cern.ch/hypertext/WWW/Library/Status.html
> 
> > My understanding is that Lynx and DOSLynx provide 2 more totally different
> > versions
> > of libwww.
> > 
> 
> The Lynx version of the library is very close to the CERN version. The
> basic structure is the same, but Lou Montulli has added a lot of fancy
> features, not all of which are implemented in the CERN code. DOSLynx uses
> a completely different library - I think it is written in C++...

Hehe.  Don't be so sure.  DosLynx uses a mostly unmodified 2.14 with the 
usual massaging; the rest of it is C++, including the old GridText 
functions, callable by C functions via the 

extern "C"	{
	<your c++ code>
};

which turns off the c++ name mangling of identifiers;  global variables 
don't fall under this name mangling so you can use them anywhere (in DOS 
anyhow).
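
A minimal sketch of that linkage idiom, written as a shared header
fragment that compiles as plain C and, behind the __cplusplus guard, as
C++. The function name GridText_lineCount is hypothetical and chosen
only for illustration; it is not taken from DosLynx or libwww.

```c
#include <assert.h>
#include <stddef.h>

/* --- what a shared header would contain --- */
#ifdef __cplusplus
extern "C" {    /* C++ compiler: give these names C linkage (no mangling) */
#endif

/* Hypothetical routine implemented on the C++ side, callable from C. */
int GridText_lineCount(const char *text);

#ifdef __cplusplus
}               /* end of the extern "C" block */
#endif

/* --- implementation; in a mixed build this part would live in a C++ file --- */
int GridText_lineCount(const char *text)
{
    int lines = 0;

    assert(text != NULL);
    for (; *text != '\0'; ++text)
        if (*text == '\n')
            ++lines;            /* count newline characters */
    return lines;
}
```

A C caller simply includes the header and calls
GridText_lineCount("one\ntwo\n"), which counts the two newlines. The
guard is what lets one header serve both compilers.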

>	...

Yours,
	Garrett. 

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <eric@spyglass.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA23754; Tue, 12 Jul 1994 18:21:55 +0200
Received: from spyglass.com by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA24463; Tue, 12 Jul 1994 18:22:07 +0200
Received: by spyglass.com (5.57/3.1.090690-Spyglass)
	id AA07508; Tue, 12 Jul 94 11:22:47 -0500
Received: by hook.spyglass.com (5.57/3.1.090690-Spyglass)
	id AA05658; Tue, 12 Jul 94 11:22:45 -0500
Message-Id: <9407121622.AA05658@hook.spyglass.com>
X-Sender: eric@hook
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Tue, 12 Jul 1994 11:22:27 -0600
To: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
From: eric@spyglass.com (Eric W. Sink)
Subject: Re: libwww
Cc: www-lib@www0.cern.ch
content-length: 2731


>> >        The big question then is how hard would it be to rebong Mosaic
>> >onto the current release of CERN libwww? This would give a number of
>> >advantages such as improved ftp support, gopher support that is
>> >transparent so you don't really notice that you are gophering and lots
>> >more. I'm not sure how much hacking Marc did to libwww2 and what variety
>> >it is. If it was bug patches and such we are probably past the point
>> >where the differences need keeping.
>>
>> Please bear with me as I digress a bit:
>>
>> The main problem is that libwww is not the same across all Mosaics.
>> NCSA's three versions of Mosaic use 3 different libwww implementations,
>> all loosely based on your version 2.09, but with very different mods made.
>> The Mac and UNIX versions differ the most, since neither of those platforms
>> are actually using libwww for HTML parsing.  Upgrading either the Mac
>> or UNIX version of NCSA Mosaic to any current CERN version will be a
>> large task.
>
>The parts of the NCSA libwww2 library that I have seen contain some
>very specific Mosaic parts that in my opinion make it difficult to
>combine Mosaic with the Library 2.16pre2 version. Furthermore, a lot of
>the code has been pretty-printed so diffs are basically impossible
>between the two libraries. Version 2.15 introduced some new fundamental
>data structures into the library and the NCSA version that I have seen
>doesn't use these at all.

I'm CCing the list on this to invite a wider base of opinions on this
situation.  The issue is the sharing of a common code base for W3 software.
The previous message to this list mentioned CERN's soon-to-be-released
versions 2.17 and 3.0 of libwww.  The problem is that most browsers which
use libwww have had to modify it considerably, especially the browsers which
are running on non-UNIX platforms.

In addition to mentioning the NCSA versions of Mosaic and ours, Spry's
AirMosaic is still based on libwww 2.09, as far as I know (Chris?).  EINet's
MacWeb and forthcoming WinWeb appear to be based on CERN code.  I presume
they found it necessary to make substantial modifications as well.

What *actions* can we take to remedy this situation?  Given the number of
code changes that have taken place in the various incarnations of libwww,
I do not think it is likely that any non-UNIX browser will be able to
upgrade to 2.17.  At the least, it will require substantial effort, and
we don't want that effort to be repeated when 2.18 comes out.

Eric W. Sink, Software Engineer --  eric@spyglass.com 217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.
        "Only academic people put cheese in their pocket."
            -SW, 24 May 1994


Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA04114; Tue, 12 Jul 1994 18:56:32 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA28130; Tue, 12 Jul 1994 18:56:48 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEM9CL7A8W8WXGKB@KUHUB.CC.UKANS.EDU>; Tue, 12 Jul 1994 11:56:11 CDT
Received: by falcon.cc.ukans.edu; id AA00239; Tue, 12 Jul 1994 11:56:12 -0500
Date: Tue, 12 Jul 1994 11:56:11 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Re: libwww
In-Reply-To: <9407121622.AA05658@hook.spyglass.com>
To: Multiple recipients of list <www-lib@www0.cern.ch>
Message-Id: <Pine.3.89.9407121151.A28510-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 2727

On Tue, 12 Jul 1994, Eric W. Sink wrote:
>	...
> 
> In addition to mentioning the NCSA versions of Mosaic and ours, Spry's
> AirMosaic is still based on libwww 2.09, as far as I know (Chris?).  EINet's
> MacWeb and forthcoming WinWeb appear to be based on CERN code.  I presume
> they found it necessary to make substantial modifications as well.
> 
> What *actions* can we take to remedy this situation?  Given the number of
> code changes that have taken place in the various incarnations of libwww,
> I do not think it is likely that any non-UNIX browser will be able to
> upgrade to 2.17.  At the least, it will require substantial effort, and
> we don't want that effort to be repeated when 2.18 comes out.

I think it is fairly obvious that all the changes that were made to these 
libraries were not coordinated correctly with CERN for their codebase; on 
the other hand perhaps they were and CERN didn't put them in and still 
expects us all to make these changes every time they put out a new library 
(Lou had mentioned this once as a reason why we were stuck with 2.14, but 
then I wouldn't approve all of the Lynx hacks either).

Sure, all the clients chew up the code a bit for their specific purposes,
and some lose large chunks of it, but what should have been done in order
to keep everyone consistent is to send CERN the modifications made to
their library for your specific systems, for their review and future
inclusion, while trying to keep as much of the common code as possible.

If this isn't done, then don't expect to be compatible.

After submission, if the mods aren't included into the libwww and a
compliant solution isn't found by coordinating with CERN, then perhaps it
is best to split from CERN and do it yourself.  This goes against the
purpose of this mailing list and I am assuming, mind you, that CERN is
willing to bend over and comply to our wishes and modifications.  I am 
very skeptical about CERN having the manpower required to do this.

CERN, is it feasible?
Is this the purpose of the libwww and of this mailing list?

If this doesn't happen, through either our fault or CERN's, this idea of 
a common code base is going to fail repeatedly.

Perhaps we, the developers, need a very specific set of coding conventions
when modifying the CERN library to keep it unfragmented, thereby 
allowing us to make consistent code contributions to CERN for easier 
inclusion.  What would these conventions be?


Honestly,
	Garrett.

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA01254; Tue, 12 Jul 1994 20:48:21 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA09198; Tue, 12 Jul 1994 20:48:35 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA21409; Tue, 12 Jul 94 20:47:30 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA20054; Tue, 12 Jul 94 20:47:28 +0200
Date: Tue, 12 Jul 94 20:47:28 +0200
Message-Id: <9407121847.AA20054@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Re: libwww
content-length: 3591

>
> On Tue, 12 Jul 1994, Henrik Frystyk Nielsen wrote:
> > Thanks for joining me into the discussion! My comments (and some from
> > Tim) are merged into the text below:
> > 
> >	...
> > 
> > We are preparing a release of the World-Wide Web Common Code Library at
> > the end of this week. The version number will be 3.0pre1 as this
> > release contains two new important features. First, as Phillip has
> > indicated, encryption/decryption functionality has been added, and
> > secondly the HTTP client has been rewritten so that it now has an
> > interruptible, multithreaded I/O interface. Even though FTP and Gopher
> > are fully prepared for multithreaded functionality, HTTP is the most
> > important and therefore has highest priority.
> 
> Timely....  I'm preparing for the merger and rewrite of the Lynxs.  Tell 
> me, if I don't want the multi-threaded stuff in the library, is it easily 
> turned off?  I'd look myself but I'm busy fixing Lynx 2-3.  Lynx and 
> DosLynx each use their own versions of 2.14.
>
> > At the same time the current version 2.16pre2 will be upgraded to
> > version 2.17. The reason for this is that this version is quite stable
> > and is the basis of the CERN Proxy Server. However, the plan is to
> > merge the multithreaded HTTP (3.0pre1) client and the single threaded
> > HTTP Client (2.17) into one version so that the next major release
> > contains the possibility of both a single threaded and a multi threaded
> > HTTP client.
> 
> Are you saying that if I don't want multi-threaded, I'll have to opt to 
> use 2.17 for the time being?  I don't want multi-threaded since I am 
> supporting DOS and VMS users.  How easy is the consolidation of the 
> libraries going to be?  Am I going to have a lot of mods if I use 2.17 
> and eventually want to use 3.* since it will then be the library you will 
> be continually updating?

The multi-threaded HTTP client has been made `by hand' - that is, it
uses one stack and no special thread package. This makes it possible to
use - even on a PC ;-) Take a look at the specification at

http://info.cern.ch/hypertext/WWW/Library/User/Multithread/multithread.html
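
To illustrate what `by hand' threading on one stack can look like, here
is a rough sketch, not the actual libwww 3.0pre1 code: each request is a
small state machine, and a plain loop resumes every unfinished request
for one short step at a time. All names (Request, request_step, run_all)
are hypothetical, and the network I/O is replaced by a simple counter.

```c
#include <assert.h>
#include <stddef.h>

/* States a hypothetical HTTP request passes through. */
typedef enum { ST_CONNECT, ST_SEND, ST_READ, ST_DONE } ReqState;

typedef struct {
    ReqState state;
    int chunks_left;    /* stand-in for response data still to arrive */
    int steps;          /* how many times this request was resumed */
} Request;

/* Advance one request by a single short step, then return to the caller.
 * No step blocks, so no per-request stack or thread package is needed. */
static void request_step(Request *r)
{
    assert(r != NULL);
    if (r->state == ST_DONE)
        return;
    r->steps++;
    switch (r->state) {
    case ST_CONNECT: r->state = ST_SEND; break;
    case ST_SEND:    r->state = ST_READ; break;
    case ST_READ:
        if (r->chunks_left > 0)
            r->chunks_left--;           /* "read" one chunk */
        if (r->chunks_left == 0)
            r->state = ST_DONE;
        break;
    case ST_DONE:
        break;
    }
}

/* Round-robin over all requests until every one is done.
 * Returns the number of passes made over the request table. */
static int run_all(Request *reqs, size_t n)
{
    int passes = 0;
    size_t i, active;

    assert(reqs != NULL);
    do {
        active = 0;
        for (i = 0; i < n; i++) {
            if (reqs[i].state != ST_DONE) {
                request_step(&reqs[i]);
                if (reqs[i].state != ST_DONE)
                    active++;
            }
        }
        passes++;
    } while (active > 0);
    return passes;
}
```

A real client would presumably decide which request to resume from
socket readiness rather than strict round-robin; that part is left out
here because it depends on the platform's socket layer.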

> 
> > More information on the current state of the library can be found at
> > 
> > 	http://info.cern.ch/hypertext/WWW/Library/Status.html
> > 
> > > My understanding is that Lynx and DOSLynx provide 2 more totally different
> > > versions
> > > of libwww.
> > > 
> > 
> > The Lynx version of the library is very close to the CERN version. The
> > basic structure is the same, but Lou Montulli has added a lot of fancy
> > features, not all of which are implemented in the CERN code. DOSLynx uses
> > a completely different library - I think it is written in C++...
> 
> Hehe.  Don't be so sure.  DosLynx uses a mostly unmodified 2.14 with the 
> usual massaging; the rest of it is C++, including the old GridText 
> functions, callable by C functions via the 
> 
> extern "C"	{
> 	<your c++ code>
> };
>
> which turns off the c++ name mangling of identifiers;  global variables 
> don't fall under this name mangling so you can use them anywhere (in DOS 
> anyhow).

Oh - this is new to me, but it sounds good!

****

I have also prepared some other information on the current state of the
Common Code Library. Please take a look at

http://info.cern.ch/hypertext/WWW/Library/User/Features/Implementation.html

for more information. Comments and ideas are very welcome!

I hope to have a draft ready tomorrow about a general PUT and POST
implementation that combines News, Mail and HTTP using the same
scheme.

-- cheers --

Henrik Frystyk




Return-Path: <timbl@www3.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA02834; Wed, 13 Jul 1994 16:52:23 +0200
Received: from www3.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA17816; Wed, 13 Jul 1994 16:52:39 +0200
Received: by www3.cern.ch (NX5.67c/NX3.0S)
	id AA03195; Wed, 13 Jul 94 16:48:37 +0200
Date: Wed, 13 Jul 94 16:48:37 +0200
From: Tim Berners-Lee <timbl@www3.cern.ch>
Message-Id: <9407131448.AA03195@www3.cern.ch>
Received: by NeXT.Mailer (1.87.1)
Received: by NeXT Mailer (1.87.1)
To: eric@spyglass.com
Subject: Re: libwww
Cc: www-lib@www0.cern.ch
Reply-To: timbl@www0.cern.ch
content-length: 4202



PHB:
|>> >        The big question then is how hard would it be to rebong Mosaic
|>> >onto the current release of CERN libwww?

EWS:
|>> Please bear with me as I digress a bit:
|>>
|>> The main problem is that libwww is not the same across all Mosaics.
|>> NCSA's three versions of Mosaic use 3 different libwww implementations,
|>> all loosely based on your version 2.09, but with very different mods made.
|>> The Mac and UNIX versions differ the most, since neither of those platforms
|>> are actually using libwww for HTML parsing.  Upgrading either the Mac
|>> or UNIX version of NCSA Mosaic to any current CERN version will be a
|>> large task.

Need this be such a large task as it seems?
Basically, if I understand it right, the problem is that
within libwww after 2.14, the protocol modules output structured
streams (ie parsed SGML token streams) as opposed to the unparsed streams
of previous versions.  The Mac and X Mosaic hypertext widgets do their
own parsing, and so expect an unparsed stream.

No problem... it is trivial to get an unparsed stream, by tacking on
the HTMLGen module. This means just changing HTML_new(HTStructured*)
to call HTMLGen to get a normal stream, and then setting up the widget
to take input from the stream.

	/* sketch only: return types and exact signatures omitted */
	HTML_new(HTStructured *s) {
	   return MyOldHTML_new(HTMLGen(s));
	}
	
libwww is event-driven.  That is, the streams are writeable,
and network events trigger the pipeline of processing,
which ends up driving the widget. If a browser has diverged to the extent
that the widget needs itself to loop, calling a getc() routine or
whatever, then one can *even* fix that, but you need the multithreaded
3.0 libwww. Basically, you make
a simple buffer stream which accepts the output from the libwww machinery.
When the widget gets control, it calls some read routines which check
the buffer and if it is empty call
the libwww event handler until some data has appeared.
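
The buffer-stream idea can be sketched roughly as follows. This is a
guess at the shape only, not libwww code: BufStream, buf_write, buf_getc
and the pump callback are all hypothetical names, and demo_pump merely
stands in for "call the libwww event handler once".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 256

typedef struct {
    char data[BUF_SIZE];
    size_t head, tail;          /* read and write positions */
    int (*pump)(void *ctx);     /* run the event handler once;
                                   returns 0 when the input has ended */
    void *ctx;
} BufStream;

/* Push side: the library pipeline writes its output here.
 * Returns the number of bytes accepted. */
static size_t buf_write(BufStream *b, const char *s, size_t len)
{
    size_t room;

    assert(b != NULL && s != NULL);
    if (b->head == b->tail)
        b->head = b->tail = 0;          /* buffer empty: rewind */
    room = BUF_SIZE - b->tail;
    if (len > room)
        len = room;
    memcpy(b->data + b->tail, s, len);
    b->tail += len;
    return len;
}

/* Pull side: what a looping widget calls instead of getc().
 * While the buffer is empty, drive the event handler until data appears.
 * Returns the next byte, or -1 once the input has ended. */
static int buf_getc(BufStream *b)
{
    assert(b != NULL && b->pump != NULL);
    while (b->head == b->tail) {
        if (!b->pump(b->ctx) && b->head == b->tail)
            return -1;
    }
    return (unsigned char)b->data[b->head++];
}

/* Demo event source: delivers a string four bytes per "event". */
typedef struct {
    const char *text;
    size_t pos;
    BufStream *out;
} DemoSource;

static int demo_pump(void *ctx)
{
    DemoSource *src = (DemoSource *)ctx;
    size_t left = strlen(src->text) - src->pos;
    size_t n = left < 4 ? left : 4;

    if (n == 0)
        return 0;                       /* nothing more will arrive */
    src->pos += buf_write(src->out, src->text + src->pos, n);
    return 1;
}
```

The widget keeps its own read loop and the library keeps pushing; the
only glue is the loop inside buf_getc that hands control back to the
event handler whenever the buffer runs dry.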

These differences between browsers may look big, but in fact it
only takes a few lines of code to adapt the libwww machinery to
fit into them.

If there are other problems then please list them here.

EWS:
|I'm CCing the list on this to invite a wider base of opinions on this
|situation.  The issue is the sharing of a common code base for W3 software.
|The previous message to this list mentioned CERN's soon-to-be-released
|versions 2.17 and 3.0 of libwww.  The problem is that most browsers which
|use libwww have had to modify it considerably, especially the browsers which
|are running on non-UNIX platforms.
|
|In addition to mentioning the NCSA versions of Mosaic and ours, Spry's
|AirMosaic is still based on libwww 2.09, as far as I know (Chris?).  EINet's
|MacWeb and forthcoming WinWeb appear to be based on CERN code.  I presume
|they found it necessary to make substantial modifications as well.

Have we got contacts for EINET people? On the list?

|What *actions* can we take to remedy this situation?  Given the number of
|code changes that have taken place in the various incarnations of libwww,
|I do not think it is likely that any non-UNIX browser will be able to
|upgrade to 2.17.  At the least, it will require substantial effort, and
|we don't want that effort to be repeated when 2.18 comes out.

There is no need to repeat effort *if the changes are folded in*.
The problem with the NCSA Mosaic libwwws is that there was no folding
in, and little effort to make the hooks needed fit in with a common
library -- witness the lack of commonality even within NCSA.
If CERN had had the manpower to go mine for the diffs and put them in
retrospectively then things might have been different, but it doesn't
work unless there is some two-way communication: there are constraints
from both the app side and the lib side, and these have to be discussed.

CERN had manpower problems, but with W3O that will
be relieved.  And our attitude has always been to fold in anything
which people need (unless it is really dirty!) so that anyone who
has helped us fold in things can take future versions with zero changes.

Obviously occasionally there have been changes to the API.  These have been
occasional, and the API is very similar to what it was at 2.0.

Tim Berners-Lee
CERN
Return-Path: <eric@spyglass.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA23422; Wed, 13 Jul 1994 17:57:04 +0200
Received: from spyglass.com by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA00150; Wed, 13 Jul 1994 17:57:15 +0200
Received: by spyglass.com (5.57/3.1.090690-Spyglass)
	id AA09849; Wed, 13 Jul 94 10:57:52 -0500
Received: by hook.spyglass.com (5.57/3.1.090690-Spyglass)
	id AA09827; Wed, 13 Jul 94 10:57:49 -0500
Message-Id: <9407131557.AA09827@hook.spyglass.com>
X-Sender: eric@hook
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 13 Jul 1994 10:57:33 -0600
To: timbl@www0.cern.ch
From: eric@spyglass.com (Eric W. Sink)
Subject: Re: libwww
Cc: www-lib@www0.cern.ch
content-length: 8199


>Need this be such a large task as it seems?

Maybe...

>These differences between browsers may look big, but in fact it
>only takes a few lines of code to adapt the libwww machinery to
>fit into them.
>
>If there are other problems then please list them here.

Perhaps I should be specific.  Here is an incomplete list of the
kinds of things we have changed in our 2.15-derived library:

----------
Replace all memory allocation functions with macros which can
expand to something other than malloc on systems where malloc
is not usable (like the Mac).  This means using W3_MALLOC, W3_FREE,
W3_CALLOC and so on.

Use the same idea for all the sockets calls, which also vary on
non-UNIX platforms.

Change all the source file names to be legal under the MS-DOS 8.3
filename limitations.

Remove all the calls to fprintf(stderr, ...), which are not usable
for error reporting on Mac or Windows.

Add support for a different error reporting API which will work on
all platforms.  Something like ERR_ReportError(...) where the ERR_
API is implemented outside the library, probably in platform specific
code.

Remove all uses of the outofmem() macro, which basically calls fprintf
and then exit()!  Commercial software requires much cleaner error
handling than this, particularly on Mac/Windows platforms.

Define an API for progress indicators, adding calls throughout the
library back into a platform-specific library of routines which keep
the user informed of when things are happening.  Our API includes
support for thermometers which show percentage completion of a task,
as well as a spinning NCSA-like globe which simply shows that something
is happening.  The actual presentation of the information could vary.
That's simply the way we implemented the API.  This same API polls
for user aborts, so all operations are abortable and the termination
of the network transfer (or whatever is happening) is handled cleanly.

Remove all calls to getenv(), which is not portable to anything but
UNIX.  Our library references an externally defined structure which
corresponds to user "Preferences".

The same goes for system().  The implementation of "external viewers"
is totally system dependent.

Remove/fix all the places where a static local variable is malloc-ed
but not freed until the next time the function is executed.  For example:

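        #include <stdlib.h>     /* malloc, free */

        int foo(void)
        {
                /* survives between calls; the last block is never freed */
                static char * mem = NULL;

                if (mem)
                {
                        free(mem);
                        mem = NULL;
                }
                mem = malloc(100);
                if (mem == NULL)
                        return -1;

                /* use mem for something, but don't free it */
                return 0;
        }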

On Windows, this causes a memory leak, since Windows platforms do
not release a process' memory when the process is killed.

Fix the other memory leaks, including the free-ing of the anchors,
atoms, suffix structures, and so on, which are allocated but never
released.

Toss out HTHistory or rewrite it so it can be used with a multi-window
browser.

If HTML.c is to be shared (and it could be), remove its assumptions
about styles.  In fact, none of the styles stuff in the library was
useful for us.  HTML.c now references styles by integer index, not
by pointer, so the index can be used to find a style within a current
style sheet, and the style sheet can be swapped with another easily.

Don't define HTStream differently in multiple files.  Most debuggers
can't cope.

Rearrange the include files so they don't #include each other so
often.

Add support for redirection and forms post.

Make MIME type matching case-insensitive, as per RFC 1521.
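
For illustration only, a comparison along these lines would do (w3_mime_type_equal is a hypothetical name, not a libwww function):

```c
#include <ctype.h>

/* Compare two MIME type strings, ignoring ASCII case.
 * Returns 1 if they match, 0 otherwise. */
int w3_mime_type_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;    /* both strings must end at the same place */
}
```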

Cache the last call to gethostbyname()
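
A sketch of a one-entry cache, with hypothetical names throughout.  The real lookup is passed in as a function pointer (in practice a thin wrapper around gethostbyname()) so the example does not depend on any particular sockets header:

```c
#include <string.h>

/* Resolver callback: returns 0 on success and fills in *addr_out. */
typedef int (*W3Resolver)(const char *host, unsigned long *addr_out);

static char          w3_last_host[256];
static unsigned long w3_last_addr;
static int           w3_last_valid = 0;

/* Look up host, reusing the previous answer if the name is unchanged.
 * The comparison is exact; a real version might ignore case. */
int w3_resolve_cached(const char *host, unsigned long *addr_out,
                      W3Resolver resolve)
{
    if (w3_last_valid && strcmp(host, w3_last_host) == 0) {
        *addr_out = w3_last_addr;
        return 0;
    }
    if (resolve(host, addr_out) != 0)
        return -1;
    if (strlen(host) < sizeof w3_last_host) {
        strcpy(w3_last_host, host);
        w3_last_addr = *addr_out;
        w3_last_valid = 1;
    }
    return 0;
}

/* Stand-in resolver for demonstration: counts how often it is called. */
static int demo_lookups = 0;

static int demo_resolver(const char *host, unsigned long *addr_out)
{
    demo_lookups++;
    *addr_out = (unsigned long)strlen(host);  /* fake address */
    return 0;
}
```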

In SGML.c, add support for capturing the HTML source as it comes through,
for supporting dialogs which allow the user to see the underlying HTML
behind a page.

#ifdef all the code which assumes all filenames are UNIX format.
These sections have to be rewritten for Mac and Windows.

----------

Phew, that's an ugly list.  As I look back on it, I notice that some
of those things are not totally done yet.  Some of them are simply
bugs in the library which have been fixed in CERN's current releases.
Some of them are rather nitpicky things that we did just because one day
we got religious about some particular issue, like include files.

Nonetheless, this is the scope of the changes we've made, and most of
those changes were necessary.  Feeding those "changes" back to CERN
is certainly an option.  (In fact, some code has already been sent back
to CERN, so it is only a matter of time before those things are
integrated into the official CERN releases.)  But, right now, the
diffs from 2.15 would be larger than the library itself.

Another issue looming on the horizon is SECURITY.  If we have to integrate
S-HTTP into our libwww, the code will diverge even more.  Will CERN
want those kinds of changes too?

>Have we got contacts for EINET people? On the list?

John Hardin is involved with MacWeb, and he posts on the newsgroups
a lot.  I don't know if he's on the list or not.

>There is no need to repeat effort *if the changes are folded in*.

Agreed, but now that I've revealed the scale of the changes, I
suspect that you may not WANT all the changes we've made.  This is
not a matter of our reluctance to release the code.  The problem
is that our version of libwww really improves *our* situation, but it
may not improve CERN's.

>The problem with the NCSA Mosaic libwwws is that there was no folding
>in, and little effort to make the hooks needed fit in with a common
>library -- witness the lack of commonality even within NCSA.
>If CERN had had the manpower to go mine for the diffs and put them in
>retrospectively then things might have been different, but it doesn't
>work unless there is some two-way communication: there are constraints
>from both the app side and the lib side, and these have to be discussed.

Also agreed.  But hindsight is 20-20, and looking back, this did not
happen like I hoped it would.  When we started work on Mosaic, we abandoned
NCSA's libwww and started with CERN's then-current 2.15.  We really wanted
more commonality with CERN and we wanted all our internal versions
to be the same.  I tried, from the beginning to minimize the changes
to libwww, and tried to make the hooks fit in with a common library.

I resolved to submit the diffs to CERN, but also to wait until I understood
more of the library before doing so.  I didn't want to burden the CERN
staff with my own lack of knowledge of the code.

The situation snuck up on me, and it got out of hand.  Before I knew it,
our library had so many changes, to submit the diffs would really be
asking a lot of the CERN staff.  Also, we had to add a number of "portable"
calls to "non-portable" code.  For example, our implementation of HTCopy()
has a call to WAIT_ComeUpForAir() for user progress indication and abort
polling.  I can't very well ask CERN to put stuff like that in the library
unless we're all going to agree that our WAIT_ API is the way to go,
and provide a sample implementation of it.  It would be arrogant to assume
that all libwww users will be so thrilled with our particular strategy
that it could be integrated into the library without discussion.

>CERN had manpower problems, but with W3O that will
>be relieved.  And our attitude has always been to fold in anything
>which people need (unless it is really dirty!) so that anyone who
>has helped us fold in things can take future versions with zero changes.

I hope that the above disclosure has been useful.  I remain motivated to
pursue collaboration on this library, but I think that simply mailing
an enormous diff to CERN would be rather unfair.  I believe that for
our participation in W3O's development of this code to be most beneficial
for us and for others, we need a more proactive strategy, involving the
kind of two-way communication you speak of.

Feel free to correct me where my assessment of the situation is inaccurate.

Eric W. Sink, Software Engineer --  eric@spyglass.com 217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.
        "Only academic people put cheese in their pocket."
            -SW, 24 May 1994


Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA28255; Wed, 13 Jul 1994 18:18:55 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA02880; Wed, 13 Jul 1994 18:18:56 +0200
Date: Wed, 13 Jul 1994 18:18:56 +0200
Message-Id: <9407131618.AA02880@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::eric@spyglass.com,DXMINT::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: Re: libwww
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
Apparently-To: <eric@spyglass.com>
content-length: 4526

----------
>Replace all memory allocation functions with macros which can
>expand to something other than malloc on systems where malloc
>is not usable (like the Mac).  This means using W3_MALLOC, W3_FREE,
>W3_CALLOC and so on.

This could be overridden in HTUtils.h where there are macros. I think
it would be better not to use malloc as the name though, so that a 
collection of source can be checked with a grep for stray mallocs...


>Change all the source files names to be legal with MSDOS 8.3 filename
>limitations.

Can do..

>Add support for a different error reporting API which will work on
>all platforms.  Something like ERR_ReportError(...) where the ERR_
>API is implemented outside the library, probably in platform specific
>code.

Henrik did something like this...

>Remove all uses of the outofmem() macro, which basically calls fprintf
>and then exit()!  Commercial software requires much cleaner error
>handling than this, particularly on Mac/Windows platforms.

Agree... I don't like this either...

>Define an API for progress indicators, adding calls throughout the
>library back into a platform-specific library of routines which keep
>the user informed of when things are happening.  Our API includes
>support for thermometers which show percentage completion of a task,
>as well as a spinning NCSA-like globe which simply shows that something
>is happening.  The actual presentation of the information could vary.
>That's simply the way we implemented the API.  This same API polls
>for user aborts, so all operations are abortable and the termination
>of the network transfer (or whatever is happening) is handled cleanly.

>Remove all calls to getenv(), which is not portable to anything but
>UNIX.  Our library references an externally defined structure which
>corresponds to user "Preferences".

Well it does work on VMS :-)

Should be more general though. Under X11 you want to use Xt resources
etc etc..

Perhaps as a stopgap we could introduce a GETENV macro???


>The same goes for system().  The implementation of "external viewers"
>is totally system dependent.

Agree!


>Remove/fix all the places where a static local variable is malloc-ed
>but not freed until the next time the function is executed.  For example:
>
>On Windows, this causes a memory leak, since Windows platforms do
>not release a process' memory when the process is killed.

Ask Henrik...


>Fix the other memory leaks, including the free-ing of the anchors,
>atoms, suffix structures, and so on, which are allocated but never
>released.

Should have been done!!!

>Toss out HTHistory or rewrite it so it can be used with a multi-window
>browser.

Good point!

>If HTML.c is to be shared (and it could be), remove its assumptions
>about styles.  In fact, none of the styles stuff in the library was
>useful for us.  HTML.c now references styles by integer index, not
>by pointer, so the index can be used to find a style within a current
>style sheet, and the style sheet can be swapped with another easily.
>
>Don't define HTStream differently in multiple files.  Most debuggers
>can't cope.

That is the last vestige of object orientedness :-)...

>Rearrange the include files so they don't #include each other so
>often.

Shen has lots of include files.... 

I have been putting in checks to see if a #include has been included already
before including it from an include file. Prevents accidental double loading
before the lexical analyser stage (i.e. it's fast)... We X11 people have all
the X11 silliness so optimising libwww in this way would not make much 
difference. On a Mac...

>Add support for redirection and forms post.

Agree it should be there....

>Make MIME type matching case-insensitive, as per RFC 1521.

We want to completely rewrite the MIME parser proper so there is only one of 
them.


>Cache the last call to gethostbyname()

Hah! Have we got caching for you... H. can tell you more but it is pretty
nifty all round!

>In SGML.c, add support for capturing the HTML source as it comes through,
>for supporting dialogs which allow the user to see the underlying HTML
>behind a page.

How about a Tee Stream? In the short term it might be simpler to have a
`migration' library where the SGML stuff can be overridden and only raw ASCII
cometh out. 
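
To make the Tee Stream idea concrete, here is a minimal sketch.  The W3Sink, W3Capture and W3Tee types are hypothetical and are not the library's HTStream; the point is only that a tee forwards every block to two downstream sinks, so one copy can go to the parser and the other can keep the raw source.

```c
#include <string.h>

/* Hypothetical minimal stream interface: one "put block" method. */
typedef struct W3Sink W3Sink;
struct W3Sink {
    void (*put)(W3Sink *self, const char *buf, int len);
};

/* A sink that keeps what it receives (e.g. for a "view source" dialog). */
typedef struct { W3Sink base; char data[256]; int len; } W3Capture;

static void capture_put(W3Sink *self, const char *buf, int len)
{
    W3Capture *c = (W3Capture *)self;      /* base is the first member */
    int room = (int)sizeof c->data - 1 - c->len;

    if (len > room)
        len = room;                        /* truncate rather than overflow */
    memcpy(c->data + c->len, buf, (size_t)len);
    c->len += len;
    c->data[c->len] = '\0';
}

void w3_capture_init(W3Capture *c)
{
    c->base.put = capture_put;
    c->len = 0;
    c->data[0] = '\0';
}

/* The tee: passes each block on to both of its targets. */
typedef struct { W3Sink base; W3Sink *first; W3Sink *second; } W3Tee;

static void tee_put(W3Sink *self, const char *buf, int len)
{
    W3Tee *t = (W3Tee *)self;

    t->first->put(t->first, buf, len);
    t->second->put(t->second, buf, len);
}

void w3_tee_init(W3Tee *t, W3Sink *first, W3Sink *second)
{
    t->base.put = tee_put;
    t->first = first;
    t->second = second;
}
```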


>#ifdef all the code which assumes all filenames are UNIX format.
>These sections have to be rewritten for Mac and Windows.

We should move to 8x3 filenames which should help... Rest should be possible...




Hmm... Doesn't look Tooooo bad...


Phill
Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA12611; Wed, 13 Jul 1994 19:31:08 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA10402; Wed, 13 Jul 1994 19:31:23 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA08289; Wed, 13 Jul 94 19:30:18 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA20793; Wed, 13 Jul 94 19:30:17 +0200
Date: Wed, 13 Jul 94 19:30:17 +0200
Message-Id: <9407131730.AA20793@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Re: libwww
content-length: 7307


I can add some comments about the current state of the library:

> Replace all memory allocation functions with macros which can
> expand to something other than malloc on systems where malloc
> is not usable (like the Mac).  This means using W3_MALLOC, W3_FREE,
> W3_CALLOC and so on.

This is still not done

> Use the same idea for all the sockets calls, which also vary on
> non-UNIX platforms.

All accept() and connect() calls now go through a single function in
HTTCP.c.  READ, WRITE and CLOSE are used throughout the library but are
macros, so they should be `transformable'.

> Change all the source files names to be legal with MSDOS 8.3 filename
> limitations.

This should be a minor task...
 
> Remove all the calls to fprintf(stderr, ...), which are not usable
> for error reporting on Mac or Windows.

This is more serious as there are 1.000.000's of them right now. However,
again I think a macro substitution would solve the problem
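
One way such a substitution could look, as a sketch with hypothetical names (W3_TRACE, w3_trace and w3_set_trace_sink are not in the library).  The double parentheses let a plain macro carry a variable argument list; vsnprintf() is C99, so a strict C89 build would need vsprintf() with a sufficiently large buffer instead.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Where messages go; NULL means "fall back to stderr". */
static void (*w3_trace_sink)(const char *msg) = NULL;

void w3_set_trace_sink(void (*sink)(const char *msg))
{
    w3_trace_sink = sink;
}

/* Format the message, then hand it to the platform-specific sink. */
void w3_trace(const char *fmt, ...)
{
    char buf[512];
    va_list ap;

    va_start(ap, fmt);
    vsnprintf(buf, sizeof buf, fmt, ap);
    va_end(ap);

    if (w3_trace_sink != NULL)
        w3_trace_sink(buf);
    else
        fputs(buf, stderr);
}

/* Old:  fprintf(stderr, "HTTP status %d\n", code);
 * New:  W3_TRACE(("HTTP status %d\n", code));          */
#define W3_TRACE(args) w3_trace args

/* Example sink that just remembers the last message. */
static char demo_last[512];

static void demo_sink(const char *msg)
{
    strncpy(demo_last, msg, sizeof demo_last - 1);
    demo_last[sizeof demo_last - 1] = '\0';
}
```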

> Add support for a different error reporting API which will work on
> all platforms.  Something like ERR_ReportError(...) where the ERR_
> API is implemented outside the library, probably in platform specific
> code.

Is this the fprintf(stderr problem??? Have you seen the new error/information
parsing module that gives the user information on what's going on?

> Remove all uses of the outofmem() macro, which basically calls fprintf
> and then exit()!  Commercial software requires much cleaner error
> handling than this, particularly on Mac/Windows platforms.

Right! - the library is basically foreseen to work on a platform with
lots of memory...

> Define an API for progress indicators, adding calls throughout the
> library back into a platform-specific library of routines which keep
> the user informed of when things are happening.  Our API includes
> support for thermometers which show percentage completion of a task,
> as well as a spinning NCSA-like globe which simply shows that something
> is happening.  The actual presentation of the information could vary.
> That's simply the way we implemented the API.  This same API polls
> for user aborts, so all operations are abortable and the termination
> of the network transfer (or whatever is happening) is handled cleanly.

This is on my working list and should be easy to do. Basically it would
be to add HTProgress() throughout the library.

> Remove all calls to getenv(), which is not portable to anything but
> UNIX.  Our library references an externally defined structure which
> corresponds to user "Preferences".
> 
> The same goes for system().  The implementation of "external viewers"
> is totally system dependent.

Definitely, we have a problem with system calls. It could be solved using
#ifdef's but I don't think this is satisfactory for DOS-people. I am not sure
what to do..

> Remove/fix all the places where a static local variable is malloc-ed
> but not freed until the next time the function is executed.  For example:
> 
>         int foo(void)
>         {
>                 static char * mem;
> 
>                 if (mem)
>                 {
>                         free(mem);
>                         mem = NULL;
>                 }
>                 mem = malloc(100);
> 
>                 /* use mem for something, but don't free it */
>         }
> 
> On Windows, this causes a memory leak, since Windows platforms do
> not release a process' memory when the process is killed.

I was not aware of this problem... sure it would take some work, but it
is possible...

> Fix the other memory leaks, including the free-ing of the anchors,
> atoms, suffix structures, and so on, which are allocated but never
> released.

Again - on UNIX they are not really `leaks' ;-)

> Toss out HTHistory or rewrite it so it can be used with a multi-window
> browser.

Remember it should also be used on a character based, single window
implementation, like the Line Mode Browser. 

> If HTML.c is to be shared (and it could be), remove its assumptions
> about styles.  In fact, none of the styles stuff in the library was
> useful for us.  HTML.c now references styles by integer index, not
> by pointer, so the index can be used to find a style within a current
> style sheet, and the style sheet can be swapped with another easily.
> 
> Don't define HTStream differently in multiple files.  Most debuggers
> can't cope.
> 
> Rearrange the include files so they don't #include each other so
> often.
> 
> Add support for redirection and forms post.

You mean clean up the code a bit...

> Make MIME type matching case-insensitive, as per RFC 1521.

We have a new and better MIME parser on the working list. For the moment
there are 1.000's of MIME-parsers in the library. One general parser
would be nice. Then optimizations like case insensitivity also become
more appropriate.

> Cache the last call to gethostbyname()

This is done in the 2.16 version. It also supports multi homed hosts and on
each connect it measures the connect time on a given IP-address. The next
connect then takes the fastest IP-address.

> In SGML.c, add support for capturing the HTML source as it comes through,
> for supporting dialogs which allow the user to see the underlying HTML
> behind a page.

The SGML/HTML needs to be looked at. It should also be upgraded to HTML+

> #ifdef all the code which assumes all filenames are UNIX format.
> These sections have to be rewritten for Mac and Windows.

The library hasn't been compiled on a PC/Mac for a long time. A general
port is necessary.

> ----------
> 
> Phew, that's an ugly list.  As I look back on it, I notice that some
> of those things are not totally done yet.  Some of them are simply
> bugs in the library which have been fixed in CERN's current releases.
> Some of them are rather nitpicky things that we did just because one day
> we got religious about some particular issue, like include files.
> 
> Nonetheless, this is the scope of the changes we've made, and most of
> those changes were necessary.  Feeding those "changes" back to CERN
> is certainly an option.  (In fact, some code has already been sent back
> to CERN, so it is only a matter of time before those things are
> integrated into the official CERN releases.)  But, right now, the
> diffs from 2.15 would be larger than the library itself.

Quite a lot has happened from version 2.15 to the current version. I
don't expect that people will just throw their version of the library away
and start using the current CERN version. It sure has a lack of
functionality! My idea with this mailing list is to start the process
of converging the different versions. In your list above I find nothing
that can't be done - and it should only be done once in order to work.
In other words, I am very interested in getting responses on new
features and diffs.

In return, I hope that some of the new features that are now
implemented in the library make it attractive for you to use as a
basis for new development.

You are right about the difficulties supporting all platforms. In
practice I think it is unrealistic. However, I believe that all
platforms have a common core of functionality that can be shared
between them, but this requires that the library stays as general as
possible and doesn't try to do any fancy things.

-- cheers --

Henrik Frystyk



Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA18130; Wed, 13 Jul 1994 19:52:21 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA13888; Wed, 13 Jul 1994 19:52:37 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA08653; Wed, 13 Jul 94 19:51:32 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA20937; Wed, 13 Jul 94 19:51:31 +0200
Date: Wed, 13 Jul 94 19:51:31 +0200
Message-Id: <9407131751.AA20937@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Working list
content-length: 552



In order not to just talk about the suggestions but actually _do_
something about them - I think we need a way to put up a common `to do'
list and a working list where all ongoing projects and improvements are
added. This should be possible as the list is quite small (currently 25
members)

There is also the problem of exchanging diffs and new modules. We
currently use a CVS system but that doesn't work over the net. Maybe a 
combination of HTTP and CVS would do???

Any ideas... (please be realistic -- not too fancy ;-))

-- cheers --

Henrik Frystyk
 

Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA20514; Wed, 13 Jul 1994 20:00:47 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA14325; Wed, 13 Jul 1994 20:01:03 +0200
Date: Wed, 13 Jul 1994 20:01:03 +0200
Message-Id: <9407131801.AA14325@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: RE: Working list
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
content-length: 883

>In order not to just talk about the suggestions but actually _do_
>something about them - I think we need a way to put up a common `to do'
>list and a working list where all ongoing projects and improvements are
>added. This should be possible as the list is quite small (currently 25
>members)

This would be a good idea regardless!

>There is also the problem of exchanging diffs and new modules. We
>currently use a CVS system but that doesn't work over the net. Maybe a 
>combination of HTTP and CVS would do???

I have been thinking of how to convert some of my CMS code management stuff.
I'm not sure how do-able this is though. CVS is nowhere near as clean as 
CMS and does not have an API so much as blancmange. Seriously I think it might
be as easy to build something from scratch around Oracle, HTTP and some of the
neat bits of CVS rather than try yet another bolt on.


Phill



Return-Path: <eric@spyglass.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA26971; Wed, 13 Jul 1994 20:25:17 +0200
Received: from spyglass.com by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA16868; Wed, 13 Jul 1994 20:25:31 +0200
Received: by spyglass.com (5.57/3.1.090690-Spyglass)
	id AA10194; Wed, 13 Jul 94 13:26:11 -0500
Received: by hook.spyglass.com (5.57/3.1.090690-Spyglass)
	id AA10464; Wed, 13 Jul 94 13:26:08 -0500
Message-Id: <9407131826.AA10464@hook.spyglass.com>
X-Sender: eric@hook
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 13 Jul 1994 13:25:53 -0600
To: hallam@alws.cern.ch
From: eric@spyglass.com (Eric W. Sink)
Subject: Re: libwww
Cc: www-lib@www0.cern.ch
content-length: 851


[ All of Phillip's responses to my suggestions deleted. ]

>Hmm... Doesn't look Tooooo bad...

Well, that went over better than I thought!  Now I guess I feel
a certain liberty to be unreasonable. :-)

Can we get rid of all those ARGS3 K&R compatibility macros?
This is purely a style thing.  The only real difficulty they
cause is that editors which parse C syntax in looking for functions
can't grok the funky macros.  But how important is it REALLY that
libwww support non-ANSI C compilers?
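
For anyone who hasn't looked at them lately, the macros work roughly like this.  The definitions below are a reconstruction from memory in the spirit of HTUtils.h, so treat the exact spelling as an assumption; add3 is a made-up function.

```c
/* On an ANSI compiler the macro expands to a prototype-style parameter
 * list; on a K&R compiler it expands to the old-style definition. */
#ifdef __STDC__
#define ARGS3(t, a, u, b, v, c)  (t a, u b, v c)
#define PARAMS(list)             list
#else
#define ARGS3(t, a, u, b, v, c)  (a, b, c) t a; u b; v c;
#define PARAMS(list)             ()
#endif

/* Declaration and definition written with the macros... */
extern int add3 PARAMS((int a, int b, int c));

int add3 ARGS3(int, a, int, b, int, c)
{
    return a + b + c;
}

/* ...and the same function in plain ANSI C, which editors can parse. */
int add3_ansi(int a, int b, int c)
{
    return a + b + c;
}
```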

Glad to get that off my chest.  If you agree to this, I'll be
back asking to re-indent the code with 4-space tabs! :-)

Eric W. Sink, Software Engineer --  eric@spyglass.com 217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.
        "Only academic people put cheese in their pocket."
            -SW, 24 May 1994


Return-Path: <connolly@hal.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA08081; Wed, 13 Jul 1994 20:56:45 +0200
Received: from hal.hal.COM by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA19981; Wed, 13 Jul 1994 20:56:56 +0200
Received: from ulua.hal.com by hal.com (4.1/SMI-4.1.1)
	id AA04685; Wed, 13 Jul 94 11:56:36 PDT
Received: from localhost by ulua.hal.com (4.1/SMI-4.1.2)
	id AA17472; Wed, 13 Jul 94 13:57:02 CDT
Message-Id: <9407131857.AA17472@ulua.hal.com>
Cc: Multiple recipients of list <www-lib@www0.cern.ch>
Subject: Agree: Require ANSI C for development [Was: libwww ]
In-Reply-To: Your message of "Wed, 13 Jul 1994 20:29:03 +0200."
             <9407131826.AA10464@hook.spyglass.com> 
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Id: <17470.774125821.1@ulua>
Date: Wed, 13 Jul 1994 13:57:02 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
content-length: 3091

In message <9407131826.AA10464@hook.spyglass.com>, Eric W. Sink writes:
>
>Can we get rid of all those ARGS3 K&R compatibility macros?
>This is purely a style thing.  The only real difficulty they
>cause is that editors which parse C syntax in looking for functions
>can't grok the funky macros.  But how important is it REALLY that
>libwww support non-ANSI C compilers?

Amen and Halleluia!

Look at it this way: we've got different classes of consumers for
libWWW:

	* lib developers: folks like ourselves who can contribute to the
	  library. I think it's safe to assume that
	  anybody in this class has an ANSI C development environment.

	* builders: folks who just want to compile the library and apps
	  based on it on various platforms and provide binary
	  distributions of the library. These folks need an ANSI C
	  compiler. (And PGP or RIPEM to sign the binary distribution
	  so we can prevent viruses...)

	* app developers: folks who just want to use the library to
	  build specialized apps. These folks shouldn't be required
	  to have an ANSI C compiler. If they can get a binary library
	  distribution, the public header files should work
	  even on broken compilers.

	* administrators: folks that support users by installing apps
	  and documentation, answering questions, etc.
	  These folks shouldn't need a compiler at all.

	* users: folks that want to surf and post. These folks don't
	  know what a compiler is.

The explosion in user base for the web is a result of the fact that
you don't have to be able to play the typical net-software game
of:
	* download the source
	* edit the makefiles
	* fix the dirent/direct BSD/SYSV stuff for your platform
	* build it
	* find a few bugs
	* fix them
	* install it in ~/bin

just to use the dang thing. For a given "product", there's a hierarchy:
zillions of users, supported by thousands of administrators who get
software from hundreds of sites, and that software is contributed by ten or
twenty developers.

Not only should we require an ANSI C compiler for development, but we
should write squeaky-clean ANSI C code, except for modules that need
POSIX features, in which case we should write squeaky-clean POSIX code.

For the non-standard interfaces (networking, etc.) we should use function
pointers and allow folks to substitute work-alikes WITHOUT RECOMPILING
THE LIBRARY!
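
A sketch of what I mean, with hypothetical names (W3NetOps, w3_set_net_ops and w3_send_line are illustrative only): the library calls through a table of function pointers, and the application installs the table at run time.

```c
#include <stddef.h>

/* Table of networking entry points supplied by the application. */
typedef struct {
    int (*net_read)(int sock, char *buf, int len);
    int (*net_write)(int sock, const char *buf, int len);
    int (*net_close)(int sock);
} W3NetOps;

static const W3NetOps *w3_net = NULL;

/* Install a work-alike; no recompilation of the library is needed. */
void w3_set_net_ops(const W3NetOps *ops)
{
    w3_net = ops;
}

/* Library code never calls write() directly, only through the table. */
int w3_send_line(int sock, const char *line, int len)
{
    if (w3_net == NULL || w3_net->net_write == NULL)
        return -1;
    return w3_net->net_write(sock, line, len);
}

/* A stand-in implementation that only counts bytes; a real one would
 * wrap the platform's own sockets interface. */
static long demo_bytes_written = 0;

static int demo_write(int sock, const char *buf, int len)
{
    (void)sock; (void)buf;
    demo_bytes_written += len;
    return len;
}

static const W3NetOps demo_ops = { NULL, demo_write, NULL };
```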

I spent a weekend cleaning up the 2.15 header files, and isolated all
the platform specific stuff. It wasn't that hard. We just need to
synchronize that sort of thing because it touches all the files,
practically.

For cases where the platform doesn't directly support ANSI C semantics,
we supplement the runtime for that platform with something that does
what the spec says. For example, in the case of malloc() causing memory
leaks in MS Windows, we provide a malloc() that, for example registers an
atexit() function and free()s everything when the program exits.
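
Here is a sketch of such an allocator.  It is written under separate, hypothetical names (w3_malloc, w3_free) because replacing malloc() itself is not something ANSI C lets you do portably.  Each block carries a small header linking it into a list, and the first allocation registers an atexit() handler that frees whatever is still on the list.

```c
#include <stdlib.h>

/* Header placed in front of every block; the extra union members are
 * there to keep the user data reasonably aligned. */
typedef union W3Header {
    struct { union W3Header *next, *prev; } link;
    double align_d;
    long   align_l;
} W3Header;

static W3Header *w3_head = NULL;
static int       w3_hooked = 0;
static long      w3_live = 0;

/* Registered with atexit(): release every block still outstanding. */
static void w3_free_all(void)
{
    while (w3_head != NULL) {
        W3Header *h = w3_head;
        w3_head = h->link.next;
        free(h);
    }
    w3_live = 0;
}

void *w3_malloc(size_t n)
{
    W3Header *h;

    if (!w3_hooked) {
        if (atexit(w3_free_all) != 0)
            return NULL;
        w3_hooked = 1;
    }
    if (n > (size_t)-1 - sizeof *h)
        return NULL;                       /* size would overflow */
    h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->link.prev = NULL;
    h->link.next = w3_head;
    if (w3_head != NULL)
        w3_head->link.prev = h;
    w3_head = h;
    w3_live++;
    return h + 1;                          /* user data follows the header */
}

void w3_free(void *p)
{
    W3Header *h;

    if (p == NULL)
        return;
    h = (W3Header *)p - 1;
    if (h->link.prev != NULL)
        h->link.prev->link.next = h->link.next;
    else
        w3_head = h->link.next;
    if (h->link.next != NULL)
        h->link.next->link.prev = h->link.prev;
    free(h);
    w3_live--;
}

/* Number of blocks allocated and not yet freed. */
long w3_live_blocks(void)
{
    return w3_live;
}
```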

>Glad to get that off my chest.  If you agree to this, I'll be
>back asking to re-indent the code with 4-space tabs! :-)

I'm in favor of this too, by the way!

Dan

Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA14455; Wed, 13 Jul 1994 21:18:01 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA22698; Wed, 13 Jul 1994 21:17:51 +0200
Date: Wed, 13 Jul 1994 21:17:50 +0200
Message-Id: <9407131917.AA22698@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: RE: Agree: Require ANSI C for development [Was: libwww ]
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
content-length: 1726

OK, If it's beat up on ARGSx and PARAMS time...


For Shen I first thought `yuk!' then I had second thoughts `yuk!'.

Then I wrote a C preprocessor that takes ANSI C and produces the ARGS
stuff and HTML headers and a few other bits. I'm a big fan of having
the headers generated from the C source. This means that headers for
structures are kept separately - but  I always do this anyway because I
use data models such as Gödel to build C structures etc from a more
abstract form, allow attachment of metadata etc etc...

On the MALLOC side. Can we do the job properly? Like have macros where the
TYPE of the malloced item is declared:-

bytes = MALLOC (char, 32)

Or better

STATUS = MALLOC (char, 32, &bytes)

This means that you can have metadata attachments which then means that 
hypertext coredumps can be produced... dead kool...
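
As a sketch of the two forms (the names W3_NEW, W3_ALLOC, W3_OK and W3_NOMEM are hypothetical; a C macro cannot take two and three arguments under one name, so the forms need different names):

```c
#include <stdlib.h>

#define W3_OK    0
#define W3_NOMEM 1

/* Form 1: the type is named in the call and the result is already cast. */
#define W3_NEW(type, count) \
    ((type *)malloc(sizeof(type) * (count)))

/* Form 2: the pointer comes back through 'out' and the macro itself
 * evaluates to a status code. */
#define W3_ALLOC(type, count, out) \
    ((*(out) = (type *)malloc(sizeof(type) * (count))) != NULL \
        ? W3_OK : W3_NOMEM)
```

Because the type name passes through the macro, a debugging build could stringize it and record it alongside the block, which is where the metadata would come from.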


I agree with Dan on the forcing WINDOZE to be proper... I always use the
macros BEGIN and END in my procedures to force handling of status info
transparently. So if I'm on VMS I can get proper VMS error codes and if I'm on 
UNIX they look well yukky. But you can then play games with the END 
macro... Stick some sort of label in and a return then becomes a jump to the
label and return... Can force a lot of cleanups that way.... 
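
Roughly like this, as a simplified sketch with hypothetical macro names (W3_BEGIN, W3_RETURN, W3_END), not the macros I actually use:

```c
#include <stdlib.h>
#include <string.h>

/* W3_BEGIN declares the status variable; W3_RETURN records a status and
 * jumps to the label planted by W3_END, which runs the cleanup once. */
#define W3_BEGIN         int w3_status = 0;
#define W3_RETURN(code)  do { w3_status = (code); goto w3_end; } while (0)
#define W3_END(cleanup)  w3_end: cleanup; return w3_status;

/* Example: every exit path goes through the same free(). */
int w3_measure(const char *text, long *out)
{
    char *copy = NULL;
    W3_BEGIN

    if (text == NULL || out == NULL)
        W3_RETURN(-1);
    copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        W3_RETURN(-2);
    strcpy(copy, text);
    *out = (long)strlen(copy);
    W3_RETURN(0);

    W3_END(free(copy))
}
```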


I also use macros to assert PRE and POST conditions....

PRE (WWW_an_error_code, x!=NULL);
PRE ....
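
A sketch of how such macros might be defined (the status codes and w3_string_length are made up for the example):

```c
#include <stddef.h>

#define WWW_OK             0
#define WWW_BAD_ARGUMENT   1
#define WWW_INTERNAL_ERROR 2

/* If the condition does not hold, leave the function with the code. */
#define PRE(code, cond)   do { if (!(cond)) return (code); } while (0)
#define POST(code, cond)  do { if (!(cond)) return (code); } while (0)

int w3_string_length(const char *x, int *len_out)
{
    int n = 0;

    PRE(WWW_BAD_ARGUMENT, x != NULL);
    PRE(WWW_BAD_ARGUMENT, len_out != NULL);

    while (x[n] != '\0')
        n++;

    POST(WWW_INTERNAL_ERROR, n >= 0);
    *len_out = n;
    return WWW_OK;
}
```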

>Not only should we require an ANSI C compiler for development, but we
>should write squeaky-clean ANSI C code, except for modules that need
>POSIX features, in which case we should write squeaky-clean POSIX code.

I don't like using gcc for development for this very reason.. I get many more 
error messages from the DEC-C compiler on VMS...



Phill

Return-Path: <eric@spyglass.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA16165; Wed, 13 Jul 1994 21:23:24 +0200
Received: from spyglass.com by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA23312; Wed, 13 Jul 1994 21:23:33 +0200
Received: by spyglass.com (5.57/3.1.090690-Spyglass)
	id AA10382; Wed, 13 Jul 94 14:24:08 -0500
Received: by hook.spyglass.com (5.57/3.1.090690-Spyglass)
	id AA10774; Wed, 13 Jul 94 14:24:06 -0500
Message-Id: <9407131924.AA10774@hook.spyglass.com>
X-Sender: eric@hook
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 13 Jul 1994 14:23:50 -0600
To: www-lib@www0.cern.ch
From: eric@spyglass.com (Eric W. Sink)
Subject: Re: Agree: Require ANSI C for development [Was: libwww ]
content-length: 2758


>Amen and Halleluia!

Wow!  Dan's religious streak has surfaced!

>Look at it this way: we've got different classes of consumers for
>libWWW:
>
>        * app developers: folks who just want to use the library to
>          build specialized apps. These folks shouldn't be required
>          to have an ANSI C compiler. If they can get a binary library
>          distribution, the public header files should work
>          even on broken compilers.

I think they *should* be required to have an ANSI C compiler.  They
probably won't survive with just a binary distribution.  Besides, ansi2knr
converters are plentiful.  This is not an important point, I just wanted
to disagree with something. :-)

>Not only should we require an ANSI C compiler for development, but we
>should write squeaky-clean ANSI C code, except for modules that need
>POSIX features, in which case we should write squeaky-clean POSIX code.

Amen!  Preach Brother!  I've been kind of fanatical in my libwww work about
making sure it compiles under Visual C's warning level 3 with no warnings.
Our hacked library also compiles under CodeWarrior on the Mac, which is kind of
shocking.  CodeWarrior gives compiler warnings if the programmer is sitting
with bad posture.

>For the non-standard interfaces (networking, etc.) we should use function
>pointers and allow folks to substitute work-alikes WITHOUT RECOMPILING
>THE LIBRARY!

Oh, I don't know that we need to go that far, but the spirit of your
stance is consistent with mine.

>For cases where the platform doesn't directly support ANSI C semantics,
>we supplement the runtime for that platform with something that does
>what the spec says. For example, in the case of malloc() causing memory
>leaks in MS Windows, we provide a malloc() that, for example registers an
>atexit() function and free()s everything when the program exits.

Disagree.  Change the name of the function.  If I see malloc() in code,
I assume that it does the exact same thing ANSI says it does.  Nothing
more.  If I see W3_MALLOC in code, I assume that someone needed a malloc()
which did a little more than ANSI says it does, so they wrote one.  Since
that function name is all upper case, I also assume it's a preprocessor
macro, and that on some platforms, it may just define back to malloc().

>>Glad to get that off my chest.  If you agree to this, I'll be
>>back asking to re-indent the code with 4-space tabs! :-)
>
>I'm in favor of this too, by the way!

Good.  It's a movement that's gathering momentum. :-)


Eric W. Sink, Software Engineer --  eric@spyglass.com 217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.
        "Only academic people put cheese in their pocket."
            -SW, 24 May 1994


Return-Path: <connolly@hal.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA17755; Wed, 13 Jul 1994 21:28:32 +0200
Received: from hal.hal.COM by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA24221; Wed, 13 Jul 1994 21:28:29 +0200
Received: from ulua.hal.com by hal.com (4.1/SMI-4.1.1)
	id AA05328; Wed, 13 Jul 94 12:28:08 PDT
Received: from localhost by ulua.hal.com (4.1/SMI-4.1.2)
	id AA17526; Wed, 13 Jul 94 14:28:33 CDT
Message-Id: <9407131928.AA17526@ulua.hal.com>
To: hallam@alws.cern.ch
Cc: Multiple recipients of list <www-lib@www0.cern.ch>, connolly@hal.com
Subject: Re: Agree: Require ANSI C for development [Was: libwww ] 
In-Reply-To: Your message of "Wed, 13 Jul 1994 21:19:02 +0200."
             <9407131917.AA22698@dxmint.cern.ch> 
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Id: <17524.774127712.1@ulua>
Date: Wed, 13 Jul 1994 14:28:33 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
content-length: 836

In message <9407131917.AA22698@dxmint.cern.ch>, hallam@alws.cern.ch writes:
>
>>Not only should we require an ANSI C compiler for development, but we
>>should write squeaky-clean ANSI C code, except for modules that need
>>POSIX features, in which case we should write squeaky-clean POSIX code.
>
>I don't like using gcc for development for this very reason.. I get many more 
>error messages from the DEC-C compiler on VMS...

To get gcc to act as a strictly conforming ANSI C compiler, invoke it as:

	gcc -ansi -pedantic-errors

My personal favorite is:

	gcc -ansi -pedantic-errors -O -g -Wall

The combination of -O and -Wall will produce some very interesting warnings:
	* variable used before initialized
	* variable not used at all
	* implicitly declared function (i.e. you forgot <stdio.h> or some such)
	* pointer abuse
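
As a small illustration of the list above, here is a function that should
compile cleanly with those options; the comments mark where dropping a line
would be expected to draw one of the warnings. The function name is made up
for the example.

```c
#include <string.h>   /* drop this and strlen() is implicitly declared */

/* Add up the lengths of a NULL-terminated array of strings. */
size_t w3_total_length(const char **words)
{
    size_t total = 0;  /* without "= 0" this is used before initialized */
    size_t i;

    /* An extra local that is declared but never read would be
       reported as unused. */
    for (i = 0; words[i] != NULL; i++)
        total += strlen(words[i]);
    return total;
}
```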

Dan

Return-Path: <eric@spyglass.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA20590; Wed, 13 Jul 1994 21:39:43 +0200
Received: from spyglass.com by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA25661; Wed, 13 Jul 1994 21:39:09 +0200
Received: by spyglass.com (5.57/3.1.090690-Spyglass)
	id AA10453; Wed, 13 Jul 94 14:39:55 -0500
Received: by hook.spyglass.com (5.57/3.1.090690-Spyglass)
	id AA10831; Wed, 13 Jul 94 14:39:53 -0500
Message-Id: <9407131939.AA10831@hook.spyglass.com>
X-Sender: eric@hook
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 13 Jul 1994 14:39:37 -0600
To: www-lib@www0.cern.ch
From: eric@spyglass.com (Eric W. Sink)
Subject: RE: Agree: Require ANSI C for development [Was: libwww ]
content-length: 3206

>OK, If it's beat up on ARGSx and PARAMS time...
>
>
>For Shen I first thought `yuk!' then I had second thoughts `yuk!'.
>
>Then I wrote a C preprocessor that takes ANSI C and produces the ARGS
>stuff and HTML headers and a few other bits. I'm a big fan of having
>the headers generated from the C source. This means that headers for
>structures are kept separately - but  I always do this anyway because I
>use data models such as Gödel to build C structures etc from a more
>abstract form, allow attachment of metadata etc etc...

Oooh, I don't think I'd *really* want my headers generated from
my C code.  Though, I can't say that I've tried it.

>On the MALLOC side. Can we do the job properly? Like have macros where the
>TYPE of the malloced item is declared:-
>
>bytes = MALLOC (char, 32)
>
>Or better
>
>STATUS = MALLOC (char, 32, &bytes)
>
>This means that you can have metadata attachments which then means that
>hypertext coredumps can be produced... dead kool...

I'd rather keep this real simple.  I never call malloc without a cast anyway.
Just a nice simple, obvious #define symbol which is #defined to malloc under
UNIX would suit me.  I'll check its result for NULL just like always.

>I agree with Dan on the forcing WINDOZE to be proper... I always use the
>macros BEGIN and END in my procedures to force handling of status info
>transparently. So if I'm on VMS I can get proper VMS error codes and if I'm on
>UNIX they look well yukky. But you can then play games with the END
>macro... Stick some sort of label in and a return then becomes a jump to the
>label and return... Can force a lot of cleanups that way....

Well, first of all it's not Windows, it's the Mac.  Windows has a fine
malloc implementation, and I use it all the time.  The fact that Windows
doesn't free memory on process exit is not the fault of malloc, and
malloc is not the place to fix it.  Just free whatever you allocate.

The problem is the Mac.  There is no memory allocation function in any
Mac compiler which is reliable and is also named malloc().  I don't
want to write one.  I also don't want any function I write to have the same
name as an ANSI function unless I go to the effort to make *very sure*
that my implementation is ANSI compliant.  I don't have the time or
motivation to do that.  #define W3_MALLOC solves the whole problem.
If it creates other problems, then maybe we can find another solution,
but I'd sure like it to be a simple one.

>>Not only should we require an ANSI C compiler for development, but we
>>should write squeaky-clean ANSI C code, except for modules that need
>>POSIX features, in which case we should write squeaky-clean POSIX code.
>
>I don't like using gcc for development for this very reason.. I get many more
>error messages from the DEC-C compiler on VMS...

gcc -pedantic -whatever_the_option_is_to_turn_on_ALL_the_warnings

It's almost like using lint.  I've never used DEC's VMS C compiler, but
their Ultrix compiler is awful.

Eric W. Sink, Software Engineer --  eric@spyglass.com 217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.
        "Only academic people put cheese in their pocket."
            -SW, 24 May 1994


Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA26404; Wed, 13 Jul 1994 22:00:11 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA29137; Wed, 13 Jul 1994 22:00:28 +0200
Date: Wed, 13 Jul 1994 22:00:28 +0200
Message-Id: <9407132000.AA29137@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: RE: Agree: Require ANSI C for development [Was: libwww ]
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
content-length: 1596

It's not a question of keeping things simple, you can do an awful
lot more if you know the type of the thing you are mallocing. 

For example 

- intelligent memory allocation/ release, reusing old blocks from a heap.
- interface to data model with automatic initialisation
- use metadata to automatically free all dependent structures,
	do pretty printing etc.

AND you can also automate the cast!

#define W3_MALLOC(type,number) ((type *) malloc (sizeof(type) * (number)))

So in code you get:-

new = W3_MALLOC (char, 1024);
new = (char*) malloc (sizeof(char)*1024);


So you remove the possibility of an error mistyping char. Plus if you have 
a super duper memory allocator you can still use it!


On the status value:-

The problem with returning NULL is that you can't know WHY the malloc 
failed. It may be because there is no memory left at all in the system 
or because a prearranged limit has been reached which could be increased
at user discretion - eg if you have a cache system where storing the data
is optional...

Depending on the situation the user may be able to complete the operation by
flushing the cache or there may simply not be enough memory. Compressing
error return codes to binary success/fail values was a very bad idea
in UNIX.


In fact I would like to suggest that we move to a system where EVERY routine
returns a status code value. This could either be a simple integer or a
pointer to a structure (more macros, should be a choice!). This means that
you always know how to expect the status code and not have a mish mash of
different status conventions.
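
The status-code convention can be sketched like this. It is only an
illustration: the W3Status values, the soft limit and the function names
are assumptions made for the example, not an existing libwww interface.
The point is that the caller can tell "the system is out of memory" apart
from "a prearranged limit was reached".

```c
#include <stdlib.h>

/* Hypothetical status codes shared by every routine. */
typedef enum {
    W3_OK = 0,
    W3_ERR_NO_MEMORY,      /* the system has nothing left to give */
    W3_ERR_LIMIT_REACHED   /* soft limit hit; caller may flush a cache */
} W3Status;

static size_t w3_limit = 4096;   /* hypothetical soft limit, in bytes */
static size_t w3_in_use = 0;     /* bytes currently handed out */

/* Allocate size bytes; the result comes back through *out and the
   return value says why the call failed, if it did. */
W3Status w3_alloc(size_t size, void **out)
{
    void *p;

    *out = NULL;
    if (size > w3_limit - w3_in_use)
        return W3_ERR_LIMIT_REACHED;
    p = malloc(size);
    if (p == NULL)
        return W3_ERR_NO_MEMORY;
    w3_in_use += size;
    *out = p;
    return W3_OK;
}

/* Give a block back; the caller supplies the size it asked for. */
void w3_release(void *p, size_t size)
{
    if (p != NULL) {
        free(p);
        w3_in_use -= size;
    }
}
```

On W3_ERR_LIMIT_REACHED a caller could flush its cache and retry, whereas
W3_ERR_NO_MEMORY leaves little to do but give up.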


Phill
Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA27745; Wed, 13 Jul 1994 22:06:09 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA29675; Wed, 13 Jul 1994 22:06:24 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HENUA7Q5YO8WYJ2B@KUHUB.CC.UKANS.EDU>; Wed, 13 Jul 1994 15:05:57 CDT
Received: by falcon.cc.ukans.edu; id AA23891; Wed, 13 Jul 1994 15:06:04 -0500
Date: Wed, 13 Jul 1994 15:06:04 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Re: libwww
In-Reply-To: <9407131950.AA10886@hook.spyglass.com>
To: www-lib@www0.cern.ch
Message-Id: <Pine.3.89.9407131437.B21941-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 2142

On Wed, 13 Jul 1994, Eric W. Sink wrote:
> Hi Garrett, thanks for jumping into this mess.  It does not appear that
> your message went to the list, BTW.
> 
> >NO!  I definitely disagree with the first of these requests.  There are
> >many, many machines out there with K&R, and dumping that will bring me
> >such pain as to increase Tylenol stock by 15 points.  Hell, you might as
> >well ask if it should be done in C++ (see lynx-dev).
> 
> Hmmm.  That's a bummer.  About how many Lynx users do you have?  Some
> free software packages provide one of those ANSI <-> K&R converters
> just for this purpose.  Any chance that would work for you?

You're right about this.  It'll have to be up to the users to have this 
converter though as I can't be providing it for every possible platform 
that Lynx will eventually see the light of day on.

One thing I've noticed, is that the ARGS macros don't take a pointer to a 
function very well, or a variable number of arguments.
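
For readers who have not met them, macros of this kind look roughly like
the simplified reconstruction below (not copied from the libwww headers).
Each argument is passed as a separate type and name so that a K&R
definition can be generated, which is why a raw function-pointer
declarator, whose name sits in the middle of its type, does not fit; a
typedef is one workaround. W3Callback and the w3_ functions are made up
for the example.

```c
#ifdef __STDC__
/* ANSI compilers see ordinary prototypes. */
#define PARAMS(list) list
#define ARGS1(t1, a1) (t1 a1)
#define ARGS2(t1, a1, t2, a2) (t1 a1, t2 a2)
#else
/* K&R compilers see old-style declarations and definitions. */
#define PARAMS(list) ()
#define ARGS1(t1, a1) (a1) t1 a1;
#define ARGS2(t1, a1, t2, a2) (a1, a2) t1 a1; t2 a2;
#endif

/* A plain function: type and name split cleanly. */
int w3_add PARAMS((int a, int b));

int w3_add ARGS2(int, a, int, b)
{
    return a + b;
}

/* "int (*cb)(int)" cannot be written as a (type, name) pair,
   so the pointer type is hidden behind a typedef first. */
typedef int (*W3Callback) PARAMS((int));

int w3_negate ARGS1(int, x)
{
    return -x;
}

int w3_apply ARGS2(W3Callback, cb, int, x)
{
    return cb(x);
}
```

A variable number of arguments has no such workaround, since there is no
fixed list of names to hand to the macro.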

Someone please shoot me, but what about the very old machines running the 
line mode browser, like IBM/370's or whatever.  Do they have such a 
converter?  They will not like being left behind.

> >As for the 4 space tabs....  I hate it.  Truly.  I want a normal tab
> >character left up to my editor for interpretation.
> 
> I may not have been clear.  Indent the code such that every indent
> is exactly one tab.  A configurable editor can choose to display
> those tabs expanded to 8 spaces or 3 or 4 or whatever.  But, for
> places where things need to line up with the thing above, and
> tabs are used for positioning, then you have to make some assumption
> about what tabs will expand to if you want it ever to line up.
> In this situation, assume that a tab will expand to 4 spaces.

Okay, looks like there is some headway into defining the conventions that 
we will be modifying the libwww in.

Garrett.

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <eric@spyglass.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA06364; Wed, 13 Jul 1994 22:36:14 +0200
Received: from spyglass.com by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA01784; Wed, 13 Jul 1994 22:36:25 +0200
Received: by spyglass.com (5.57/3.1.090690-Spyglass)
	id AA10757; Wed, 13 Jul 94 15:37:01 -0500
Received: by hook.spyglass.com (5.57/3.1.090690-Spyglass)
	id AA11143; Wed, 13 Jul 94 15:36:58 -0500
Message-Id: <9407132036.AA11143@hook.spyglass.com>
X-Sender: eric@hook
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Wed, 13 Jul 1994 15:36:43 -0600
To: www-lib@www0.cern.ch
From: eric@spyglass.com (Eric W. Sink)
Subject: RE: Agree: Require ANSI C for development [Was: libwww ]
content-length: 1118

>It's not a question of keeping things simple,

Of course it is!

>you can do an awfull
>lot more if you know the type of the thing you are mallocing.

Granted, but I don't want to do an awful lot more.

[ rather convincing argument deleted ]

>In fact I would like to suggest that we move to a system where EVERY routine
>returns a status code value. This could either be a simple integer or a
>pointer to a structure (more macros, should be a choice!). This means that
>you always know how to expect the status code and not have a mish mash of
>different status conventions.

OK, I'll accept your idea.  After all, I can't expect that others will
make *all* the compromises.  In general, returning status code values is
a Good Idea which is kind of hard to argue with, and your suggested W3_MALLOC
macro which has the type built in looks quite livable.

But it's not "simple" :-)

Eric W. Sink, Software Engineer --  eric@spyglass.com 217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.
        "Only academic people put cheese in their pocket."
            -SW, 24 May 1994


Return-Path: <connolly@hal.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA08171; Wed, 13 Jul 1994 22:41:10 +0200
Received: from hal.hal.COM by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA02188; Wed, 13 Jul 1994 22:41:22 +0200
Received: from ulua.hal.com by hal.com (4.1/SMI-4.1.1)
	id AA07628; Wed, 13 Jul 94 13:41:00 PDT
Received: from localhost by ulua.hal.com (4.1/SMI-4.1.2)
	id AA17568; Wed, 13 Jul 94 15:41:25 CDT
Message-Id: <9407132041.AA17568@ulua.hal.com>
To: doslynx@falcon.cc.ukans.edu
Cc: Multiple recipients of list <www-lib@www0.cern.ch>, connolly@hal.com
Subject: Replace K&R source dist. with binary dist [Was: libwww ]
In-Reply-To: Your message of "Wed, 13 Jul 1994 22:23:40 +0200."
             <Pine.3.89.9407131437.B21941-0100000@falcon.cc.ukans.edu> 
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Id: <17566.774132084.1@ulua>
Date: Wed, 13 Jul 1994 15:41:25 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
content-length: 1071

In message <Pine.3.89.9407131437.B21941-0100000@falcon.cc.ukans.edu>, Garrett Arch Blythe writes:
>On Wed, 13 Jul 1994, Eric W. Sink wrote:
>> Hi Garrett, thanks for jumping into this mess.  It does not appear that
>> your message went to the list, BTW.
>> 
>> >NO!  I definitely disagree with the first of these requests.  There are
>> >many, many machines out there with K&R, and dumping that will bring me
>> >such pain as to increase Tylenol stock by 15 points.  Hell, you might as
>> >well ask if it should be done in C++ (see lynx-dev).

But how many of the users on these machines with K&R only are developing
libwww software?

The rest of the users are just that -- users, and they'll be much happier
with a compiled binary distribution than a K&R C source distribution,
I wager.

The big issue with binary distribution is viruses. To alleviate this
problem, I suggest that anybody who provides a binary distribution
should use MD5 checksums -- signed with a public key, if possible --
to allow end users to detect corrupted/forged/modified distributions.

Dan

Return-Path: <connolly@hal.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA05603; Thu, 14 Jul 1994 04:22:07 +0200
Received: from hal.hal.COM by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA27469; Thu, 14 Jul 1994 04:22:24 +0200
Received: from ulua.hal.com by hal.com (4.1/SMI-4.1.1)
	id AA15844; Wed, 13 Jul 94 19:22:02 PDT
Received: from localhost by ulua.hal.com (4.1/SMI-4.1.2)
	id AA17873; Wed, 13 Jul 94 21:22:27 CDT
Message-Id: <9407140222.AA17873@ulua.hal.com>
To: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Cc: www-lib@www0.cern.ch
Cc: "Daniel W. Connolly" <connolly@hal.com>, connolly@hal.com
Subject: Re: Replace K&R source dist. with binary dist [Was: libwww ] 
In-Reply-To: Your message of "Wed, 13 Jul 1994 16:11:15 CDT."
             <Pine.3.89.9407131652.B25291-0100000@falcon.cc.ukans.edu> 
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Id: <17871.774152546.1@ulua>
Date: Wed, 13 Jul 1994 21:22:27 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
content-length: 2017

[Again, your message didn't go to the list...]

In message <Pine.3.89.9407131652.B25291-0100000@falcon.cc.ukans.edu>, Garrett Arch Blythe writes:
>On Wed, 13 Jul 1994, Daniel W. Connolly wrote:
>> 
>> The rest of the users are just that -- users, and they'll be much happier
>> with a compiled binary distribution than a K&R C source distribution,
>> I wager.
>
>This is not going to work.  Lynx as it stands has about a kazillion 
>different little ifdefs that Lou left me with, so each compiled binary is 
>actually different than perhaps the one that I use for debugging which 
>has everything turned on.  The reason why the configuration is at 
>compile time is to leave out some security problems that some people 
>simply don't want to deal with, ever.

Shame on Lou. #ifdef's are evil. But certainly 80-95% of the world's
lynx users are using the same combination of #ifdefs anyway.

I expect that two or three compiled versions of the code would suffice
for the vast majority of the user population: one with all the gaping
security holes wide open, one with all of them turned off, and one
somewhere in between.

>Precompiled binaries suck.  They don't give the person installing full 
>control of the options provided (for security) and they can't set it up 
>specifically for their system.

These are not insurmountable problems. A well designed product has
all the installation-time and run-time switches needed by its customer
base.

Folks who aren't willing to wait for supported configurations can
get the source and build it, of course, but they'll need an ANSI C
compiler.

I think this is the best way to provide quality software to the
largest audience. Witness the explosion of Linux.

>I also don't have access to all the machines that I would need to provide 
>a Lynx binary for, which is every UNIX/VMS/DOS box on the globe.

But: do you have access to _one_ person using each platform who has an
ANSI C compiler and would be willing to provide binary distributions?
That's all you need.

Dan

Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA07729; Thu, 14 Jul 1994 04:53:20 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA28692; Thu, 14 Jul 1994 04:53:37 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEO8I3XLZ48WYRZT@KUHUB.CC.UKANS.EDU>; Wed, 13 Jul 1994 21:53:11 CDT
Received: by falcon.cc.ukans.edu; id AA08069; Wed, 13 Jul 1994 21:53:18 -0500
Date: Wed, 13 Jul 1994 21:53:17 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Re: Replace K&R source dist. with binary dist [Was: libwww ]
In-Reply-To: <9407140222.AA17873@ulua.hal.com>
To: "Daniel W. Connolly" <connolly@hal.com>
Cc: Multiple recipients of list <www-lib@www0.cern.ch>,
        lynx-dev@ukanaix.cc.ukans.edu
Message-Id: <Pine.3.89.9407132136.B7338-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 3979

Hi all,
	Lynx-devers, this is for you also, we're talking about the 
language choice (full ANSI C or continue supporting K&R C too) to be used in 
libwww (CERN's library for the WWW which Lynx uses); to be used in light of 
our C++ discussion (RE: lynx future.....)


On Thu, 14 Jul 1994, Daniel W. Connolly wrote:
> In message <Pine.3.89.9407131652.B25291-0100000@falcon.cc.ukans.edu>, Garrett Arch Blythe writes:
> >On Wed, 13 Jul 1994, Daniel W. Connolly wrote:
> >> 
> >> The rest of the users are just that -- users, and they'll be much happier
> >> with a compiled binary distribution than a K&R C source distribution,
> >> I wager.
> >
> >This is not going to work.  Lynx as it stands has about a kazillion 
> >different little ifdefs that Lou left me with, so each compiled binary is 
> >actually different than perhaps the one that I use for debugging which 
> >has everything turned on.  The reason why the configuration is at 
> >compile time is to leave out some security problems that some people 
> >simply don't want to deal with, ever.
> 
> Shame on Lou. #ifdef's are evil. But certainly 80-95% of the world's
> lynx users are using the same combination of #ifdefs anyway.

Right.  I'd assume that they are.  The vanilla build is most generally used.

> I expect that two or three compiled versions of the code would suffice
> for the vast majority of the user population: one with all the gaping
> security holes wide open, one with all of them turned off, and one
> somewhere in between.

Probably; the details are not worth explaining.  Your point is obvious.

> >Precompiled binaries suck.  They don't give the person installing full 
> >control of the options provided (for security) and they can't set it up 
> >specifically for their system.
> 
> These are not insurmountable problems. A well designed product has
> all the installation-time and run-time switches needed by its customer
> base.

Yes, they aren't insurmountable.  What can I say?  I don't want ANSI C to 
cause my users less grief; as I said some only have K&R C; Many of 
the people I talk to daily are interested in the source and not the 
precompiled binary.  This is not such a bad thing.

We as developers want ANSI C to clean up the code/make it easier to read;
a very very good thing. 

I want to hear a clear decision about whether we should be using ANSI C
and stop supporting K&R C in libwww. I want to know that libwww is
going to be in ANSI C so that I can stop worrying about the ARGS macros
and start coding ANSI C myself. 


And mainly for the Lynx-devers:

> Folks who aren't willing to wait for supported configurations can
> get the source and build it, of course, but they'll need an ANSI C
> compiler.
> 
> I think this is the best way to provide quality software to the
> largest audience. Witness the explosion of Linux.
> 
> >I also don't have access to all the machines that I would need to provide 
> >a Lynx binary for, which is every UNIX/VMS/DOS box on the globe.
> 
> But: do you have access to _one_ person using each platform who has an
> ANSI C compiler and would be willing to provide binary distributions?
> That's all you need.

Okay, truth be known, I want parts of Lynx in C++, and the decision you are
going to pass down regarding ANSI C may be a big factor in my own
decision.

If I have to go ahead and make contacts with people to make me pre 
compiled binaries just for ANSI C, why don't I just go all the way and 
find people to make C++ binaries for me for the people without a C++ 
compiler?

Many of my users don't have ANSI C; many more of them also don't have C++.  
But I am sure one of them has a C++ compiler for every different 
OS/Version.  Any takers?

Garrett.


Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA06957; Thu, 14 Jul 1994 10:34:41 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA22729; Thu, 14 Jul 1994 10:34:57 +0200
Date: Thu, 14 Jul 1994 10:34:57 +0200
Message-Id: <9407140834.AA22729@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: RE: Replace K&R source dist. with binary dist [Was: libwww ]
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
content-length: 773

>The big issue with binary distribution is viruses. To alleviate this
>problem, I suggest that anybody who provides a binary distribution
>should use MD5 checksums -- signed with a public key, if possible --
>to allow end users to detect corrupted/forged/modified distributions.

Would people believe me if I promised this Real Soon Now (TM)?

Just having a bit of hassle rewriting some stuff to make the source code a bit 
manageable but digital signatures for binary distribution should be here soon.


Question is how to declare them? I first thought that the best way would be
to simply MD5 the file in transit. This is a bad move since any tampering
will take place before distribution...

Then I thought of adding it into the URC file... better perhaps... ?


Phill

Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA08170; Thu, 14 Jul 1994 10:49:35 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA24401; Thu, 14 Jul 1994 10:49:51 +0200
Date: Thu, 14 Jul 1994 10:49:51 +0200
Message-Id: <9407140849.AA24401@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: RE: Agree: Require ANSI C for development [Was: libwww ]
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
content-length: 1169

>
>OK, I'll accept your idea.  After all, I can't expect that others will
>make *all* the compromises.  In general, returning status code values is
>a Good Idea which is kind of hard to argue with, and your suggested W3_MALLOC
>macro which has the type built in looks quite livable.
>
>But it's not "simple" :-)

There is generally a choice with simplicity. Over simplify one area and you
complicate others. Having a common standard for the API error codes is in
my view simpler because then you never have to think about how the errors
are returned.

It means that you can't write code that is quite as compact. But this tends to 
work out an advantage. The coder can't bury a procedure that can possibly fail 
in the middle of a function and end up with something that can't handle errors. 
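
A rough sketch of what that looks like in practice, assuming a hypothetical
W3Status type and made-up helper names: every call that can fail hands back
a status, the caller checks it straight away, and a single exit label does
the cleanup.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; not an existing libwww type. */
typedef enum { W3_OK = 0, W3_ERR_NO_MEMORY, W3_ERR_BAD_ARG } W3Status;

/* Copy src into freshly allocated storage returned through *out. */
static W3Status w3_copy(const char *src, char **out)
{
    *out = NULL;
    if (src == NULL)
        return W3_ERR_BAD_ARG;
    *out = (char *) malloc(strlen(src) + 1);
    if (*out == NULL)
        return W3_ERR_NO_MEMORY;
    strcpy(*out, src);
    return W3_OK;
}

/* Join two strings; each step that can fail is checked on its own
   line, so nothing fallible is buried inside a larger expression. */
W3Status w3_join(const char *a, const char *b, char **out)
{
    W3Status st;
    char *left = NULL;
    char *right = NULL;
    char *joined = NULL;

    *out = NULL;
    st = w3_copy(a, &left);
    if (st != W3_OK)
        goto done;
    st = w3_copy(b, &right);
    if (st != W3_OK)
        goto done;
    joined = (char *) malloc(strlen(left) + strlen(right) + 1);
    if (joined == NULL) {
        st = W3_ERR_NO_MEMORY;
        goto done;
    }
    strcpy(joined, left);
    strcat(joined, right);
    *out = joined;

done:
    /* One exit point frees the temporaries on every path. */
    free(left);
    free(right);
    return st;
}
```

The code is longer than the compact version would be, but every failure
path is visible and every path releases what it allocated.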

`Compact' code is one of those things that everyone thinks is great when they do 
it themselves but less great when they are coping with other people's 
obfusticated C. It's an irregular verb:-

I compact code
You oddly code
He obfusticates code
....

There are a number of C constructs I would really like to kill kill kill....
Mumble..  Anyone ever tried Occam2 ?


Phill.

Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA11253; Thu, 14 Jul 1994 11:17:22 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA28367; Thu, 14 Jul 1994 11:17:19 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA03696; Thu, 14 Jul 94 11:16:03 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA21139; Thu, 14 Jul 94 11:16:03 +0200
Date: Thu, 14 Jul 94 11:16:03 +0200
Message-Id: <9407140916.AA21139@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Re: errors don't come back as Content-type: text/html
content-length: 1789

Hi

Excuse me for moving the discussion to this list, but as you are both
subscribed and here I can give some more specific information.

> >        Not sure if this was best.   One annoying feature of Mosaic
> >is that it presumes things are HTML unless otherwise explicitly labeled.
> >Retrieve any number of plain-text items that aren't clearly labeled,
> >and blech!   Runtogethernonsense.   Maybe there should be some fall-back
> >processing when there's no content-type record.   It shouldn't be a really
> >complicated matter for Lynx, Mosaic, and friends to eyeball the first
> >line and look for the canonical "<html>" or other SGML-ish strings.
> >("<plaintext>" being a special case,  of course)   Maybe this feature
> >should be in the WWW Library code?

The version 2.16 of the library _does_ contain a `guess stream' module.
Look into the HTGuess.c module (or the description in HTGuess.html) and
you will find it.

This is currently used in the two functions that parse an input stream
from the network or a local file:

	HTParseFile()
	HTParseSocket()

> I agree.  I actually do not like Mosaic's behavior of assuming text/html
> unless told otherwise.  This gives lovely behavior when trying to access
> something like a file compressed with gzip.  XMosaic stuffs a bunch of
> gibberish onto the screen.
> 
> I think this has been discussed before.  Didn't it go like this? :
> 
>         1.  Obey the HTTP 1.0 content type
>         2.  Otherwise, check the suffix on the file and guess
>         3.  Otherwise, call HTSaveLocally (or equiv) to just put the
>                 file on local disk.

As I said on www-talk - I think it is a bad idea to treat unknown
content-types as text/html. The three steps above are the current
procedure.
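
The three steps can be sketched as a single decision function. This is an
illustration only, not the library's code: w3_pick_type() and the small
suffix table are invented for the example, and a NULL result stands for
"save the data to local disk".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical suffix table; a real one would be far longer. */
typedef struct {
    const char *suffix;
    const char *type;
} SuffixMap;

static const SuffixMap suffix_table[] = {
    { ".html", "text/html" },
    { ".txt",  "text/plain" },
    { ".gif",  "image/gif" }
};

/* Return the content type to use, or NULL when the data should
   simply be saved locally. */
const char *w3_pick_type(const char *header_type, const char *name)
{
    size_t i, nlen, slen;

    /* 1. Obey the content type sent by the server. */
    if (header_type != NULL && header_type[0] != '\0')
        return header_type;

    /* 2. Otherwise guess from the file suffix. */
    if (name != NULL) {
        nlen = strlen(name);
        for (i = 0; i < sizeof suffix_table / sizeof suffix_table[0]; i++) {
            slen = strlen(suffix_table[i].suffix);
            if (nlen >= slen &&
                strcmp(name + nlen - slen, suffix_table[i].suffix) == 0)
                return suffix_table[i].type;
        }
    }

    /* 3. Otherwise make no assumption: never fall back to text/html. */
    return NULL;
}
```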


-- cheers --

Henrik Frystyk
Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA14211; Thu, 14 Jul 1994 11:50:49 +0200
Received: from eunet.EU.net by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA02304; Thu, 14 Jul 1994 11:49:42 +0200
Received: from kuhub.cc.ukans.edu (kuhub.cc.ukans.edu [129.237.32.1]) by eunet.EU.net (8.6.8/8.6.4) with ESMTP id LAA08633 for <www-lib@www0.cern.ch>; Thu, 14 Jul 1994 11:48:48 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEOMYQEVO08WZ9V1@KUHUB.CC.UKANS.EDU>; Thu, 14 Jul 1994 04:47:27 CDT
Received: by falcon.cc.ukans.edu; id AA16921; Thu, 14 Jul 1994 04:47:34 -0500
Date: Thu, 14 Jul 1994 04:47:33 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Re: errors don't come back as Content-type: text/html
In-Reply-To: <9407140916.AA21139@ptsun03.cern.ch>
Sender: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
To: www-lib@www0.cern.ch
Reply-To: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Message-Id: <Pine.3.89.9407140431.C16575-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 498

Yo,
	Thanks for reminding me about HTGuess in the new stuff.  You've 
made my Lynx life a lot easier.
	Lynx is going to be using application/octet-stream on this final 
release before I begin the rewrite (and get to use HTGuess :).

Garrett.

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>


Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA06108; Thu, 14 Jul 1994 21:38:53 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA18250; Thu, 14 Jul 1994 21:38:51 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEP7LJ07HC8ZECB7@KUHUB.CC.UKANS.EDU>; Thu, 14 Jul 1994 14:38:30 CDT
Received: by falcon.cc.ukans.edu; id AA04601; Thu, 14 Jul 1994 14:38:33 -0500
Date: Thu, 14 Jul 1994 14:38:33 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Split libwww from client
To: www-lib@www0.cern.ch
Message-Id: <Pine.3.89.9407141408.A4043-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 1181

For those of us who distribute source,

I know this is going to be hard to swallow:

I don't think those of us that distribute source should be shipping our 
variant WWW libraries with it.
This practice is probably the major reason for the libwww 
fragmentation across clients.

Note that we all haven't hacked up the freeWAIS library that is 
supported; maybe because it's optional, maybe because we didn't go 
changing it to be specific to our client.

Adopting this policy would place the burden of providing client side 
hooks up to the libwww group.  Again, we come to this idea of system 
independent API, but now it is system/client independent API; this is 
already partially done via the GridText functions, but there are other 
places that a client likes to stick its nose in, too.

Not doing so will continue to support the idea that we can hack up our 
personal libwww just for our client.


Yours,
	Garrett.

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA03874; Mon, 18 Jul 1994 03:12:56 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA07159; Mon, 18 Jul 1994 03:13:18 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HETQ61IPVK8WW90B@KUHUB.CC.UKANS.EDU>; Sun, 17 Jul 1994 20:12:49 CDT
Received: by falcon.cc.ukans.edu; id AA14952; Sun, 17 Jul 1994 20:12:56 -0500
Date: Sun, 17 Jul 1994 20:12:56 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Forms in libwww 3.0
To: www-lib@www0.cern.ch
Message-Id: <Pine.3.89.9407172058.B13157-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 725

Hi all,
	Looked at the information out at CERN regarding the status of 
libwww.  Perhaps I missed it, but I read nothing about forms or their 
implementation inside of the libwww.  I am wondering if there is a 
standard forms API inside of libwww, or if there will be in the future.  
Also, I was wondering what level of HTML will the libwww support (along 
the same lines).

	Are these things considered to be a client implementation?

Thanks for any replies,
	Garrett.

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA29852; Tue, 19 Jul 1994 04:05:53 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA13090; Tue, 19 Jul 1994 04:05:53 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEV6AZNL8G8WXCA4@KUHUB.CC.UKANS.EDU>; Mon, 18 Jul 1994 21:05:44 CDT
Received: by falcon.cc.ukans.edu; id AA25579; Mon, 18 Jul 1994 21:05:50 -0500
Date: Mon, 18 Jul 1994 21:05:50 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: ANSI C
To: www-lib@www0.cern.ch
Message-Id: <Pine.3.89.9407182146.A25561-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 365

Was there ever a clear decision whether or not the libwww would be 
completely ANSI C in the future?

Garrett.

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA13632; Tue, 19 Jul 1994 16:56:04 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA26719; Tue, 19 Jul 1994 16:56:03 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA26095; Tue, 19 Jul 94 16:55:15 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA27417; Tue, 19 Jul 94 16:55:21 +0200
Date: Tue, 19 Jul 94 16:55:21 +0200
Message-Id: <9407191455.AA27417@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Re: Split libwww from client, ANSI C, and Forms in libwww 3.0
content-length: 4198

Hi

Some comments to the mails from Garrett:

> I don't think those of us that distribute source should be shipping our 
> variant WWW libraries with it.
> This practice is probably the major reason for the libwww 
> fragmentation across clients.
> 
> Note that we all haven't hacked up the freeWAIS library that is 
> supported; maybe because it's optional, maybe because we didn't go 
> changing it to be specific to our client.
> 
> Adopting this policy would place the burden of providing client side 
> hooks up to the libwww group.  Again, we come to this idea of system 
> independent API, but now it is system/client independent API; this is 
> already partially done via the GridText functions, but there are other 
> places that a client likes to stick its nose in, too.
> 
> Not doing so will continue to support the idea that we can hack up our 
> personal libwww just for our client.

I think this would be an ideal situation, but I doubt that it will happen
overnight. Many of the optimizations in the various library versions
are really good and I hope it is possible to combine them into a
powerful tool. This is one of the reasons for this mailing list!

BTW: I _have_ hacked the freeWAIS library in order to make it work on
Solaris ;-)

> Was there ever a clear decision whether or not the libwww would be 
> completely ANSI C in the future?

The CERN library does already use some ANSI C features that are not
supported by an old C compiler. Examples are void and enum. My feeling
is that it is not so much a question of ANSI C or not but to find a
minimum set of C that can be used on most platforms. Even ANSI C
compilers behave in funny ways! I don't think that it will ever be
possible to use fancy ANSI features like variable-length argument lists
etc. - they simply don't compile on many platforms.

I agree that the ARGS and PARAMS do look strange, but I think we should
make sure not to lose any platforms currently supported before we
remove them. Though, replacements _are_ welcome! Please look at the
page to see a list of platforms supported

	http://info.cern.ch/hypertext/WWW/Library/User/Platform.html

Furthermore SCO and MIPS are close to being added as well. So again - no
golden solution to the problem, but I think that it is important to
support a large set of platforms if it is going to be a consistent
library that people don't have to hack.
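
As an illustration of the PARAMS idiom discussed above, here is a minimal
sketch of the general technique: one header line that yields a full
prototype on ANSI compilers and an empty argument list on older ones. The
test on __STDC__ and the function name HTExample_add are assumptions made
for this example and are not copied from the Library's headers.

```c
/* Sketch only: the __STDC__ test and the name HTExample_add are
 * illustrative assumptions, not the Library's actual definitions. */
#if defined(__STDC__) || defined(__cplusplus)
#define PARAMS(args) args      /* ANSI compiler: keep the prototype */
#else
#define PARAMS(args) ()        /* pre-ANSI compiler: drop the argument list */
#endif

/* Note the double parentheses: the whole parameter list is passed to the
 * macro as a single argument. */
extern int HTExample_add PARAMS((int a, int b));

/* The definition is written ANSI-style for brevity; a pre-ANSI build
 * would need the old-style (K&R) form of the definition as well. */
int HTExample_add(int a, int b)
{
    return a + b;
}
```

The cost of the idiom is the unusual double-parenthesis syntax; the benefit
is that no supported compiler has to be dropped.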

> 	Looked at the information out at CERN regarding the status of 
> libwww.  Perhaps I missed it, but I read nothing about forms or their 
> implementation inside of the libwww.  I am wondering if there is a 
> standard forms API inside of libwww, or if there will be in the future.  
> Also, I was wondering what level of HTML will the libwww support (along 
> the same lines).
> 
> 	Are these things considered to be a client implementation?

Generally - the HTML side of the library hasn't been touched for a long
time.  We hope that this is going to change, as Hakon Lie has started
to look into it. That is also the reason for the lack of support for
POST.

About posting, I see two different sides of it:

	- The interface between the library and the clients
	- The HTTP implementation of PUT and POST

I have been working on a general posting interface that can be used not
only for HTTP, but also NNTP and SMTP (NNTP posting is almost
integrated but no SMTP support currently in the Library). As forms are
a special case of posting, it also goes for this. Please see the
specifications on

   http://info.cern.ch/hypertext/WWW/Library/User/Features/ClientPost.html

This is a sub document of the page

   http://info.cern.ch/hypertext/WWW/Library/User/Features/Implementation.html

I have also been looking into how it could be done more elegantly in the
HTTP protocol, see

   http://info.cern.ch/hypertext/WWW/Library/User/Features/HTTPPost.html

but this is not yet finished. Please give me your comments on the pages!

I have been working on an actual implementation of the posting
interface, but as I now have less than 2.5 weeks to finish my master's
thesis I have to do this instead :-(. My plan is to write my thesis on
the Web so what you are reading is actually a draft of it!

-- cheers --

Henrik



Return-Path: <connolly@hal.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA24535; Tue, 19 Jul 1994 17:33:29 +0200
Received: from hal.hal.COM by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA08932; Tue, 19 Jul 1994 17:33:28 +0200
Received: from ulua.hal.com by hal.com (4.1/SMI-4.1.1)
	id AA12653; Tue, 19 Jul 94 08:33:25 PDT
Received: from localhost by ulua.hal.com (4.1/SMI-4.1.2)
	id AA00977; Tue, 19 Jul 94 10:33:56 CDT
Message-Id: <9407191533.AA00977@ulua.hal.com>
To: frystyk@ptsun00.cern.ch
Cc: www-lib@www0.cern.ch
Subject: Re: Split libwww from client, ANSI C, and Forms in libwww 3.0 
In-Reply-To: Your message of "Tue, 19 Jul 1994 17:00:49 +0200."
             <9407191455.AA27417@ptsun03.cern.ch> 
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Id: <975.774632034.1@ulua>
Date: Tue, 19 Jul 1994 10:33:55 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
content-length: 3777

In message <9407191455.AA27417@ptsun03.cern.ch>, Henrik Frystyk Nielsen writes:
>
>The CERN library does already use some ANSI C features that are not
>supported by an old C compiler. Examples are void and enum. My feeling
>is that it is not so much a question of ANSI C or not but to find a
>minimum set of C that can be used on most platforms. Even ANSI C
>compilers behave in funny ways! I don't think that it will ever be
>possible to use fancy ANSI features like variable-length argument lists
>etc. - they simply don't compile on many platforms.

I don't think it's time for the "functionality freeze for porting
and integration" phase of libWWW development. The dang
thing has never had a decent design review! And there are no
regression tests.

There are some structural issues in the library that I'd like
to see resolved before we start saying "It can't change because
we'll break weird client XXX, YYY, and ZZZ."

I can't seem to find the time to integrate my changes into
the released libWWW, but...

	* Interaction between HTFormat and HTRequest, e.g.
	HTStreamStack: the www/present notion isn't sufficiently
	general. I suggest an alternative in which client apps
	register sets of
		(format, quality, cost, callback, callback_arg)
	for each of the types that they understand natively.
	It's hard to explain, but I've got the code working
	(in Python, but the idea will port to C).

	See http://www.hal.com/%7Econnolly/pywww/HTFormat.py

	We can get rid of all the www/* stuff: www/unknown,
	www/present. HTGuessStream becomes
	HTFormat_infer(filename, first_few_bytes) and gets
	called by the protocol module, if necessary.

	* There are a lot of unnecessary .h files included in
	lots of modules. e.g. every module includes HTUtils.h,
	which includes tcp.h, so every module depends on all
	sorts of weird networking #ifdefs and such even though
	it might be something like HTChunk.c which uses nothing
	outside the ANSI C standard library.

	* The protocol modules are still pretty monolithic. There
	should be an FTP.c module that just does FTP, and doesn't
	know anything about HTML, MIME, or anything else. The
	python library includes such a module in ftplib.py.
	On top of that, I built FTPLoad.py:

	See http://www.hal.com/%7Econnolly/pywww/FTPLoad.py

	(yes, it works with SOCKs. Python has a socket module,
	and I developed a work-alike module that uses SOCKS,
	and modified ftplib.py slightly to do things in
	a SOCKS-friendly order.)

	* The makefiles are AFU. Use Imake or GNU autoconf please.
	Do whatever Tcl/Tk does. That is one package that is
	clean, general, and easy to use. We should use it as
	a model for lots of things.
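
The registration idea in the first point above, sets of
(format, quality, cost, callback, callback_arg), can be sketched in C
roughly as follows. This is not the pywww code and not an existing libwww
interface: every name here (Presenter, presenter_register, presenter_find,
count_bytes) is hypothetical, and the selection rule (highest quality
first, lower cost as the tie-breaker) is an assumption made for the
example.

```c
#include <stddef.h>
#include <string.h>

/* Called with the document bytes and the argument given at registration. */
typedef void (*PresentCallback)(const char *data, size_t len, void *arg);

typedef struct {
    const char     *format;        /* MIME type, e.g. "text/html" */
    double          quality;       /* 0.0 to 1.0, higher is better */
    double          cost;          /* arbitrary units, lower is better */
    PresentCallback callback;
    void           *callback_arg;
} Presenter;

#define MAX_PRESENTERS 32
static Presenter presenters[MAX_PRESENTERS];
static size_t    presenter_count = 0;

/* Register one natively understood format. The format string is not
 * copied, so it must outlive the table. Returns 0 on success, -1 if the
 * table is full or an argument is missing. */
int presenter_register(const char *format, double quality, double cost,
                       PresentCallback callback, void *callback_arg)
{
    Presenter *p;

    if (format == NULL || callback == NULL ||
        presenter_count >= MAX_PRESENTERS)
        return -1;
    p = &presenters[presenter_count++];
    p->format = format;
    p->quality = quality;
    p->cost = cost;
    p->callback = callback;
    p->callback_arg = callback_arg;
    return 0;
}

/* Pick the entry for a format: highest quality wins and ties go to the
 * lower cost (an assumed rule). Returns NULL if nothing is registered. */
const Presenter *presenter_find(const char *format)
{
    const Presenter *best = NULL;
    size_t i;

    for (i = 0; i < presenter_count; i++) {
        const Presenter *p = &presenters[i];
        if (strcmp(p->format, format) != 0)
            continue;
        if (best == NULL || p->quality > best->quality ||
            (p->quality == best->quality && p->cost < best->cost))
            best = p;
    }
    return best;
}

/* Example callback: adds the byte count to the size_t that arg points at. */
void count_bytes(const char *data, size_t len, void *arg)
{
    (void) data;
    *(size_t *) arg += len;
}
```

A protocol module would call presenter_find() with the inferred format and
hand the data to the returned callback, which removes the need for a
special www/present pseudo-type.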

I'm prototyping this python version of the libWWW API using a python
interpreter with Motif and the Mosaic widget linked in, plus 
HTFileWriter.c, SGML.c, and a few other libwww modules linked in
with a little glue code.

The problem is that this is a "skunkworks" project, so I can't seem
to get caught up to any released version of libWWW.

I had all the .h files for 2.15 cleaned up w.r.t. ANSI/POSIX, and then 2.16
came out with all new .h files built from .html files. I don't have
a linemode browser handy that I trust to do the .html->.h correctly.
So I'm stuck for a while until I get that installed.

I tried to build the latest linemode browser from source, but it
barfed with an undeclared identifier somewhere, and using that
BUILD script and twelve different directories to build the dang
thing is such a pain for development work that I didn't even look
into the build problem any more.

I realize there's a lot of whining in this message, but I'm a
little frustrated because I've thought about this a lot, and I'm
quite sure I've got good solutions, but I don't have the resources
to package it all up.


Dan


Return-Path: <eric@spyglass.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA07602; Tue, 19 Jul 1994 18:17:31 +0200
Received: from [192.246.238.10] by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA22125; Tue, 19 Jul 1994 18:17:27 +0200
Received: by spyglass.com (5.57/3.1.090690-Spyglass)
	id AA23540; Tue, 19 Jul 94 11:17:44 -0500
Received: by hook.spyglass.com (5.57/3.1.090690-Spyglass)
	id AA00577; Tue, 19 Jul 94 11:17:42 -0500
Message-Id: <9407191617.AA00577@hook.spyglass.com>
X-Sender: eric@hook
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Tue, 19 Jul 1994 11:17:37 -0600
To: www-lib@www0.cern.ch
From: eric@spyglass.com (Eric W. Sink)
Subject: Re: Forms in libwww 3.0
content-length: 799


>        Looked at the information out at CERN regarding the status of
>libwww.  Perhaps I missed it, but I read nothing about forms or their
>implementation inside of the libwww.  I am wondering if there is a
>standard forms API inside of libwww, or if there will be in the future.
>Also, I was wondering what level of HTML will the libwww support (along
>the same lines).
>
>        Are these things considered to be a client implementation?

The 2.15 release appeared to contain half-support for some variant of
Dave Raggett's work.  I would like to see HTMLPDTD.C changed to be
more like [draft] HTML 2.0.

Eric W. Sink, Software Engineer --  eric@spyglass.com 217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.

        Hakuna Patata  (no french fries)


Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA22717; Tue, 19 Jul 1994 19:13:45 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA01827; Tue, 19 Jul 1994 19:13:44 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEW20BTBNK8WXTMK@KUHUB.CC.UKANS.EDU>; Tue, 19 Jul 1994 12:13:24 CDT
Received: by falcon.cc.ukans.edu; id AA01674; Tue, 19 Jul 1994 12:13:26 -0500
Date: Tue, 19 Jul 1994 12:13:25 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: POST and PUT  (was RE: a bunch of my mail :)
In-Reply-To: <9407191455.AA27417@ptsun03.cern.ch>
To: Multiple recipients of list <www-lib@www0.cern.ch>
Message-Id: <Pine.3.89.9407191104.E22657-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 3223

On Tue, 19 Jul 1994, Henrik Frystyk Nielsen wrote:
> ...
> I agree that the ARGS and PARAMS do look strange, but I think we should
> make sure not to lose any platforms currently supported before we
> remove them. Though, replacements _are_ welcome! Please look at the
> page to see a list of platforms supported
> 
> 	http://info.cern.ch/hypertext/WWW/Library/User/Platform.html

Yes!  A fine decision.
Couldn't help but notice there isn't a DOS supported.
When I get a hold of your libwww 3.0 I'll be adding this; BTW, when can I 
	lay my paws on it?
I'll be wanting to coordinate any changes that I make with the CERN group.
Who/what/where/when/how?

> ...
> Generally - the HTML side of the library hasn't been touched for a long
> time.  We hope that this is going to change, as Hakon Lie has started
> to look into it. That is also the reason for the lack of support for
> POST.
> 
> About posting, I see two different sides of it:
> 
> 	- The interface between the library and the clients
> 	- The HTTP implementation of PUT and POST
> 
> I have been working on a general posting interface that can be used not
> only for HTTP, but also NNTP and SMTP (NNTP posting is almost
> integrated but no SMTP support currently in the Library). As forms are
> a special case of posting, it also goes for this. Please see the
> specifications on
> 
>    http://info.cern.ch/hypertext/WWW/Library/User/Features/ClientPost.html
> <other URLs...>

Let's discuss this POST and PUT information, just for clarity if nothing
else.  When doing either in HTTP, you'll be able to specify multiple
destinations for the data.  If post transactions require a URI returned,
how is this handled by the libwww when multiples come back? Or am I off
base here?  Note, I understand that the URI is no guarantee of any action
taken by the server.  Does the put return a URI also, or just
acknowledgement of a change to the URI sent by the client? 

The nntp post is handled all in one transaction, so only one message ID 
is given back from an nntp post; even if multiple newsgroups were 
targeted because nntp can handle this all in one request.  Good.

The sentto URI is a decent idea, but why not depend upon SMTP to return 
to the user a message that never reached its destination?  Also, 
shouldn't this be similar to the nntp idea, since you can specify 
multiple recipients of the message BODY while doing the transaction with 
SMTP.  If so, how does the sentto URI cover multiple SMTP targets?

I think the FTP POST is a needed feature in your design.  I know that it 
is going to be very open so that it can be added in the future.  I just 
wanted to give my vote now.

> I have been working on an actual implementation of the posting
> interface, but as I now have less than 2.5 weeks to finish my master's
> thesis I have to do this instead :-(. My plan is to write my thesis on
> the Web so what you are reading is actually a draft of it!

Good luck, Henrik!


Dzai Jyan,
	Garrett.

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <connolly@hal.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA05631; Thu, 21 Jul 1994 01:43:09 +0200
Received: from hal.hal.COM by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA09843; Thu, 21 Jul 1994 01:43:09 +0200
Received: from ulua.hal.com by hal.com (4.1/SMI-4.1.1)
	id AA17203; Wed, 20 Jul 94 16:43:07 PDT
Received: from localhost by ulua.hal.com (4.1/SMI-4.1.2)
	id AA09571; Wed, 20 Jul 94 18:43:38 CDT
Message-Id: <9407202343.AA09571@ulua.hal.com>
To: www-lib@www0.cern.ch
Subject: What to do when malloc() returns 0?
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Id: <9569.774747816.1@ulua>
Date: Wed, 20 Jul 1994 18:43:37 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
content-length: 5282

Currently, a 0 return from malloc() is considered a fatal
error throughout wwwlib. The error is reported to stderr,
and the process exits.

For some applications, this is unsuitable. A user may
be editing some data, and exiting in this manner will
cause the data to be lost. As long as we're writing
read-only tools like browsers, this isn't such a big deal.
But as we start building annotation and full editing
applications, it will become more and more of an issue.

The optimal behaviour of the library would be for all API
entry points that might call malloc() to include a failure
return mode. Unfortunately, this would require significant
re-engineering of the library -- not to mention the impact
on future work.

There is a precedent in the Xt library. It also considers
0 from malloc() a fatal error, but it provides a "hook"
function so that applications can do a non-local return
(longjmp) up through the Xt code and back to the
application. The Xt spec doesn't guarantee that this
will work, and I'm sure some memory leaks would result
from this behaviour. Consider:

	XtFoo()
	{
		char *xxx = XtMalloc(100);
		XtBar();
		XtFree(xxx);
	}

	XtBar()
	{
		char *yyy = XtMalloc(200);
		XtFree(yyy);
	}

If the xxx alloc succeeds, but the yyy malloc fails, and
the XtError handler longjmps up through XtFoo, xxx will
never be freed: memory leak.


The DCE programming API includes a longjmp-based exception
implementation, so code can be written like this:

	DCEFoo()
	{
		char *xxx = malloc(100);

		if(!xxx) RAISE(OutOfMemory);

		TRY{
			DCEBar();
		}FINALLY{
			free(xxx);
		}
	}

	DCEBar()
	{
		char *yyy = malloc(200);

		if(!yyy) RAISE(OutOfMemory);
		free(yyy);
	}

That way, if the xxx alloc succeeds, even if DCEBar raises
an exception, xxx will be freed as the stack is unwound.
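
The unwinding behaviour described here can be shown with plain
setjmp()/longjmp() and no macros. The sketch below writes out by hand
roughly what a TRY/FINALLY pair has to do; it is not the DCE
implementation. All names (exc_top, exc_raise, checked_alloc, run_foo) and
the allocs_until_failure knob used to simulate a failing malloc() are
hypothetical.

```c
#include <setjmp.h>
#include <stdlib.h>

enum { EXC_NONE = 0, EXC_OUT_OF_MEMORY = 1 };

static jmp_buf *exc_top = NULL;          /* innermost active handler frame */
static int allocs_until_failure = -1;    /* test knob: -1 means never fail */
static int cleanup_runs = 0;             /* how often the cleanup code ran */

/* Jump to the innermost handler, or abort if there is none. */
static void exc_raise(int code)
{
    if (exc_top == NULL)
        abort();
    longjmp(*exc_top, code);
}

/* malloc() wrapper that raises instead of returning 0. */
static void *checked_alloc(size_t bytes)
{
    void *p = NULL;

    if (allocs_until_failure != 0)
        p = malloc(bytes);
    if (allocs_until_failure >= 0)
        allocs_until_failure--;
    if (p == NULL)
        exc_raise(EXC_OUT_OF_MEMORY);
    return p;
}

static void bar(void)
{
    char *yyy = checked_alloc(200);
    free(yyy);
}

static void foo(void)
{
    jmp_buf      frame;
    jmp_buf     *outer = exc_top;        /* set before setjmp, never changed */
    char        *xxx = checked_alloc(100);
    volatile int raised = EXC_NONE;

    exc_top = &frame;
    if (setjmp(frame) == 0)
        bar();                           /* the TRY body */
    else
        raised = EXC_OUT_OF_MEMORY;      /* the only code in this sketch */

    exc_top = outer;                     /* the FINALLY part: always runs */
    free(xxx);
    cleanup_runs++;

    if (raised != EXC_NONE)
        exc_raise(raised);               /* pass it on to the outer handler */
}

/* Top-level entry: returns EXC_NONE or the exception code that escaped. */
int run_foo(int allocs_before_failure)
{
    jmp_buf frame;
    int     code;

    allocs_until_failure = allocs_before_failure;
    exc_top = &frame;
    if (setjmp(frame) == 0) {
        foo();
        code = EXC_NONE;
    } else {
        code = EXC_OUT_OF_MEMORY;
    }
    exc_top = NULL;
    allocs_until_failure = -1;
    return code;
}
```

Calling run_foo(1) lets the xxx allocation succeed and makes the yyy
allocation fail; foo still frees xxx before the exception reaches the
caller, which is exactly the leak that the hook-only approach cannot avoid.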

In summary, here are some proposals for what the library
code should do when malloc() returns 0:

	A. Fatal exit.
		+ easy to code
		- unsuitable for editing apps
		- unsuitable for DOS/Windows, other small-memory machines
		- makes the library unusable from otherwise robust
			applications

	B. Call fatal-exit hook function
		+ easy to code
		+ allows editing apps to do "last ditch save"
		+ allows graceful exit on small-memory machines
		- still unsuitable for integration with otherwise
			robust apps

	C. Raise an exception -- export exception API to library
			clients
		- requires significant library reengineering
		- requires integration of external technology
			(whose exception package do we use?)
		- requires library clients to use exceptions API
		+ allows robust editing applications to use the library

	D. Raise an exception, catch it before exiting the API,
		return an error code
		- requires significant library reengineering
		- requires integration of external technology
			(whose exception package do we use?)
		- requires distinction between "public" API and
			"internal" API
		- requires library clients to do lots of error checking
		+ suitable for use in robust applications

	E. Support "malloc failed" return code throughout the library
		- requires significant library reengineering
		- error prone
		- requires library clients to do lots of error checking
		+ suitable for use in robust applications


Enclosed is an interface I've used successfully to build applications.
Coding to this interface allows proposal A or B.
It allows all sorts of heap tracing (for the purify-less),
or none, and you can turn the tracing on and off by recompiling
just the .o file that implements mem_*(), without recompiling
the rest of the library.

It's used instead of malloc(), free(), and strdup() from <stdlib.h>.

/* memalloc.h -- coding idioms for heap allocation
 * $Id: memalloc.h,v 1.3 1994/05/11 23:23:21 connolly Exp $
 */

#ifndef __memalloc_h
#define __memalloc_h

#include "stdclang.h"
#include <stddef.h> /* for size_t */

/* does NOT return 0 */
#define CNEW(type, annotation) \
    ((type*)mem_alloc(sizeof(type), \
                      annotation, __FILE__, __LINE__))

/* does NOT return 0 */
#define CNEWSIZE(type, qty, annotation) \
    ((type*)mem_alloc(sizeof(type)*(qty), \
                      annotation, __FILE__, __LINE__))

/* does NOT return 0 */
#define CRESIZE(type, qty, old, annotation) \
    ((type*)mem_realloc(sizeof(type)*(qty), old, \
                        annotation, __FILE__, __LINE__))

/* does NOT return 0 */
#define STRDUP(str) \
    mem_strdup(str, "STRDUP", __FILE__, __LINE__)

/* null str allowed */
/* does NOT return 0 on allocation failure */
#define ZSTRDUP(str) \
    mem_zstrdup(str, "ZSTRDUP", __FILE__, __LINE__)

#define DEALLOC(old) \
    mem_dealloc(old, __FILE__, __LINE__)

/* NOTE: this evaluates its arg more than once */
#define DISPOSE(old) do{ if(old){ DEALLOC(old); old = NULL; } }while(0)

extern void* mem_alloc PARAMS((size_t bytes,
				 CONST char *annotation,
				 CONST char* file, int line));

extern void* mem_realloc PARAMS((size_t bytes, void* old,
				   CONST char *annotation,
				   CONST char* file, int line));
extern void  mem_dealloc PARAMS((void* old,
				 CONST char* file, int line));

extern char* mem_strdup PARAMS((CONST char*,
				CONST char *annotation,
				CONST char* file, int line));

extern char* mem_zstrdup PARAMS((CONST char*,
				 CONST char *annotation,
				 CONST char* file, int line));

#endif /* __memalloc_h */
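
The header above only declares the interface. As a rough sketch of what
the .o file behind it could contain, the following implements three of the
functions with plain ANSI prototypes instead of PARAMS/CONST. The
MEM_TRACE switch, the mem_live_blocks counter and mem_set_fatal_hook() are
assumptions added for illustration (the hook corresponds to proposal B);
none of them appear in the original header.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef void (*MemFatalHook)(void);

static MemFatalHook mem_fatal_hook = NULL;  /* proposal B: optional hook */
long mem_live_blocks = 0;                   /* crude leak counter */

/* Hypothetical addition: install a "last ditch save" hook. */
void mem_set_fatal_hook(MemFatalHook hook)
{
    mem_fatal_hook = hook;
}

/* Report the failure, give the application one chance to save, then exit. */
static void mem_fatal(const char *annotation, const char *file, int line)
{
    fprintf(stderr, "%s:%d: out of memory (%s)\n", file, line, annotation);
    if (mem_fatal_hook != NULL)
        mem_fatal_hook();
    exit(EXIT_FAILURE);
}

/* Does NOT return 0: failure ends in mem_fatal(). */
void *mem_alloc(size_t bytes, const char *annotation,
                const char *file, int line)
{
    void *p = malloc(bytes ? bytes : 1);    /* never ask malloc for 0 bytes */

    if (p == NULL)
        mem_fatal(annotation, file, line);
    mem_live_blocks++;
#ifdef MEM_TRACE
    fprintf(stderr, "%s:%d: alloc %lu bytes (%s) -> %p\n",
            file, line, (unsigned long) bytes, annotation, p);
#endif
    return p;
}

void mem_dealloc(void *old, const char *file, int line)
{
    (void) file;                            /* only used when tracing */
    (void) line;
    if (old == NULL)
        return;
    mem_live_blocks--;
#ifdef MEM_TRACE
    fprintf(stderr, "%s:%d: free %p\n", file, line, old);
#endif
    free(old);
}

/* Does NOT return 0; str must not be NULL. */
char *mem_strdup(const char *str, const char *annotation,
                 const char *file, int line)
{
    char *copy = mem_alloc(strlen(str) + 1, annotation, file, line);

    strcpy(copy, str);
    return copy;
}
```

Because callers only see the macros, tracing can be switched by recompiling
this one file with or without -DMEM_TRACE.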
Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA24829; Thu, 21 Jul 1994 03:16:56 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA14898; Thu, 21 Jul 1994 03:16:52 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEXX5W1BB48WZN6O@KUHUB.CC.UKANS.EDU>; Wed, 20 Jul 1994 20:16:42 CDT
Received: by falcon.cc.ukans.edu; id AA05441; Wed, 20 Jul 1994 20:16:46 -0500
Date: Wed, 20 Jul 1994 20:16:45 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Re: What to do when malloc() returns 0?
In-Reply-To: <9407202343.AA09571@ulua.hal.com>
Sender: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
To: Multiple recipients of list <www-lib@www0.cern.ch>
Reply-To: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Message-Id: <Pine.3.89.9407201909.A3851-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 3143

On Thu, 21 Jul 1994, Daniel W. Connolly wrote:
> Currently, a 0 return from malloc() is considered a fatal
> error throughout wwwlib. The error is reported to stderr,
> and the process exits.
> 
> For some applications, this is unsuitable. A user may
> be editing some data, and exiting in this manner will
> cause the data to be lost. As long as we're writing
> read-only tools like browsers, this isn't such a big deal.
> But as we start building annotation and full editing
> applications, it will become more and more of an issue.
> 
>	....
> 
> In summary, here are some proposals for what the library
> code should do when malloc() returns 0:
> 
>	....

How C++ handles this is worth noting here.  Of course, we won't use 
C++, just consider what it does for usage in libwww.

First, when the new (malloc) operator fails to allocate memory, it calls 
the new_handler function (pointer to a function that you design specific 
to the client; default is to just exit() the app).

The new_handler's purpose is to free some piece of allocated memory, 
if it can, and return; if it can't, it should exit() the app.
  
In this manner, new keeps retrying the allocation, calling new_handler 
each time to free up more memory, until either the allocation succeeds 
or the application is exited.

In DosLynx, as an example, all memory requests are sent through new.  When
new fails to allocate more memory and calls the new_handler, the scenario
is as follows, in a simpler C equivalent (fill in the gaps).  It's a bit
of reading, but well worth it to understand what I'm talking about. 



/*	Allocation handler should be client specific.
 */
void my_handler(void);	/* defined below */

/*	Overriding
 *	void (*new_handler)(void) = NULL;
 */
void (*new_handler)(void) = my_handler;

/*	Allocate some memory.
 */
void *new(size_t st_bytes)	{
	void *vp_alloced = NULL;

	while(vp_alloced == NULL)	{
		vp_alloced = malloc(st_bytes);
		if(vp_alloced == NULL)	{
			if(new_handler == NULL)	{
				/*	No new handler defined.
				 *	Exit only, nothing else!
				 */
				exit(-1);
			}
			new_handler();
		}
		else	{
			return(vp_alloced);
		}
	}
}

/*	Handle failure to allocate some memory.
 */
void my_handler(void)	{
	/*	Call another function to free some memory that isn't needed.
	 */
	if(free_off_some_memory() == FALSE)	{
		/*	Unable to free stuff off.
		 *	Have the application do some final shutdown code and
		 *		then exit.
		 */
		cleanup_before_exit();
		exit(-1);
	}

	/*	Success!
	 *	Just return, if enough wasn't freed off, we'll be called
	 *		again.
	 */
}

/*	Function to release spare memory.
 *	This is really client specific stuff....
 *	This is only an example.
 */
void free_off_some_memory(void)	{
	HText *to_be_free;

	to_be_free = getOldCachedDoc();

	if(to_be_free != NULL)	{
		HText_free(to_be_free);
	}
	else	{
		/*	Free other memory if possible here.
		 */
	}
}



Garrett.

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <doslynx@falcon.cc.ukans.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA27137; Thu, 21 Jul 1994 03:25:52 +0200
Received: from kuhub.cc.ukans.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA15244; Thu, 21 Jul 1994 03:25:52 +0200
Received: from falcon.cc.ukans.edu by KUHUB.CC.UKANS.EDU (PMDF V4.3-8 #5489)
 id <01HEXXH4XXUO8WZSUU@KUHUB.CC.UKANS.EDU>; Wed, 20 Jul 1994 20:25:45 CDT
Received: by falcon.cc.ukans.edu; id AA05705; Wed, 20 Jul 1994 20:25:52 -0500
Date: Wed, 20 Jul 1994 20:25:52 -0500 (CDT)
From: Garrett Arch Blythe <doslynx@falcon.cc.ukans.edu>
Subject: Re: What to do when malloc() returns 0?
In-Reply-To: <Pine.3.89.9407201909.A3851-0100000@falcon.cc.ukans.edu>
To: Multiple recipients of list <www-lib@www0.cern.ch>
Message-Id: <Pine.3.89.9407202054.B3851-0100000@falcon.cc.ukans.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
content-length: 932

On Thu, 21 Jul 1994, Garrett Arch Blythe wrote:
> /*	Function to release spare memory.
>  *	This is really client specific stuff....
>  *	This is only an example.
>  */
> void free_off_some_memory(void)	{
> 	HText *to_be_free;
> 
> 	to_be_free = getOldCachedDoc();
> 
> 	if(to_be_free != NULL)	{
> 		HText_free(to_be_free);
> 	}
> 	else	{
> 		/*	Free other memory if possible here.
> 		 */
> 	}
> }

Sorry,
	The above should have been:

BOOL free_off_some_memory(void)	{
	HText *to_be_free;

	to_be_free = getUnusedCachedDoc();

	if(to_be_free != NULL)	{
		HText_free(to_be_free);
		return(TRUE);
	}
	else	{
		/*	Free other memory if possible here.
		 */
	}

	return(FALSE);
}
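
Putting the pieces of this scheme together, a compilable sketch of the
retry loop might look like the following. It differs from the DosLynx
outline in one labelled way: when nothing more can be released it returns
NULL to the caller instead of calling exit(). The names retry_alloc,
release_handler and raw_alloc, the small cache array, and the tight_alloc
test allocator are all hypothetical and exist only so the loop can be
exercised without actually running out of memory.

```c
#include <stdlib.h>

typedef int   (*ReleaseHandler)(void);   /* nonzero if it freed something */
typedef void *(*RawAllocator)(size_t);

static ReleaseHandler release_handler = NULL;
static RawAllocator   raw_alloc = malloc;  /* replaceable for testing */

/* Keep retrying the allocation, asking the client to release memory
 * between attempts. Returns NULL only when nothing more can be freed
 * (the DosLynx outline exits the application at that point instead). */
void *retry_alloc(size_t bytes)
{
    for (;;) {
        void *p = raw_alloc(bytes);

        if (p != NULL)
            return p;
        if (release_handler == NULL || !release_handler())
            return NULL;
    }
}

/* Stand-in for a client's document cache. */
#define CACHE_SLOTS 4
static void *cache[CACHE_SLOTS];

static int cached_blocks(void)
{
    int i, n = 0;

    for (i = 0; i < CACHE_SLOTS; i++)
        if (cache[i] != NULL)
            n++;
    return n;
}

/* Client-specific handler: drop one cached block per call. */
static int release_one_cached_block(void)
{
    int i;

    for (i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i] != NULL) {
            free(cache[i]);
            cache[i] = NULL;
            return 1;
        }
    }
    return 0;
}

/* Test allocator: pretends memory is exhausted while the cache is in use. */
static void *tight_alloc(size_t bytes)
{
    return cached_blocks() > 0 ? NULL : malloc(bytes);
}
```

With raw_alloc set to tight_alloc and two blocks in the cache, a call to
retry_alloc() invokes the handler twice and then succeeds.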

Trodden Soil

I am trodden soil.
Dust covers my face.
Soles crush my nature
Revealing a hard empty space.

Garrett Arch Blythe  (913)864-0436
User Services Student Programmer/Consultant
University of Kansas Computer Center
<doslynx@falcon.cc.ukans.edu>

Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA11624; Thu, 21 Jul 1994 12:30:42 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA24953; Thu, 21 Jul 1994 12:30:42 +0200
Date: Thu, 21 Jul 1994 12:30:41 +0200
Message-Id: <9407211030.AA24953@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: RE: What to do when malloc() returns 0?
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
content-length: 2717

It's more than just memory we should think of. There should be a clear idea
of what error conditions have occurred.



	E. Support "malloc failed" return code throughout the library
		- requires significant library reengineering
		- error prone
		- requires library clients to do lots of error checking
		+ suitable for use in robust applications


This is what I do with all my code. I use a set of conventions:-

1) Every procedure is coded as a function returning an integer status code.

2) Every procedure allocates memory for parameters returned as strings

3) Status return codes are maintained through a catalogue which is updated
	through an AWK script. This will soon be changed to a proper 
	processor.


To make the stuff portable between UNIX and VMS I use macros. Basically these
add facilities that should have been in C...


ENTRY_POINT PHB_test (int input, CONST char *input2,
              int *output, char **output2) {

    int 	i;

    BEGIN;

    PRE (PHB_BAD_INPUT, input > 0);
    PRE (PHB_NULL_STRING, input2 != NULL);

    STATUS = PHB_ALLOCATE (char, 32, output2);
    CHECK_STATUS;

    END;
    }


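#define BEGIN int _status                /* declare the status variable */
#define END return (_status);            /* return the final status */
#define PRE(x,y) if (!(y)) return (x);   /* fail with x unless precondition y holds */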

At the moment I use a fairly simple minded system where I simply translate
CHECK_STATUS to return (_status); 


But with the same macros you can expand the range quite a bit. For example,
splice a label into the END macro. That could then expand to something
that frees all the possible memory elements allocated.
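
A rough sketch of the label-in-END idea, under stated assumptions: the
macro names BEGIN, PRE, CHECK_STATUS and END come from this message, but
their bodies, the status values and the function PHB_prefix() are
invented here for illustration only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; the real catalogue is not shown in this thread. */
#define PHB_OK         0
#define PHB_BAD_INPUT  1
#define PHB_NO_MEMORY  2

#define BEGIN           int _status = PHB_OK
#define PRE(code, cond) do { if (!(cond)) { _status = (code); goto _cleanup; } } while (0)
#define CHECK_STATUS    do { if (_status != PHB_OK) goto _cleanup; } while (0)
/* END carries the label: on any failure it frees the scratch pointer. */
#define END(ptr)        _cleanup: if (_status != PHB_OK) { free(ptr); (ptr) = NULL; } \
                        return _status

/* Copy the first n characters of text into a newly allocated string. */
int PHB_prefix(int n, const char *text, char **output)
{
    char *buf = NULL;
    BEGIN;

    PRE(PHB_BAD_INPUT, n > 0);
    PRE(PHB_BAD_INPUT, text != NULL);

    buf = malloc((size_t)n + 1);
    _status = (buf != NULL) ? PHB_OK : PHB_NO_MEMORY;
    CHECK_STATUS;

    /* A failure after the allocation: END releases buf for us. */
    _status = (strlen(text) >= (size_t)n) ? PHB_OK : PHB_BAD_INPUT;
    CHECK_STATUS;

    memcpy(buf, text, (size_t)n);
    buf[n] = '\0';
    *output = buf;          /* success: ownership passes to the caller */

    END(buf);
}
```

Every early exit funnels through the single label, so the cleanup is
written once per function rather than once per error check.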

At the moment I use a different scheme whereby all memory is allocated into
a registry which can be freed as a whole. This scheme requires use of a data 
model though.
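
One way such a registry could look, as a sketch only: the Registry type
and the registry_alloc() and registry_free_all() functions are
hypothetical names, not the scheme actually used in my code.

```c
#include <stdlib.h>

/* Hypothetical registry: a linked list of every block handed out. */
typedef struct RegNode {
    struct RegNode *next;
    void           *block;
} RegNode;

typedef struct {
    RegNode *head;
    int      count;
} Registry;

/* Allocate a zero-filled block and remember it in the registry. */
void *registry_alloc(Registry *reg, size_t size)
{
    RegNode *node = malloc(sizeof *node);

    if (node == NULL)
        return NULL;
    node->block = calloc(1, size);
    if (node->block == NULL) {
        free(node);
        return NULL;
    }
    node->next = reg->head;
    reg->head  = node;
    reg->count++;
    return node->block;
}

/* Release every block in one go, e.g. when a request is abandoned. */
void registry_free_all(Registry *reg)
{
    while (reg->head != NULL) {
        RegNode *node = reg->head;
        reg->head = node->next;
        free(node->block);
        free(node);
    }
    reg->count = 0;
}
```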



We could have :-

ENTRY_POINT PHB_test (int input, CONST char *input2,
              int *output, char **output2) {

    int 	i;
    char        *alloc = NULL;

    BEGIN;

    PRE (PHB_BAD_INPUT, ("input"), input > 0);
    PRE (PHB_NULL_STRING, ("input2"), input2 != NULL);

    STATUS = PHB_ALLOCATE (char, 32, &alloc);
    CHECK_STATUS;

    END ((alloc));
    }


The error condition checking macros now have parameters; these are attached to
the status return code structure. The END macro calls a procedure to free
all the elements in the list if the abort status flag is set. Alternatively
it could have two lists, one that is always freed and one that is only freed
if an error occurs.

Henrik has an interesting scheme whereby he attaches all allocated memory to
structures. The release routine for the structure is responsible for freeing
it.


I think we should in any case define a `public' and a `private' API. Otherwise
there will be no opportunity for other vendors to offer products.



Phill.
Return-Path: <eric@spyglass.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA10050; Thu, 21 Jul 1994 15:10:21 +0200
Received: from spyglass.com by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA11674; Thu, 21 Jul 1994 15:10:20 +0200
Received: by spyglass.com (5.57/3.1.090690-Spyglass)
	id AA28510; Thu, 21 Jul 94 08:11:14 -0500
Received: by hook.spyglass.com (5.57/3.1.090690-Spyglass)
	id AA08890; Thu, 21 Jul 94 08:11:12 -0500
Message-Id: <9407211311.AA08890@hook.spyglass.com>
X-Sender: eric@hook
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 21 Jul 1994 08:11:11 -0600
To: www-lib@www0.cern.ch
From: eric@spyglass.com (Eric W. Sink)
Subject: Re: What to do when malloc() returns 0?
content-length: 760


>        E. Support "malloc failed" return code throughout the library
>                - requires significant library reengineering
>                - error prone
>                - requires library clients to do lots of error checking
>                + suitable for use in robust applications

I cast my vote for this.  A and B are not robust, and both C and D
also require significant library reengineering.  E is clean, can be
done readably, and is very portable.

I agree with Phillip too, the library should propagate errors of all
kinds, not just malloc failures.

Eric W. Sink, Software Engineer --  eric@spyglass.com 217-355-6000 ext 237
All opinions expressed are mine, and may not be those of my employer.

        Hakuna Patata  (no french fries)


Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA13716; Thu, 21 Jul 1994 15:24:45 +0200
Received: from newton.ncsa.uiuc.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA13407; Thu, 21 Jul 1994 15:24:44 +0200
Received: from void.ncsa.uiuc.edu by newton.ncsa.uiuc.edu with SMTP id AA03890
  (5.65a/IDA-1.4.2 for www-lib@www0.cern.ch); Thu, 21 Jul 94 08:24:34 -0500
Return-Path: <redman@ncsa.uiuc.edu>
Received: from  by void.ncsa.uiuc.edu (4.1/NCSA-4.1)
	id AB10190; Thu, 21 Jul 94 08:23:03 CDT
Message-Id: <9407211323.AB10190@void.ncsa.uiuc.edu>
X-Sender: redman@ncsa.uiuc.edu (Unverified)
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Thu, 21 Jul 1994 08:24:42 -0800
To: www-lib@www0.cern.ch
From: redman@ncsa.uiuc.edu (Thomas Redman)
Subject: Re: What to do when malloc() returns 0?
content-length: 2617

>        A. Fatal exit.
>                + easy to code
>                - unsuitable for editing apps
>                - unsuitable for DOS/Windows, other small-memory machines
>                - makes the library unusable from otherwise robust
>                        applications
>
>        B. Call fatal-exit hook function
>                + easy to code
>                + allows editing apps to do "last ditch save"
>                + allows graceful exit on small-memory machines
>                - still unsuitable for integration with otherwise
>                        robust apps
>
>        C. Raise an exception -- export exception API to library
>                        clients
>                - requires significant library reengineering
>                - requires integration of external technology
>                        (whose exception package do we use?)
>                - requires library clients to use exceptions API
>                + allows robust editing applications to use the library
>
>        D. Raise an exception, catch it before exiting the API,
>                return an error code
>                - requires significant library reengineering
>                - requires integration of external technology
>                        (whose exception package do we use?)
>                - requires distinction between "public" API and
>                        "internal" API
>                - requires library clients to do lots of error checking
>                + suitable for use in robust applications
>
>        E. Support "malloc failed" return code throughout the library
>                - requires significant library reengineering
>                - error prone
>                - requires library clients to do lots of error checking
>                + suitable for use in robust applications
>
Yup, I think Dan pretty much hit the nail on the head here! This is a
significant deficiency of the current libwww. Not so much a problem for the
unix type boxes, but a major deficiency elsewhere. As far as I am
concerned, if my program exits, it is a bug (unfortunately, I program a
Mac). The setjmp/longjmp solution is very close to how I work around this
problem right now (I jump all the way out of the library, very dirty), but
the cleanup problem can't be ignored. I think a re-engineering of the
mechanism is in order. A DCE style solution looks very attractive to me.
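
For readers who have not seen the workaround, here is a minimal sketch
of the setjmp/longjmp escape. All names (lib_alloc, lib_load_document,
guarded_load, simulate_failure) are hypothetical and stand in for
library and client code; this is not how libwww itself is written.

```c
#include <setjmp.h>
#include <stdlib.h>

jmp_buf oom_escape;         /* where to land when memory runs out */
int     simulate_failure;   /* demo knob: force the allocation to fail */

/* Hypothetical library-internal allocator: it never returns NULL,
 * it jumps straight back to the client instead. */
void *lib_alloc(size_t size)
{
    void *p = simulate_failure ? NULL : malloc(size);

    if (p == NULL)
        longjmp(oom_escape, 1);
    return p;
}

/* Hypothetical library call that allocates somewhere below the API. */
void lib_load_document(void)
{
    char *buffer = lib_alloc(256);

    buffer[0] = '\0';
    free(buffer);
}

/* Client-side guard: returns 0 on success, -1 if the library bailed out. */
int guarded_load(void)
{
    if (setjmp(oom_escape) != 0)
        return -1;  /* arrived via longjmp; anything allocated earlier leaks */
    lib_load_document();
    return 0;
}
```

The comment in guarded_load() is the cleanup problem in one line: the
jump skips every free() between the failing allocation and the guard.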

Thomas Redman (redman@ncsa.uiuc.edu)
Software Development Group, National Center for Supercomputing Applications
University of Illinois, Urbana-Champaign
(217) 244-0781; fax (217) 244-1987


Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA29507; Thu, 21 Jul 1994 16:15:04 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA20391; Thu, 21 Jul 1994 16:15:04 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA24508; Thu, 21 Jul 94 16:14:16 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA28537; Thu, 21 Jul 94 16:14:24 +0200
Date: Thu, 21 Jul 94 16:14:24 +0200
Message-Id: <9407211414.AA28537@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Re: What to do when malloc() returns 0?
content-length: 4095

Hi

> Currently, a 0 return from malloc() is considered a fatal
> error throughout wwwlib. The error is reported to stderr,
> and the process exits.

That's not to say that it's the right way to do it, but I haven't seen any
program that doesn't start to behave strangely when the memory limit is
reached: the mouse disappears, windows start blinking, the keyboard dies,
etc.

> 	E. Support "malloc failed" return code throughout the library
> 		- requires significant library reengineering
> 		- error prone
> 		- requires library clients to do lots of error checking
> 		+ suitable for use in robust applications

:-( This is what I have been reading in every `how to be a good
programmer' book. In practice I have _never_ seen return codes used in
a consistent manner - especially when they are a result of something
that happened more than one level below the current function.

> 	C. Raise an exception -- export exception API to library
> 			clients
> 		- requires significant library reengineering
> 		- requires integration of external technology
> 			(whose exception package do we use?)
> 		- requires library clients to use exceptions API
> 		+ allows robust editing applications to use the library
> 
> 	D. Raise an exception, catch it before exiting the API,
> 		return an error code
> 		- requires significant library reengineering
> 		- requires integration of external technology
> 			(whose exception package do we use?)
> 		- requires distinction between "public" API and
> 			"internal" API
> 		- requires library clients to do lots of error checking
> 		+ suitable for use in robust applications

:-( I don't like exceptions in the library due to portability problems -
sorry for saying this word again ;-)

> 	B. Call fatal-exit hook function
> 		+ easy to code
> 		+ allows editing apps to do "last ditch save"
> 		+ allows graceful exit on small-memory machines
> 		- still unsuitable for integration with otherwise
> 			robust apps

:-) This is what I would call taking the easy way out. However, I
think it is the most realistic way right now, for two reasons. First, the
library is not consistent enough at the moment to have a very sophisticated
and graceful exit on problems like this. Secondly, I think that once a
memory problem has occurred new ones are likely to appear very soon
after. If we can prevent loss of data using a callback function then I
think it is a good and simple step to take.
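
A small sketch of what such a callback could look like. The names
register_fatal_hook(), run_fatal_hook(), fatal_out_of_memory() and the
example hook save_open_documents() are all hypothetical; nothing like
this exists in the library yet.

```c
#include <stdio.h>
#include <stdlib.h>

typedef void (*FatalHook)(const char *where);

static FatalHook fatal_hook = NULL;

/* The client registers its "last ditch save" routine once at startup. */
void register_fatal_hook(FatalHook hook)
{
    fatal_hook = hook;
}

/* Give the client its chance; returns 1 if a hook was called, else 0. */
int run_fatal_hook(const char *where)
{
    if (fatal_hook == NULL)
        return 0;
    fatal_hook(where);
    return 1;
}

/* What the library would call where it reports to stderr and exits today. */
void fatal_out_of_memory(const char *where)
{
    run_fatal_hook(where);
    fprintf(stderr, "%s: out of memory\n", where);
    exit(EXIT_FAILURE);
}

/* Example client hook: a real editor would write its buffers to disk. */
int documents_saved = 0;

void save_open_documents(const char *where)
{
    (void)where;
    documents_saved++;
}
```

The process still exits, so this remains option B: it saves data but
does not make the library usable from an application that must keep running.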

Another thing: I don't like malloc at all. We have had so many problems
with uninitialized memory when allocating structures etc. in the
Library. The only way to avoid this is to use calloc. Any memory
problems are then a lot easier to find as they always dump core on UNIX.
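
To illustrate the point, a tiny sketch with a made-up structure
(DemoAnchor is not a library type). One assumption is worth stating:
calloc fills the block with zero bytes, which reads back as NULL
pointers on the platforms we build on, although the C standard does not
strictly promise that.

```c
#include <stdlib.h>

/* Made-up structure for the example; not the real HTAnchor. */
typedef struct {
    char *address;
    char *title;
    int   isIndex;
} DemoAnchor;

/* With malloc the three fields would hold garbage until each one is
 * assigned; with calloc they start out as zero bytes. */
DemoAnchor *demo_anchor_new(void)
{
    return calloc(1, sizeof(DemoAnchor));
}
```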

The new multi-threaded HTTP client has another way of using memory. All
information necessary to fulfill a request is stored in HTRequest and
HTNetInfo structures. Nothing is freed until the request is done
(success or failure). As all the protocol modules are now state
machines, it is easier to handle fatal errors such as lack of
memory, as it is then a question of jumping into an error state. This
might also be used in other parts of the library.
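
A simplified sketch of the idea, not the real HTTP module: DemoRequest,
DemoState and demo_request_run() are invented for this example, and the
fail_alloc field exists only to provoke the error path.

```c
#include <stdlib.h>

typedef enum { ST_BEGIN, ST_NEED_BUFFER, ST_READ, ST_DONE, ST_ERROR } DemoState;

typedef struct {
    DemoState state;
    char     *buffer;       /* everything the request owns hangs off here */
    int       fail_alloc;   /* demo knob: pretend the allocation failed */
} DemoRequest;

/* Run the request to completion; returns 0 on success, -1 on failure. */
int demo_request_run(DemoRequest *req)
{
    for (;;) {
        switch (req->state) {
        case ST_BEGIN:
            req->state = ST_NEED_BUFFER;
            break;
        case ST_NEED_BUFFER:
            req->buffer = req->fail_alloc ? NULL : calloc(1, 128);
            /* Lack of memory is just another transition. */
            req->state = (req->buffer != NULL) ? ST_READ : ST_ERROR;
            break;
        case ST_READ:
            req->buffer[0] = 'x';
            req->state = ST_DONE;
            break;
        case ST_DONE:
        case ST_ERROR: {
            int result = (req->state == ST_DONE) ? 0 : -1;
            free(req->buffer);      /* the single cleanup point */
            req->buffer = NULL;
            return result;
        }
        }
    }
}
```

Because nothing is freed before the terminal states, success and
failure share one cleanup path.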

*******

OK - to summarize the working list for version 3.0:

	- Better handling of malloc (calloc)
	- free static memory on exit
	- replacement of verbose output to stderr
	- Integration of DOS (system dependent stuff etc.)
	- take a look at all system calls
	- Better SGML/HTML parsing
	- Better MIME parsing
	- Multi threaded HTTP client (alpha version exists)
	- Cleaning up include files
	- chop filenames to 8.3 format for DOS
	- Introduction of PUT and POST (specification exists)
	- FORMs support in the library
	- canonicalising URLs so that
	
		www.fit.qut.edu.au = www.fit.qut.edu.au:80 = www.fit.qut.edu.AU

	  etc. all have the same anchor
	- and some minor stuff on my list...
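
For the canonicalisation item, a minimal sketch of the host part only,
assuming the string is an HTTP "host[:port]" and that 80 is the default
port. The function name canonical_http_host() is hypothetical.

```c
#include <ctype.h>
#include <string.h>

/* Canonicalise an HTTP "host[:port]" string in place:
 * lower-case the host name and drop a trailing default port ":80". */
void canonical_http_host(char *host)
{
    char  *p;
    size_t len;

    for (p = host; *p != '\0'; p++)
        *p = (char)tolower((unsigned char)*p);

    len = strlen(host);
    if (len > 3 && strcmp(host + len - 3, ":80") == 0)
        host[len - 3] = '\0';
}
```

After this step the three spellings above compare equal with strcmp(),
so they can share one anchor.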

DID I MISS ANYTHING??? I will put the items onto a WEB page one of these days.

However, as I said in my last mail - I need to write my thesis _*NOW_*
so I can't do any work on the library the next 2-3 weeks. So - please
bear with me...

-- cheers --

Henrik Frystyk

Return-Path: <ja@lithnext.epfl.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA22960; Fri, 22 Jul 1994 12:14:59 +0200
Received: from sicmail.epfl.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA00933; Fri, 22 Jul 1994 12:14:58 +0200
Received: from lithnext.epfl.ch by sicmail.epfl.ch with SMTP (PP) 
          id <11898-0@sicmail.epfl.ch>; Fri, 22 Jul 1994 12:14:51 +0200
Received: from lithnext4 by lithnext.epfl.ch (NX5.67d/NX3.0M) id AA02355;
          Fri, 22 Jul 94 12:13:02 +0100
From: Jean Alexis Montignies <ja@lithnext.epfl.ch>
Message-Id: <9407221113.AA02355@lithnext.epfl.ch>
Received: by lithnext4.epfl.ch (NX5.67d/NX3.0X) id AA00596;
          Fri, 22 Jul 94 12:11:44 +0100
Date: Fri, 22 Jul 94 12:11:44 +0100
Original-Received: by NeXT.Mailer (1.100)
Pp-Warning: Illegal Received field on preceding line
Original-Received: by NeXT Mailer (1.100)
Pp-Warning: Illegal Received field on preceding line
To: www-lib@www0.cern.ch
Subject: Bug repport, 2.15 library, modeule HTFile.c
Reply-To: ja@lithnext.epfl.ch
content-length: 768

In the module HTFile.c from the CERN WWW library, version 2.15, there are these  
lines.

 /*	Directory access is allowed and possible
 */
		logical = HTAnchor_address((HTAnchor*)anchor);
		tail = strrchr(logical, '/') +1;	/* last part or "" */

but when there is no '/' in the string, strrchr returns NULL, which becomes
0x0001 after the +1.
This made my browser stop on the URL 'file:'.

I've replaced the line:
    'tail = strrchr(logical, '/') +1;	/* last part or "" */'
by (this is a quick hack):

	{ char *tempTail;
          if ((tempTail = strrchr(logical, '/')) != NULL)
		    tail=tempTail+1;
           else
            tail=logical+strlen(logical);	/* no '/': point at the "" */
        }

I've also seen that there are other similar uses of the 'strrchr' function.  
This may cause some other bugs.
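
Since the same pattern appears in several places, it could be wrapped
in one helper. This is only a sketch; last_path_part() is a name I made
up, not a library function.

```c
#include <string.h>

/* Return a pointer to the part after the last '/', or to the
 * terminating NUL (an empty string) when there is no '/' at all. */
const char *last_path_part(const char *address)
{
    const char *slash = strrchr(address, '/');

    return (slash != NULL) ? slash + 1 : address + strlen(address);
}
```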


Jean-Alexis
Return-Path: <ja@lithnext.epfl.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA23518; Fri, 22 Jul 1994 12:20:44 +0200
Received: from sicmail.epfl.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA01379; Fri, 22 Jul 1994 12:20:43 +0200
Received: from lithnext.epfl.ch by sicmail.epfl.ch with SMTP (PP) 
          id <12080-0@sicmail.epfl.ch>; Fri, 22 Jul 1994 12:20:21 +0200
Received: from lithnext4 by lithnext.epfl.ch (NX5.67d/NX3.0M) id AA02363;
          Fri, 22 Jul 94 12:18:33 +0100
From: Jean Alexis Montignies <ja@lithnext.epfl.ch>
Message-Id: <9407221118.AA02363@lithnext.epfl.ch>
Received: by lithnext4.epfl.ch (NX5.67d/NX3.0X) id AA00603;
          Fri, 22 Jul 94 12:17:15 +0100
Date: Fri, 22 Jul 94 12:17:15 +0100
Original-Received: by NeXT.Mailer (1.100)
Pp-Warning: Illegal Received field on preceding line
Original-Received: by NeXT Mailer (1.100)
Pp-Warning: Illegal Received field on preceding line
To: www-lib@www0.cern.ch
Subject: need to link a structure to each anchor
Reply-To: ja@lithnext.epfl.ch
content-length: 446

To program my browser, I need to link any HTAnchor (child or parent) structure
with a client-specific structure, which in my case is an Objective-C object id.

If the library were in Objective-C, I would have made a subclass of the
HTAnchor class, but that's not the case ;-).

For the moment, I've modified the HTAnchor.c module to manage one more field:
'userData'.

I was wondering if anyone had a better idea for doing so.

Jean-Alexis
Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA28318; Fri, 22 Jul 1994 12:56:28 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA04773; Fri, 22 Jul 1994 12:56:26 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA09143; Fri, 22 Jul 94 12:55:37 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA28923; Fri, 22 Jul 94 12:55:47 +0200
Date: Fri, 22 Jul 94 12:55:47 +0200
Message-Id: <9407221055.AA28923@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Re: Bug repport, 2.15 library, modeule HTFile.c
content-length: 1051

Hi

Most (if not all?) of such bugs are fixed in the 2.16 (pre 2) version.
HTFile.c has been almost completely rewritten in this version. Generally,
version 2.16 has an enhanced set of functionality compared to version
2.15.

-- cheers --

Henrik Frystyk


> In the module HTFile.c from the CERN WWW library, version 2.15, there are these  
> lines.
> 
>  /*	Directory access is allowed and possible
>  */
> 		logical = HTAnchor_address((HTAnchor*)anchor);
> 		tail = strrchr(logical, '/') +1;	/* last part or "" */
> 
> but when there is no '/' in the string, strrchr returns NULL, which becomes
> 0x0001 after the +1.
> This made my browser stop on the URL 'file:'.
> 
> I've replaced the line:
>     'tail = strrchr(logical, '/') +1;	/* last part or "" */'
> by (this is a quick hack):
> 
> 	{ char *tempTail;
>           if ((tempTail = strrchr(logical, '/')) != NULL)
> 		    tail=tempTail+1;
>            else
>             tail=logical+strlen(logical);	/* no '/': point at the "" */
>         }
> 
> I've also seen that there are other similar uses of the 'strrchr' function.  
> This may cause some other bugs.
Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA00207; Fri, 22 Jul 1994 13:12:04 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA06105; Fri, 22 Jul 1994 13:12:02 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA09376; Fri, 22 Jul 94 13:11:13 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA28941; Fri, 22 Jul 94 13:11:22 +0200
Date: Fri, 22 Jul 94 13:11:22 +0200
Message-Id: <9407221111.AA28941@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Re: need to link a structure to each anchor
content-length: 2082



Jean Alexis Montignies <ja@lithnext.epfl.ch> wrote:

> To program my browser, I need to link any HTAnchor (child or parent) structure
> with a client-specific structure, which in my case is an Objective-C object id.
> 
> If the library were in Objective-C, I would have made a subclass of the
> HTAnchor class, but that's not the case ;-).
> 
> For the moment, I've modified the HTAnchor.c module to manage one more field:
> 'userData'.


The definition of a parent anchor is as follows (taken from
HTAnchor.html):

struct _HTParentAnchor {
  /* Common part from the generic anchor structure */
  HTLink	mainLink;	/* Main (or default) destination of this */
  HTList *	links;  	/* List of extra links from this, if any */
  HTParentAnchor * parent;	/* Parent of this anchor (self) */

  /* ParentAnchor-specific information */
  HTList *	children;	/* Subanchors of this, if any */
  HTList *	sources;	/* List of anchors pointing to this, if any */
  HyperDoc *	document;	/* The document within which this is an anchor */
  char * 	address;	/* Absolute address of this node */
  HTFormat	format; 	/* Pointer to node format descriptor */
  BOOL		isIndex;	/* Acceptance of a keyword search */
  char *	title;		/* Title of document */
  
  HTList*	methods;	/* Methods available as HTAtoms */
  void *	protocol;	/* Protocol object */
  char *	physical;	/* Physical address */
  HTList *	cacheItems;	/* Cache hits (see HTFWriter) for this URL */
  long int      size;           /* Indicative size only if multiformat */ 
};

The HyperDoc is an undefined structure that the client can use to hold
the necessary information about the graphical object the anchor represents
when the object is loaded and in memory. Remember that only parent
anchors have a graphical object associated with them, not child
anchors; see

	http://info.cern.ch/hypertext/WWW/Architecture/Anchors.html

for more information on anchors.

You can have a look at how the Line Mode Browser defines it in
GridText.c. Here it is called HText, but it is the same as HyperDoc.

-- cheers --

Henrik Frystyk
Return-Path: <ja@lithnext.epfl.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA03202; Fri, 22 Jul 1994 13:29:02 +0200
Received: from sicmail.epfl.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA07249; Fri, 22 Jul 1994 13:29:00 +0200
Received: from lithnext.epfl.ch by sicmail.epfl.ch with SMTP (PP) 
          id <13268-0@sicmail.epfl.ch>; Fri, 22 Jul 1994 13:28:57 +0200
Received: by lithnext.epfl.ch (NX5.67d/NX3.0M) id AA02469;
          Fri, 22 Jul 94 13:27:10 +0100
Date: Fri, 22 Jul 94 13:27:10 +0100
From: Jean Alexis Montignies <ja@lithnext.epfl.ch>
Message-Id: <9407221227.AA02469@lithnext.epfl.ch>
Original-Received: by NeXT.Mailer (1.100)
Pp-Warning: Illegal Received field on preceding line
Original-Received: by NeXT 
                   Mailer (1.100)
Pp-Warning: Illegal Received field on preceding line
To: www-lib@www0.cern.ch
Subject: Re: need to link a structure to each anchor
Reply-To: ja@lithnext.epfl.ch
content-length: 752

> The HyperDoc is an undefined structure that the client can use to put in
> necessary information of the graphical object the anchor represents
> when the object is loaded and in memory. Remember that it's only parent
> anchors that have a graphical object associated with them - not child
> anchors, see

Well, the problem is that I'm already using the document field to store the
object which holds the information presented to the user.
Also, the 'ParentAnchor' can be created and linked to the HTParentAnchor
structure even if there is no document loaded.

I could link this field to my 'ParentAnchor' object and then link my document
to this object, but I also need to link each HTChildAnchor object to a
'ChildAnchor' object.
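
One alternative that needs no change to HTAnchor.c would be a side
table kept by the client. This is a sketch only: anchor_set_client(),
anchor_get_client() and the fixed-size table are my own invention, and
a real browser would probably want a hash table instead of a linear scan.

```c
#include <stddef.h>

#define MAX_LINKS 64    /* arbitrary limit for the sketch */

typedef struct {
    const void *anchor;     /* an HTParentAnchor* or HTChildAnchor* */
    void       *client;     /* e.g. an Objective-C object id */
} AnchorLink;

static AnchorLink links[MAX_LINKS];
static size_t     link_count = 0;

/* Associate client data with an anchor; returns 1 on success, 0 if full. */
int anchor_set_client(const void *anchor, void *client)
{
    size_t i;

    for (i = 0; i < link_count; i++) {
        if (links[i].anchor == anchor) {
            links[i].client = client;   /* replace an existing entry */
            return 1;
        }
    }
    if (link_count == MAX_LINKS)
        return 0;
    links[link_count].anchor = anchor;
    links[link_count].client = client;
    link_count++;
    return 1;
}

/* Look up the client data for an anchor; NULL if none was set. */
void *anchor_get_client(const void *anchor)
{
    size_t i;

    for (i = 0; i < link_count; i++)
        if (links[i].anchor == anchor)
            return links[i].client;
    return NULL;
}
```

The table works the same way for parent and child anchors, because it
only ever looks at the pointer value.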

Jean-Alexis
Return-Path: <connolly@hal.com>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA22593; Fri, 2 Sep 1994 19:28:34 +0200
Received: from hal.COM by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA17323; Fri, 2 Sep 1994 19:28:30 +0200
Received: from ulua.hal.com by hal.com (4.1/SMI-4.1.1)
	id AA14182; Fri, 2 Sep 94 10:28:26 PDT
Received: from localhost by ulua.hal.com (4.1/SMI-4.1.2)
	id AA00867; Fri, 2 Sep 94 12:29:29 CDT
Message-Id: <9409021729.AA00867@ulua.hal.com>
To: www-talk@www0.cern.ch, www-lib@www0.cern.ch
Subject: Mail etc. syntax support in WWW _clients_
Date: Fri, 02 Sep 1994 12:29:27 -0500
From: "Daniel W. Connolly" <connolly@hal.com>
content-length: 809


In some cases, all this business of converting mail, gnu info, FAQs
etc. to html on servers adds value: folks tend to add database searching
support; they tend to make the stuff generally more accessible to WWW
clients while they're at it.

But wouldn't it make a lot more sense to support mail syntax and gnu
info syntax in WWW clients?

Every HTTP client already has code to parse RFC822 headers to deal
with HTTP return messages. The same code could be used to parse mail
messages from local disk, or from FTP servers, etc.
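
To make the point concrete, here is a minimal sketch of the kind of
routine that could be shared. It splits one unfolded "Name: value" line
and does not handle continuation lines; split_header_line() is a
hypothetical name, not an existing libwww function.

```c
#include <string.h>

/* Split an RFC822-style "Name: value" line in place.
 * Returns 1 and sets *name and *value on success, 0 if the line has
 * no field name before a colon. Trailing CR, LF and blanks are removed. */
int split_header_line(char *line, char **name, char **value)
{
    char *colon = strchr(line, ':');
    char *v, *end;

    if (colon == NULL || colon == line)
        return 0;
    *colon = '\0';              /* terminate the field name */
    v = colon + 1;
    while (*v == ' ' || *v == '\t')
        v++;                    /* skip blanks after the colon */
    end = v + strlen(v);
    while (end > v && (end[-1] == '\r' || end[-1] == '\n' ||
                       end[-1] == ' '  || end[-1] == '\t'))
        *--end = '\0';          /* trim the end of the value */
    *name  = line;
    *value = v;
    return 1;
}
```

The same call would serve an HTTP response header, a mail message read
from disk, or one fetched over FTP.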

Support for viewing message/rfc822 (with recursive MIME body part handling...)
and gnu info format object should be part of libwww.

I've done message/rfc822 support in a prototype client, so I know it can
work. And I know other folks have done info->html converters.

Hmmm...

Dan
Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA01063; Sat, 3 Sep 1994 01:11:49 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA22207; Sat, 3 Sep 1994 01:11:10 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA16354; Sat, 3 Sep 94 01:10:09 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA01061; Sat, 3 Sep 94 01:11:15 +0200
Date: Sat, 3 Sep 94 01:11:15 +0200
Message-Id: <9409022311.AA01061@ptsun03.cern.ch>
To: www-lib@www0.cern.ch, www-talk@www0.cern.ch
Subject: Re: Mail etc. syntax support in WWW _clients_
content-length: 1401



Phil Hallam Baker and I are currently working on getting a new MIME
parser into the World-Wide Web Library of Common Code - it has been on
the working list for a long time.  The idea is to get it finished this
fall. Furthermore, I am experimenting with the stream stack so that
converting `raw' data from NNTP, FTP, Gopher etc. into HTML is done
just like all the other conversions as for example text/plain ->
text/html. Then, of course, they can also be parsed as `*/*' so that the
client can get the original data sent from the remote servers.

This basically means that the library turns from being HTML-based into
a generic protocol code base with a set of independent protocol
modules. A combination of this also gives the possibility of viewing
rfc822 messages, both HTML formatted and as the original message.
Multipart, however, will not get implemented right now, simply because
of the amount of other work to be done :-(


-- cheers --

Henrik Frystyk


> But wouldn't it make a lot more sense to support mail syntax and gnu
> info syntax in WWW clients?
> 
> Every HTTP client already has code to parse RFC822 headers to deal
> with HTTP return messages. The same code could be used to parse mail
> messages from local disk, or from FTP servers, etc.
> 
> Support for viewing message/rfc822 (with recursive MIME body part handling...)
> and gnu info format object should be part of libwww.
Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA27995; Sat, 3 Sep 1994 14:18:29 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA24023; Sat, 3 Sep 1994 14:18:28 +0200
Date: Sat, 3 Sep 1994 14:18:28 +0200
Message-Id: <9409031218.AA24023@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: Re: Mail etc. syntax support in WWW _clients_
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
content-length: 457

Hi,

	Just to back all this up, as I see it the clients should also be able
to read mail stored in folders etc. For VMS this is easy. For UNIX I am tempted
to support mh only because it is very well suited to WWW integration and the
standard UNIX mails (elm like) are not.

	I went through all the hassle of doing the work for this but haven't yet
integrated it into the library. Library integration of my synthesis systems is
currently my main goal.


	Phill.
Return-Path: <JCMa@WILSON.AI.MIT.EDU>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA05017; Wed, 14 Sep 1994 18:19:46 +0200
Received: from wilson.ai.mit.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA26694; Wed, 14 Sep 1994 17:54:39 +0200
Received: from JEFFERSON.AI.MIT.EDU by WILSON.AI.MIT.EDU via INTERNET with SMTP id 11978; 14 Sep 1994 11:40:51-0400
Date: Wed, 14 Sep 1994 11:41-0400
From: John C. Mallery <JCMa@WILSON.AI.MIT.EDU>
Subject: Status Code For Overload
To: www-lib@www0.cern.ch
Cc: timbl@LCS.MIT.EDU
Message-Id: <19940914154141.7.JCMA@JEFFERSON.AI.MIT.EDU>
content-length: 345

I am adding a new server internal condition that is returned to
the client when the server is operating at capacity and cannot service
additional requests.

Talked to Tim and he thinks it's a good idea.

What should the 5xx number for it be?

502?

The string is: "Server Overloaded Now"

What is the status code for timeout? Another 5xx, please.
Return-Path: <hallam@alws.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA23852; Wed, 14 Sep 1994 18:55:57 +0200
Received: from ALWS.DECnet MAIL11D_V3 by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA09502; Wed, 14 Sep 1994 18:55:52 +0200
Date: Wed, 14 Sep 1994 18:55:52 +0200
Message-Id: <9409141655.AA09502@dxmint.cern.ch>
From: hallam@alws.cern.ch
X-Vms-To: DXMINT::JCMa@WILSON.AI.MIT.EDU,dxmint::www-lib@www0.cern.ch
X-Vms-Cc: HALLAM
Subject: RE: Status Code For Overload
X-Mail11-Ostype: VAX/VMS
Apparently-To: <www-lib@www0.cern.ch>
Apparently-To: <JCMa@WILSON.AI.MIT.EDU>
content-length: 611

John,

502 looks OK, but shouldn't there be some header giving an indication of when
a good time to retry would be?

I would prefer the string "Insufficient Resources". Overload suggests
that it's due to the amount of web traffic. It might be that the URL has been
analysed and the server has decided that that particular URL is too expensive
to retrieve at the moment but another would be OK.

I'm thinking here about chunky searches and the like being barred during the 
hours of daylight and only light requests being allowed. Or alternatively
there might be a one user at a time type resource in use.


Phill.
Return-Path: <timbl@quag.lcs.mit.edu>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA02297; Wed, 14 Sep 1994 19:16:02 +0200
Received: from hq.lcs.mit.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA12073; Wed, 14 Sep 1994 19:15:57 +0200
Received: from quag.lcs.mit.edu by hq.lcs.mit.edu (4.1/NSCS-1.0S) 
	id AA04789; Wed, 14 Sep 94 13:15:51 EDT
Received: by quag.lcs.mit.edu (NX5.67d/NX3.0S)
	id AA02688; Wed, 14 Sep 94 13:14:00 -0400
Date: Wed, 14 Sep 94 13:14:00 -0400
From: Tim Berners-Lee <timbl@quag.lcs.mit.edu>
Message-Id: <9409141714.AA02688@quag.lcs.mit.edu>
Received: by NeXT.Mailer (1.100)
Received: by NeXT Mailer (1.100)
To: John C. Mallery <JCMa@wilson.ai.mit.edu>
Subject: HTTP Re: Status Code For Overload
Cc: www-lib@www0.cern.ch, timbl@lcs.mit.edu
Reply-To: Tim Berners-Lee <timbl@hq.lcs.mit.edu>
content-length: 1702


Henrik,

Let me elaborate a bit on John's message.
He wants a couple of new HTTP status codes put provisionally
into the spec.  I suggest you change
http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTRESP.html
to include them with a "Provisional -- under discussion"
comment.  The meanings as I see them are, assuming that
502 and 503 have not been taken:

Service overloaded now  502

	The server cannot process the request due to a
	high load (whether HTTP servicing or other requests).
	The implication is that this is a temporary condition
	which may be alleviated at other times.

Gateway timeout 503  (Is this right John?)

	This is equivalent to Internal Error 500, but in the case of
	a server which is in turn accessing some other service, this
	indicates that the response from the other service did not
	return within a time that the gateway was prepared to wait.
	As, from the point of view of the client and the HTTP
	transaction, the other service is hidden within the server,
	this may be treated identically to Internal Error 500,
	but has more diagnostic value.

You can see from my wording that I feel that 502 is important
and 503 just useful.
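
As a sketch of how a client might carry these around while they are
under discussion: the numbers and strings below are only the
provisional ones proposed in this message and should not be read as
final assignments, and status_reason() and status_is_temporary() are
hypothetical names.

```c
#include <stddef.h>

typedef struct {
    int         code;
    const char *reason;
} StatusEntry;

/* Provisional values as proposed in this thread; subject to change. */
static const StatusEntry provisional_status[] = {
    { 500, "Internal Error" },
    { 502, "Service overloaded now" },
    { 503, "Gateway timeout" }
};

/* Return the reason string for a code, or NULL if it is not in the table. */
const char *status_reason(int code)
{
    size_t i;
    size_t n = sizeof provisional_status / sizeof provisional_status[0];

    for (i = 0; i < n; i++)
        if (provisional_status[i].code == code)
            return provisional_status[i].reason;
    return NULL;
}

/* Only the overload condition implies that trying again later may help. */
int status_is_temporary(int code)
{
    return code == 502;
}
```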

Tim

Begin forwarded message:

Date: Wed, 14 Sep 1994 11:41-0400
From: John C. Mallery <JCMa@wilson.ai.mit.edu>
Subject: Status Code For Overload
To: www-lib@www0.cern.ch
Cc: timbl@lcs.mit.edu

I am adding a new server internal condition that is returned to
the client when the server is operating at capacity and cannot
service additional requests.

Talked to Tim and he thinks it's a good idea.

What should the 5xx number for it be?

502?

The string is: "Server Overloaded Now"

What is the status code for timeout? another 5xx please.

Return-Path: <JCMa@WILSON.AI.MIT.EDU>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA11725; Wed, 14 Sep 1994 20:38:13 +0200
Received: from wilson.ai.mit.edu by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA23225; Wed, 14 Sep 1994 20:36:57 +0200
Received: from JEFFERSON.AI.MIT.EDU by WILSON.AI.MIT.EDU via INTERNET with SMTP id 12011; 14 Sep 1994 14:24:15-0400
Date: Wed, 14 Sep 1994 14:25-0400
From: John C. Mallery <JCMa@WILSON.AI.MIT.EDU>
Subject: 502 "Insufficient Resources"
To: hallam@alws.cern.ch
Cc: Henrik Frystyk Nielsen <frystyk@ptsun00.cern.ch>, www-lib@www0.cern.ch
In-Reply-To: <9409141655.AA09502@dxmint.cern.ch>
Message-Id: <19940914182504.2.JCMA@JEFFERSON.AI.MIT.EDU>
content-length: 1130

    Date: Wed, 14 Sep 1994 12:55 EDT
    From: hallam@alws.cern.ch

    John,

    502 looks OK, but shouldn't there be some header giving an indication of when
    a good time to retry would be?
OK, 502 it is.

    I would prefer the string "Insufficient Resources". Overload suggests
    that it's due to the amount of web traffic. It might be that the URL has been
    analysed and the server has decided that that particular URL is too expensive
    to retrieve at the moment but another would be OK.

OK. "Insufficient Resources" it is.  We're going to be melted down next week
when our hack becomes real.

    I'm thinking here about chunky searches and the like being barred during the 
    hours of daylight and only light requests being allowed. Or alternatively
    there might be a one user at a time type resource in use.

This could be applied there as well.  The immediate application is to ask
people to try again later because the server is running at capacity.
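
A minimal Python sketch of that immediate application, seen from the client side: on a 502 the client waits and tries again. The function fetch_with_retry, its parameters, and the fetch callable are all hypothetical, and the delay is an arbitrary assumption, since no retry header has been defined in this thread.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay=0.0, sleep=time.sleep):
    """Call fetch() until it returns something other than 502.

    fetch    -- hypothetical callable returning a numeric status code
    attempts -- maximum number of tries (illustrative default)
    delay    -- seconds to wait between tries (arbitrary assumption)
    sleep    -- injectable so the wait can be skipped in tests
    Returns the last status code seen.
    """
    status = None
    for attempt in range(attempts):
        status = fetch()
        if status != 502:
            return status
        # Only wait if another attempt is still to come.
        if attempt < attempts - 1:
            sleep(delay)
    return status
```

If every attempt comes back 502, the caller gets 502 and can tell the user to try again later.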

You don't see me here much because, anti-social that I am, I just write
everything from scratch in Lisp.  See my CL-HTTP paper for the latest sources.
Return-Path: <frystyk@ptsun00.cern.ch>
Received: from dxmint.cern.ch by www0.cern.ch (5.x/SMI-4.0)
	id AA21536; Mon, 19 Sep 1994 11:01:58 +0200
Received: from ptsun00.cern.ch by dxmint.cern.ch (5.65/DEC-Ultrix/4.3)
	id AA23352; Mon, 19 Sep 1994 11:01:53 +0200
Received: from ptsun03.cern.ch by ptsun00.cern.ch (4.1/SMI-4.1)
	id AA18346; Mon, 19 Sep 94 11:00:53 +0200
From: frystyk@ptsun00.cern.ch (Henrik Frystyk Nielsen)
Received: by ptsun03.cern.ch (4.1/client-1.5)
	id AA03877; Mon, 19 Sep 94 11:02:17 +0200
Date: Mon, 19 Sep 94 11:02:17 +0200
Message-Id: <9409190902.AA03877@ptsun03.cern.ch>
To: www-lib@www0.cern.ch
Subject: Re: HTTP Re: Status Code For Overload
content-length: 1460

Hi

I have now added the two extra return codes to the specs at

	http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTRESP.html

I would also like a code for

	INVALID RESPONSE

if the server sends nonsense back to the client. Actually this code
is already in the library, but it has not been documented :-(
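
To make the idea concrete, here is a minimal Python sketch of how a client might decide that a server has sent nonsense. The function name and the regular expression are assumptions for illustration only. This is not the check the library performs, and no number for the code is assumed.

```python
import re

# Expected shape: "HTTP/<major>.<minor> <3-digit code> <optional reason>"
_STATUS_LINE = re.compile(r"^HTTP/\d+\.\d+ (\d{3})(?: .*)?$")


def parse_status_line(line):
    """Return the numeric status code, or None for an invalid response."""
    match = _STATUS_LINE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    return int(match.group(1))
```

A caller would map a None result to whatever code is eventually agreed for INVALID RESPONSE.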

Comments???

-- cheers --

Henrik

> Let me elaborate a bit on John's message.
> He wants a couple of new HTTP status codes put provisionally
> into the spec.  I suggest you change
> http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTRESP.html
> to include them with a "Provisional -- under discussion"
> comment.  The meanings as I see them are, assuming that
> 502 and 503 have not been taken,
> 
> Service overloaded now  502
> 
> 	The server cannot process the request due to a
> 	high load (whether HTTP servicing or other requests).
> 	The implication is that this is a temporary condition
> 	which may be alleviated at other times.
> 
> Gateway timeout 503  (Is this right John?)
> 
> 	This is equivalent to Internal Error 500, but in the case of
> 	a server which is in turn accessing some other service, this
> 	indicates that the response from the other service did not
> 	return within a time that the gateway was prepared to wait.
> 	As from the point of view of the client and the HTTP
> 	transaction the other service is hidden within the server,
> 	this may be treated identically to Internal Error 500,
> 	but has more diagnostic value.
Received on Tuesday, 12 July 1994 13:11:49 GMT

This archive was generated by hypermail 2.2.0+W3C-0.50 : Monday, 23 April 2007 18:18:24 GMT