Re: How to save the copy of source data & RE: stream return codes in HTTee from Henrik Frystyk Nielsen on 1996-01-19 (www-lib@w3.org from January to March 1996)

From: Henrik Frystyk Nielsen <frystyk@w3.org>
Date: Fri, 19 Jan 1996 10:40:38 -0500
To: Maciej Puzio <puzio@zodiac1.mimuw.edu.pl>
Cc: "'Henrik Frystyk Nielsen'" <frystyk@w3.org>, "'WWW Library Mailing List'" <www-lib@w3.org>
Message-Id: <9601191540.AA05346@www20>
Maciej Puzio writes:

> > > But the upstream module can then check for both codes and then decide what to 
> > > do... The important thing is that both codes (or in general all codes in case 
> > > of multiple T streams) get propagated upstream.
> > >
> > > The resolving function could also be a direct part of the 
> > > upstream module which will be activated each time the T stream returns. As I 
> > > see it, both ways will do, but maybe a callback function is a more flexible 
> > > solution.
> >
> > I've got a point of view on it, but I need to think a little.
> > I'll write to you soon.
> 
> I've just thought a little. I've also found another problem which is interesting in
> itself, but is also a good argument in our discussion about HTTee.
> 
> BTW: I'm not sure whether the HTTee problem deserves such a long discussion.
> It's getting more and more interesting for me, but if you are bored or don't have time,
> please let me know. This will save work and time for both of us. :-)

Actually I think that your solution of having a callback function that can 
resolve any return code conflict is a good idea - patches are welcome ;-)
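
As a rough illustration of the callback idea, here is a minimal sketch in plain C. 
It assumes a simplified stand-in stream type: Stream, Tee, TeeResolver, tee_init 
and the ST_* codes are invented for this example and are not the library's API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; the library's real return codes are not assumed. */
enum { ST_OK = 0, ST_WOULD_BLOCK = 1, ST_ERROR = -1 };

/* Minimal stand-in for a stream: a single "put" method. */
typedef struct Stream Stream;
struct Stream {
    int (*put_block)(Stream *self, const char *buf, int len);
};

/* Callback that merges the results of both branches into one upstream result. */
typedef int (*TeeResolver)(int result1, int result2);

typedef struct {
    Stream base;        /* first member, so a Tee* can be used as a Stream* */
    Stream *s1, *s2;    /* the two branches */
    TeeResolver resolve;
} Tee;

/* Default policy: an error wins, then "would block", otherwise OK. */
int resolve_default(int r1, int r2)
{
    if (r1 == ST_ERROR || r2 == ST_ERROR) return ST_ERROR;
    if (r1 == ST_WOULD_BLOCK || r2 == ST_WOULD_BLOCK) return ST_WOULD_BLOCK;
    return ST_OK;
}

/* Alternative policy: only the first branch counts (e.g. a failed save
   on the second branch should not abort presentation). */
int resolve_first_only(int r1, int r2)
{
    (void)r2;
    return r1;
}

/* Feed both branches, then let the callback decide what goes upstream. */
int tee_put_block(Stream *self, const char *buf, int len)
{
    Tee *t = (Tee *)self;
    int r1 = t->s1->put_block(t->s1, buf, len);
    int r2 = t->s2->put_block(t->s2, buf, len);
    return t->resolve(r1, r2);
}

/* A NULL resolver selects the default policy. */
void tee_init(Tee *t, Stream *s1, Stream *s2, TeeResolver resolve)
{
    assert(t != NULL && s1 != NULL && s2 != NULL);
    t->base.put_block = tee_put_block;
    t->s1 = s1;
    t->s2 = s2;
    t->resolve = resolve ? resolve : resolve_default;
}

/* Two trivial sinks used only to exercise the tee. */
int put_ok(Stream *self, const char *buf, int len)
{
    (void)self; (void)buf; (void)len;
    return ST_OK;
}

int put_fail(Stream *self, const char *buf, int len)
{
    (void)self; (void)buf; (void)len;
    return ST_ERROR;
}
```

With this shape the code pumping data into the tee sees a single result, and 
the policy for mixed results is chosen by whoever creates the tee.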
 
> The problem:
> 
> In my simple WWW browser all HTML documents were handled by the simple
> stream stack, which led from the network through the MIME parser to the HTML 
> module presenting the document to the user. This stream had only one Tee, 
> before the MIME parser, which pushed the copy of the data to the cache. One day 
> however I decided to implement the "Save as HTML" menu command in my 
> browser. For that I needed a copy of the unparsed (but without headers) source 
> data. I didn't want to rely on the data from the cache, because the user of my 
> browser can disable it.
> 
> Below I give my solution to the problem. I'm not sure whether a simpler
> solution exists. Perhaps this can be done in some really trivial way? 

Well, if you want multiple outputs at the same time then the T stream is the 
only solution.
 
> The solution:
> 
> What I needed was a Tee after the MIME parser. But how to insert it into the stream
> stack which is created automatically according to the set of converters assigned to
> the request? To do that I created the "converter tee".
> 
> HTStream* HTMLPresentAndSave ( 
>         HTRequest*	request, 
>         void*		param, 
>         HTFormat		input_format, 
>         HTFormat		output_format, 
>         HTStream*		output_stream) 
> { 
>     return HTTee 
>         (HTMLPresent (request, param, input_format, output_format, output_stream), 
>          HTSaveAndCallBack (request, param, input_format, output_format,
>         		output_stream));
> }
> 
> Then I used this new converter instead of HTMLPresent:
> 
>     HTConversion_add(c,"text/html", "www/present", HTMLPresentAndSave,  
>     	1.0, 0.0, 0.0);
> 
> I did a similar trick for the HTPlainPresent converter.
> 
> BTW: I used HTSaveAndCallBack here to save the copy of the source data to
> a file. I know it has been removed from the library, but I reintroduced it in my copy.
> This converter is really useful. I know that using streams everywhere is better than
> passing data through a file, but the operating system doesn't know this and 
> in some cases requires a file name. HTSaveAndCallBack is also useful as a
> temporary file writer (that is the role I used it in above; HTFWriter is no good since it's
> not a converter).

I know - it was for lack of time that I temporarily disabled it when 
I isolated the cache mechanism. I thought it more important to get the cache 
out of the core than to preserve it at the moment, but I will put it back in 
(in a cache independent implementation).

> Instead of writing to a file I could have used the HTXParse converter, which would
> put the data in the memory buffer.
> 
> This solution works perfectly, but has some drawbacks:
> 
> 1. I'm not able to write a generic "converter tee". That is, I have to define very
> similar converter tee functions for every pair of converters I want to be tee'ed.
> This is because I can only give a function pointer in HTConversion_add and
> I can't pass the converter any parameters.

This is a design limitation of the streams. Passing a parameter would be very 
useful to distinguish slightly different streams. I have thought of that for 
some time but didn't get it done.
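
To make the idea concrete, here is a minimal sketch in plain C of what a 
per-registration parameter could look like. Everything in it is hypothetical 
(Stream, Converter, ConverterEntry, TeeContext and the converter functions are 
invented for illustration and do not match the library's converter signature); 
the point is only that one generic tee converter can serve any pair once a 
context pointer travels with the function pointer.

```c
#include <assert.h>
#include <stdlib.h>

typedef enum { KIND_PRESENT, KIND_SAVE, KIND_TEE } Kind;

/* Stand-in stream: records what built it and, for a tee, its two branches. */
typedef struct Stream {
    Kind kind;
    struct Stream *branch1, *branch2;
} Stream;

/* Hypothetical converter signature with an extra context parameter. */
typedef Stream *(*Converter)(void *context, Stream *output);

/* What a registration would store: the function plus its parameter. */
typedef struct {
    Converter make;
    void *context;      /* handed back to make() unchanged */
} ConverterEntry;

/* Parameter block for the generic tee converter. */
typedef struct {
    ConverterEntry first, second;
} TeeContext;

Stream *new_stream(Kind kind, Stream *b1, Stream *b2)
{
    Stream *s = malloc(sizeof *s);
    if (s == NULL) abort();
    s->kind = kind;
    s->branch1 = b1;
    s->branch2 = b2;
    return s;
}

/* Frees a stream and, recursively, any branches it owns. */
void free_stream(Stream *s)
{
    if (s == NULL) return;
    free_stream(s->branch1);
    free_stream(s->branch2);
    free(s);
}

/* Two ordinary converters that ignore their context. */
Stream *present_converter(void *context, Stream *output)
{
    (void)context; (void)output;
    return new_stream(KIND_PRESENT, NULL, NULL);
}

Stream *save_converter(void *context, Stream *output)
{
    (void)context; (void)output;
    return new_stream(KIND_SAVE, NULL, NULL);
}

/* One generic tee converter: the pair to combine arrives via context. */
Stream *tee_converter(void *context, Stream *output)
{
    TeeContext *tc = context;
    assert(tc != NULL);
    Stream *s1 = tc->first.make(tc->first.context, output);
    Stream *s2 = tc->second.make(tc->second.context, output);
    return new_stream(KIND_TEE, s1, s2);
}
```

A registration would then carry { tee_converter, &some_tee_context } instead of 
a hand-written function per pair of converters.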

> 2. I don't have any access to the results of the converter's work. For example, the
> only way I can get the file name used for saving the data by HTSaveAndCallBack
> is the callback function I assign to the request. This means that I can
> have only one callback function per request. In my application I use only one
> HTSaveAndCallBack per request, so this doesn't matter, but it's possible to
> have several HTSaveAndCallBacks used for different purposes in one request.
> What then? It's even worse in the HTXParse case. To return the pointer to the
> memory buffer it uses only one callback function defined in the HTEPtoCl module
> (so we can only define one callback function in the whole application).
> 
> How to improve this?
> 
> My proposition:
> 
> Converters are now functions returning streams. I propose to make them objects.
> 
> Below I use C++ code to make my ideas more readable. Of course, it has
> to be converted to plain C before using it in the library. Please consider the following
> as pseudocode.
> 
> class HTConverter
> {
> public:
> 	virtual HTStream* CreateStream (
> 	        HTRequest*	request, 
> 	        void*		param, 
> 	        HTFormat		input_format, 
> 	        HTFormat		output_format, 
> 	        HTStream*		output_stream) = 0;
> };

plus of course the extra parameter ;-)
 
> Some examples of derived classes:
> 
> class HTSaveAndCallBack : public HTConverter
> {
> public:
> 	HTSaveAndCallBack()  { ...initialize... }
> 
> 	HTStream* CreateStream ( ...parameters as above... )
> 		{ ...do what HTSaveAndCallBack function does now... }
> 
> 	//callback for the stream, we can also make it a parameter to the constructor
> 	virtual void OnStreamClose (HTRequest* request, char* filename);
> 
> };
> 
> class HTConverterTee : public HTConverter
> {
> 	HTConverter* converter1;
> 	HTConverter* converter2;
> public:
> 	HTConverterTee (HTConverter* conv1, HTConverter* conv2)
> 		{ ...initialize... }
> 
> 	HTStream* CreateStream ( ...parameters as above... )
> 		{ return HTTee (
> 			converter1->CreateStream ( ...arguments... ),
> 			converter2->CreateStream ( ...arguments... )); }
> 
> 	//callback to resolve Tee result codes, we can also make it a parameter
> 	//to the constructor
> 	int ResolveResults (int result1, int result2)
> 		{ ...default implementation... }
> };
> 
> These converter objects would be created before assigning to the request and
> would be destroyed together with the request.
> 
> Drawback: this looks beautiful in C++, but in C? I'd rather not think about it. :-)

Very nice idea - I'll have to think a bit more about how to "translate" it to 
C. Another way is of course to switch to C++. I would very much like some input 
on how that would be received.
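
One common way to do such a translation is a struct with function pointers, 
with the "derived" struct embedding the base as its first member. The sketch 
below is only an illustration under that assumption: Converter, SaveConverter, 
Stream and all function names are invented here and are not library code. It 
also shows how each converter object can carry its own close callback, so 
several savers in one request would not have to share a single callback.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in stream: a close method plus a link to the converter that made it. */
typedef struct Stream Stream;
struct Stream {
    void (*close)(Stream *self);
    void *owner;
};

/* "Base class": the function pointer plays the role of the pure
   virtual CreateStream(). */
typedef struct Converter Converter;
struct Converter {
    void (*create_stream)(Converter *self, Stream *out);
};

/* "Derived class": embeds the base first and adds per-object state. */
typedef struct SaveConverter SaveConverter;
struct SaveConverter {
    Converter base;
    const char *filename;
    void (*on_close)(SaveConverter *self, const char *filename);
    void *user_data;    /* opaque pointer for the callback's own use */
};

/* Closing the stream notifies the converter object that created it. */
void save_stream_close(Stream *self)
{
    SaveConverter *sc = self->owner;
    if (sc->on_close != NULL)
        sc->on_close(sc, sc->filename);
}

/* Implementation of create_stream() for SaveConverter. */
void save_create_stream(Converter *self, Stream *out)
{
    SaveConverter *sc = (SaveConverter *)self;  /* base is the first member */
    out->close = save_stream_close;
    out->owner = sc;
}

/* "Constructor": wires up the function pointer and stores the callback. */
void save_converter_init(SaveConverter *sc, const char *filename,
                         void (*on_close)(SaveConverter *, const char *),
                         void *user_data)
{
    assert(sc != NULL && filename != NULL);
    sc->base.create_stream = save_create_stream;
    sc->filename = filename;
    sc->on_close = on_close;
    sc->user_data = user_data;
}

/* Example callback: stores the reported file name where user_data points. */
void remember_filename(SaveConverter *self, const char *filename)
{
    *(const char **)self->user_data = filename;
}
```

A converter tee would be a second "derived" struct holding two Converter 
pointers, with its resolver stored the same way as on_close is here.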

> Now let me go back to the HTTee result codes problem.
> The "converter tee" is an example of the situation when the code pumping data
> to the stream doesn't know whether it's pumping it to a Tee or to something else.
> That's why it can't resolve which result is more important and what to do if it gets
> success and failure together. That's the argument for passing the resolving function 
> to a tee creation function as a parameter (callback). It would also be passed to the
> converter tee (as a callback). In the example above it is a virtual member function 
> (to override), but in C it will be difficult to program, I think.
> 
> Thanks a lot if you've managed to get to this line!  :-)

Thanks for the many good suggestions - they are really useful!


-- 

Henrik Frystyk Nielsen, <frystyk@w3.org>
World-Wide Web Consortium, MIT/LCS NE43-356
545 Technology Square, Cambridge MA 02139, USA
Received on Friday, 19 January 1996 10:41:12 UTC