- From: Alex Muir <alex.g.muir@gmail.com>
- Date: Tue, 11 May 2010 09:31:28 +0000
- To: Toman_Vojtech@emc.com
- Cc: xproc-dev@w3.org
This is really interesting, and the frame of reference, that developers
have not yet had time to optimize, is clearly important.

In the case of a for-each loop that reads input files, processes each one,
stores the output, and builds a result set of the files, would an
implementation be able to easily determine in all cases that the memory is
no longer required, given that, for example, a developer could be storing
the data and/or merging it?

Until such optimizations are completed, would it be easy to create a
"reclaim memory" step of some kind, which perhaps reclaims the memory of a
given port and sinks it? Something like <p:sinkMem>:

<p:declare-step>
  <p:input port="source"/>
  <p:output port="result" sequence="true"/>
  <p:directory-list/>
  <p:for-each name="forEachFile">
    <p:xslt version="1.0" name="LoadData"/>
    <p:xslt version="1.0" name="Processing"/>
    <p:store name="store"/>
    <p:identity>
      <p:input port="source">
        <p:pipe step="store" port="result"/>
      </p:input>
    </p:identity>
    <p:sinkMem/>
  </p:for-each>
  <p:documentation>Wrap result XML</p:documentation>
  <p:wrap-sequence wrapper="forEachFile"/>
  <p:identity/>
</p:declare-step>

Regards
Alex

On Mon, May 10, 2010 at 9:16 AM, <Toman_Vojtech@emc.com> wrote:
>> That's pretty well as far as I had worked out - whilst the memory problem
>> I've been seeing is *irritating*, it can't really be described as simply
>> wrong (complexly wrong perhaps) because there is nothing simple that can
>> be used to determine if a file loaded using p:load can be discarded.
>
> After the static analysis phase, the XProc processor has a pretty good
> picture of what the connections in the pipeline look like and when/where
> results of XProc steps are used (if they are used at all). That
> knowledge alone can be used for various memory optimizations.
> For instance, in a pipeline like this one:
>
> <p:pipeline>
>   <step1/>
>   <step2/>
> </p:pipeline>
>
> the result of step1 can almost certainly be discarded after step2 has
> finished because there is no other step that refers to the result of
> step1.
>
> Another thing is scoping of steps. By wrapping a step in, for example, a
> p:group, you can very easily restrict the visibility of the results
> produced by the steps in the sub-pipeline:
>
> ...
> <p:group>
>   <step1/>
>   <step2/>
> </p:group>
> ...
>
> In the above example, the results produced by step1 can be discarded
> once p:group has finished because they will be in an inaccessible scope.
>
> And so on. There are many optimizations that XProc processors can do,
> but I think the implementers are just entering this stage after having
> implemented the standard. For instance, EMC's Calumet (which I am
> involved with) does not yet detect when the result of a step is no longer
> used, but it does release the documents when they go out of scope.
>
> Regards,
> Vojtech
>
> --
> Vojtech Toman
> Principal Software Engineer
> EMC Corporation
> toman_vojtech@emc.com
> http://developer.emc.com/xmltech
>
>

--
Alex

An informal recording with one mic under a tree leads to some pretty sweet
acoustic sounds.
https://sites.google.com/site/greigconteh/albums/diabarte-and-sons
Received on Tuesday, 11 May 2010 09:32:01 UTC
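
The last-use idea described in the quoted message (a step's result can be discarded once no later step refers to it) can be sketched roughly as follows. This is only an illustration in Python, not how Calumet or any other XProc processor is implemented. The function name `release_points` and the `order`/`reads` data layout are hypothetical, and the sketch assumes the steps have already been put into execution order by static analysis.

```python
from typing import Dict, List, Optional


def release_points(order: List[str],
                   reads: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Map each step to the step after which its result can be discarded.

    order: step names in execution order (assumed to be known already).
    reads: for each step, the names of the steps whose results it reads.
    A value of None means that no step reads the result at all.
    """
    last_reader: Dict[str, Optional[str]] = {name: None for name in order}
    for consumer in order:
        for producer in reads.get(consumer, []):
            # Later consumers overwrite earlier ones, so the final value
            # is the last step in execution order that reads the producer.
            last_reader[producer] = consumer
    return last_reader


# The two-step pipeline from the quoted message: step2 reads step1.
pipeline = ["step1", "step2"]
connections = {"step2": ["step1"]}
plan = release_points(pipeline, connections)
print(plan)  # {'step1': 'step2', 'step2': None}
```

Here the result of step1 is marked as releasable once step2 has finished. A real processor would also have to treat results bound to the pipeline's own output ports as still in use, which this sketch does not model.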