
Re: Memory usage

From: Alex Muir <alex.g.muir@gmail.com>
Date: Mon, 10 May 2010 08:21:03 +0000
Message-ID: <AANLkTikZ2xCDJCOdkiFH8nU5BJLqQsNBJ7tjGfhD-p4x@mail.gmail.com>
To: Norman Walsh <ndw@nwalsh.com>
Cc: XProc Dev <xproc-dev@w3.org>
Norm, just wondering: does p:sink call the saxon:discard-document()
function in Calabash?

Should a p:for-each loop that reads in files, applies XSLT and writes out
files have a p:sink at the end of the loop to reclaim memory?

Regards
Alex


On Mon, May 10, 2010 at 7:21 AM, Philip Fennell
<Philip.Fennell@marklogic.com> wrote:
> Nic,
>
> Just a suggestion, but depending on which version of Calabash (and therefore of Saxon) you are using, you may have access to the Saxon extension functions. If so, you could try weaving the saxon:discard-document() function into your transforms. I have used it in the past when transforming many large documents. I know this does not necessarily treat the underlying problem, but it may act as a workaround for now.
>
>
> Regards
>
> Philip Fennell
>
> ________________________________________
> From: xproc-dev-request@w3.org [xproc-dev-request@w3.org] On Behalf Of Nic Gibson [nicg@corbas.net]
> Sent: 07 May 2010 11:22
> To: XProc Dev
> Subject: Memory usage
>
> We're running an XProc script through Calabash that shows increasing memory
> usage over time. I suspect that this is to be expected under the circumstances
> but I wanted to check and see if anyone can suggest a mitigating action.
>
> The script takes an XML file containing (basically) a list of file
> URLs. Each of these URLs is a directory on the local filesystem. All XML
> files in each directory are read using p:load then transformed using
> several XSLT pipelines. The whole script is basically two big nested
> p:for-each loops (one to read directories and a nested one to read
> and process the files found)
>
> As this runs the memory usage goes up for each file loaded and, eventually,
> the JVM kills the process with a heap exhaustion error.
>
> I suspect that there is nothing in the script above that might indicate
> to Calabash that any file can be discarded so each one is held in memory until
> the end of the script. Is that likely? I'm not exactly a skilled Java
> programmer so I'm not in a position to read the code.
>
> Can anyone see any sensible approach that might allow us to run this
> script over several thousand XML files when it currently dies after around
> nine?
>
> cheers
>
> nic
>
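
For reference, a minimal sketch of the saxon:discard-document() workaround
Philip mentions, written as an XSLT 2.0 stylesheet. The "uris" parameter, the
"main" entry template and the ".out.xml" output naming are hypothetical and
not from the thread. It assumes a Saxon version and edition in which the
saxon: extension functions are available, and it only affects documents that
the stylesheet itself loads with doc(); whether it helps for documents that
Calabash loads through p:load is exactly the open question above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:saxon="http://saxon.sf.net/"
    exclude-result-prefixes="xs saxon">

  <!-- Hypothetical parameter: the URIs of the documents to process. -->
  <xsl:param name="uris" as="xs:string*"/>

  <!-- Hypothetical entry point, invoked as the initial named template. -->
  <xsl:template name="main">
    <xsl:for-each select="$uris">
      <!-- One output file per input; the naming scheme is an assumption. -->
      <xsl:result-document href="{.}.out.xml">
        <!-- saxon:discard-document() returns the document unchanged but
             removes it from Saxon's document pool, so it becomes eligible
             for garbage collection once nothing else references it. -->
        <xsl:apply-templates select="saxon:discard-document(doc(.))"/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

  <!-- Identity transform as a placeholder for the real processing. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The point of the design is that each document is read, transformed and
released inside a single loop iteration, rather than accumulating in the
pool for the lifetime of the transformation.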



-- 
Alex

An informal recording with one mic under a tree leads to some pretty
sweet acoustic sounds.
https://sites.google.com/site/greigconteh/albums/diabarte-and-sons
Received on Monday, 10 May 2010 08:21:39 GMT
