Re: Where's the parallelize step?

From: Henry S. Thompson <ht@inf.ed.ac.uk>
Date: Mon, 20 Apr 2009 13:01:07 +0100
To: "Costello, Roger L." <costello@mitre.org>
Cc: "'xproc-dev@w3.org'" <xproc-dev@w3.org>
Message-ID: <f5b8wlvh6v0.fsf@hildegard.inf.ed.ac.uk>

Costello, Roger L. writes:

> Why doesn't XProc have a parallelize step? That is, a step that
> enables two subpipelines to proceed in parallel. Is there any
> discussion of adding a parallelize step to XProc?

If I've understood you correctly, the answer is "because it doesn't
need one".  The semantics of XProc do not require the evaluation of
the steps in a pipeline to be any more serialised than their explicit
dependencies require.  So it's open to implementations to parallelise
as much as they like/can.

I have in the past used the following as a sort of _aide memoire_:

  It should be possible to implement XProc by starting separate
  threads for _every_ step in the controlling pipeline, and letting
  input/output/parameter ports control the actual order of execution.

I believe it is

  a) still the case that the above will work;

  b) implicit in the above that if you have multiple processors, you
     will get parallel execution where the above story allows for it.
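
To make the aide memoire concrete, here is a minimal sketch in Python
(not XProc, and not taken from any actual implementation).  It assumes
a purely linear pipeline in which each step is an ordinary function
from one document to one document; the names run_step, run_pipeline
and DONE are invented for the illustration.  Every step gets its own
thread, and the queues standing in for ports are the only thing that
orders execution:

```python
import queue
import threading

DONE = object()  # hypothetical sentinel marking the end of a document sequence


def run_step(func, inbox, outbox):
    """Apply func to each document arriving on inbox; forward results to outbox."""
    while True:
        doc = inbox.get()      # blocks until the upstream step delivers something
        if doc is DONE:
            outbox.put(DONE)   # propagate end-of-sequence downstream
            return
        outbox.put(func(doc))


def run_pipeline(steps, documents):
    """Start one thread per step; queues play the role of the connecting ports."""
    ports = [queue.Queue() for _ in range(len(steps) + 1)]
    threads = [
        threading.Thread(target=run_step, args=(func, ports[i], ports[i + 1]))
        for i, func in enumerate(steps)
    ]
    for t in threads:
        t.start()
    # Feed the pipeline's primary input, then signal end-of-sequence.
    for doc in documents:
        ports[0].put(doc)
    ports[0].put(DONE)
    # Drain the final output port.
    results = []
    while True:
        doc = ports[-1].get()
        if doc is DONE:
            break
        results.append(doc)
    for t in threads:
        t.join()
    return results
```

For example, run_pipeline([str.strip, str.upper], [" a ", " b "])
lets the second step start on the first document while the first step
is still working on the second.  A real pipeline is a graph rather
than a chain, with multiple ports per step, so this shows only the
principle.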

There is at least one case where a smart implementation can
parallelise which the above would not immediately uncover.  The
execution of the sub-pipeline of a p:for-each should in principle be
parallelisable across the different inputs to the p:for-each (provided
the steps of the sub-pipeline have no side-effects).

I can imagine a few extension attributes:

 1) (boolean)pext:no-side-effects
 2) (boolean)pext:output-reorder-ok

The former would apply to any step, the latter to p:for-each, where it
would mean that the order of the documents in its output sequence need
not match that of their corresponding input documents.  I'm not
actually sure in practice if
this would help -- it might turn out to be more efficient, as well as
easier to implement, to require each (presumed independent) thread of
a parallelised for-each to buffer/suspend its output until the thread
for the document 'before' its own has completed.  It follows that the
benefit of parallelisation for p:for-each would be limited unless the
ratio of computation to output involved was high. . .
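
The two alternatives can be sketched in Python, again purely as an
illustration: the function names are invented, and the sub-pipeline is
assumed to be a side-effect-free function of a single document.  The
first version keeps input order by holding back each finished result
until the earlier ones have been delivered; the second corresponds to
what the hypothetical pext:output-reorder-ok would permit:

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def for_each_ordered(subpipeline, documents, workers=4):
    """Run subpipeline on every document in parallel; results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in the order of the inputs, so a result
        # that finishes early waits until all earlier ones have been yielded.
        return list(pool.map(subpipeline, documents))


def for_each_reorder_ok(subpipeline, documents, workers=4):
    """Same work, but results are collected in whatever order they complete."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(subpipeline, doc) for doc in documents]
        return [f.result() for f in as_completed(futures)]
```

Both versions do the same computation in parallel; they differ only in
how long a finished result may have to sit in a buffer, which is why
the ordered form costs little when computation dominates output.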

In any case, I think we have all the room we need to explore this
space w/o any explicit steps, but maybe James's archive search will
uncover something I've missed. . .

