- From: Innovimax W3C <innovimax+w3c@gmail.com>
- Date: Mon, 30 Apr 2012 18:15:18 +0200
- To: Norman Walsh <ndw@nwalsh.com>
- Cc: public-xml-processing-model-wg@w3.org
- Message-ID: <CAAK2GfEZbM+adv6mPSidb+hUm6JVJ4HtszgDT=tV-vpiXBNRXA@mail.gmail.com>
We can also add NVDL as a use case for "*" outputs and "*" inputs.

Mohamed

On Thu, Apr 26, 2012 at 3:31 PM, Norman Walsh <ndw@nwalsh.com> wrote:
> Per my action from last week...
>
> Part of my plan for (re)implementing my XProc processor involves performing
> more aggressive graph analysis. This has two benefits: first, I'll be able
> to establish thread boundaries and do multi-threaded processing and second,
> I'll be able to identify (sub)pipelines that can be streamed.
>
> In order to make the graph more amenable to this sort of streaming and
> rewriting, I'm transforming the user's pipeline into something with
> explicit steps for actions like splitting.
>
> Consider this pipeline fragment:
>
>   <p:identity name="root"/>
>
>   <p:identity name="branch1">
>     <p:input port="source">
>       <p:pipe step="root" port="result"/>
>     </p:input>
>   </p:identity>
>
>   <p:identity name="branch2">
>     <p:input port="source">
>       <p:pipe step="root" port="result"/>
>     </p:input>
>   </p:identity>
>
> The two identity steps branch1 and branch2 both read from the same
> "result" port on the "root" step. At an implementation level that requires
> some sort of buffering or copying. I want to make that explicit, so
> I'm introducing an explicit split step:
>
>   <p:identity name="root"/>
>
>   <internal:split name="ID00001"/>
>
>   <p:identity name="branch1">
>     <p:input port="source">
>       <p:pipe step="ID00001" port="result1"/>
>     </p:input>
>   </p:identity>
>
>   <p:identity name="branch2">
>     <p:input port="source">
>       <p:pipe step="ID00001" port="result2"/>
>     </p:input>
>   </p:identity>
>
> So what's the declaration for the internal:split step? It's something
> like this:
>
>   <p:declare-step type="internal:split">
>     <p:input port="source" sequence="true" primary="true"/>
>     <p:output port="result1" sequence="true" primary="false"/>
>     <p:output port="result2" sequence="true" primary="false"/>
>   </p:declare-step>
>
> And I could declare internal:split2, internal:split3, etc. steps. But
> really this is just a magic step with an arbitrary number of output
> ports.
>
> The same problem exists if you want to write an eval step:
>
>   <p:declare-step type="cx:eval">
>     <p:input port="pipeline"/>
>     <p:input port="source" sequence="true"/>
>     <p:input port="options"/>
>     <p:output port="result"/>
>     <p:option name="step" cx:type="xsd:QName"/>
>     <p:option name="detailed" cx:type="xsd:boolean"/>
>   </p:declare-step>
>
> This is a step that takes *an XML pipeline document* as its input,
> compiles it, and runs it. The problem, of course, is that the number
> of inputs and outputs that this step needs is determined entirely by
> the input pipeline which isn't known at compile-time and may actually
> be different on every invocation.
>
> I work around this in XML Calabash by encoding the multiple inputs
> and outputs into a single document. That works (sort of) for XProc 1.0
> because the documents all have to be XML. It won't work at all if
> we allow non-XML documents.
>
> (No, a sequence of inputs and outputs isn't sufficient because you
> have to be able to map sequences of inputs and outputs to different
> port names.)
>
> Be seeing you,
> norm
>
> --
> Norman Walsh
> Lead Engineer
> MarkLogic Corporation
> Phone: +1 413 624 6676
> www.marklogic.com
>

--
Innovimax SARL
Consulting, Training & XML Development
9, impasse des Orteaux
75020 Paris
Tel : +33 9 52 475787
Fax : +33 1 4356 1746
http://www.innovimax.fr
RCS Paris 488.018.631
SARL au capital de 10.000 €
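
The split rewrite described in the quoted message can be sketched as a small graph transformation. What follows is only a minimal Python sketch and not the actual implementation of any XProc processor: the function name `insert_splits` and the tuple representation of connections are assumptions made for illustration. The `ID00001`-style names, the `source` input port and the `result1`, `result2`, ... output ports follow the quoted example. It also shows why a fixed declaration is awkward: the number of `resultN` ports depends on how many readers a port happens to have.

```python
from collections import defaultdict


def insert_splits(steps, edges):
    """Route every output port that has several readers through a split step.

    steps: list of step names.
    edges: list of (src_step, src_port, dst_step, dst_port) tuples.
    Returns (new_steps, new_edges). Hypothetical helper for illustration.
    """
    # Group the readers of each (step, port) pair, keeping first-seen order.
    readers = defaultdict(list)
    for src_step, src_port, dst_step, dst_port in edges:
        readers[(src_step, src_port)].append((dst_step, dst_port))

    new_steps = list(steps)
    new_edges = []
    counter = 0
    for (src_step, src_port), consumers in readers.items():
        if len(consumers) == 1:
            # A single reader needs no buffering or copying; keep the edge.
            dst_step, dst_port = consumers[0]
            new_edges.append((src_step, src_port, dst_step, dst_port))
            continue
        # Several readers: make the copy explicit with a generated split step.
        counter += 1
        split = f"ID{counter:05d}"
        new_steps.append(split)
        new_edges.append((src_step, src_port, split, "source"))
        # One output port per reader, so the port count is data-dependent.
        for i, (dst_step, dst_port) in enumerate(consumers, start=1):
            new_edges.append((split, f"result{i}", dst_step, dst_port))
    return new_steps, new_edges


# The pipeline fragment from the quoted message, as a graph.
example_steps = ["root", "branch1", "branch2"]
example_edges = [
    ("root", "result", "branch1", "source"),
    ("root", "result", "branch2", "source"),
]
rewritten_steps, rewritten_edges = insert_splits(example_steps, example_edges)
```

With the example graph, `rewritten_edges` connects `root` to a generated split step, which in turn feeds `branch1` and `branch2` from separate output ports.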
Received on Monday, 30 April 2012 16:15:49 UTC