RE: dynamic XProc? from Philip Fennell on 2010-06-03 (xproc-dev@w3.org from June 2010)

From: Philip Fennell <Philip.Fennell@marklogic.com>
Date: Thu, 3 Jun 2010 01:54:02 -0700
To: "Toman_Vojtech@emc.com" <Toman_Vojtech@emc.com>, "xproc-dev@w3.org" <xproc-dev@w3.org>
Message-ID: <D20C296D14127D4EBD176AD949D8A75A442628A2@EXCHG-BE.marklogic.com>

Vojtech,

> passing a sequence of documents with a special wrapper element 
> that says which input port the document should be passed to.

That's a nice, elegant solution.

I'm very fond of the 'job-bag' concept: a structure that contains the data you wish to process plus metadata and/or instructions for subsequent steps. I've used it in many pipelining scenarios, both before and since XProc was invented. XProc is ripe for using this, dare I say it, 'pattern' for document processing. By pattern I lean towards the concept rather than trying to formalise some concrete structure for a job bag. It is probably one of the few cases where ad hoc XML structures are more applicable than re-using existing structures, with the possible exception of metadata terms, which might as well be Dublin Core or a similar well-known vocabulary.
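
To make the idea a little more concrete, a job bag might look something like the sketch below. The element names (job-bag, metadata, instructions, step, payload) and the stylesheet name are purely hypothetical, made up for illustration; only the Dublin Core namespace is a real vocabulary.

  <job-bag xmlns:dc="http://purl.org/dc/elements/1.1/">
    <!-- metadata: terms borrowed from Dublin Core -->
    <metadata>
      <dc:title>Example job</dc:title>
      <dc:date>2010-06-03</dc:date>
    </metadata>
    <!-- instructions for subsequent steps (hypothetical structure) -->
    <instructions>
      <step name="validate"/>
      <step name="transform" stylesheet="to-html.xsl"/>
    </instructions>
    <!-- the data to be processed -->
    <payload>
      <doc>One</doc>
    </payload>
  </job-bag>

Each step in a pipeline would read what it needs from the bag and pass the bag on, so the structure can stay as ad hoc as the job requires.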


Regards

Philip Fennell

Consultant
Mark Logic Corporation


________________________________________
From: xproc-dev-request@w3.org [xproc-dev-request@w3.org] On Behalf Of Toman_Vojtech@emc.com [Toman_Vojtech@emc.com]
Sent: 02 June 2010 09:34
To: xproc-dev@w3.org
Subject: RE: dynamic XProc?

> -----Original Message-----
> From: xproc-dev-request@w3.org [mailto:xproc-dev-request@w3.org] On Behalf Of Romain Deltour
> Sent: Tuesday, June 01, 2010 9:19 PM
> To: xproc-dev@w3.org
> Subject: Re: dynamic XProc?


> Calabash's cx:eval deals with this complexity by allowing you to
> "multiplex" the several input/output ports to a single port on the
> eval step (read the doc for more details). I don't know how Calumet's
> implementation works.

For input and output documents, the emx:eval step in Calumet uses one
input port and one output port. For dynamic pipelines that have multiple
input ports, you can "multiplex" the data for the different input ports
by passing a sequence of documents with a special wrapper element that
says which input port the document should be passed to. Similarly, if
the dynamic pipeline has multiple output ports, you get a sequence of
documents where each document is wrapped in a wrapper that says which
output port the document appeared on.

Example:

  <emx:eval detailed-input="true">
    <p:input port="source">
      <p:inline>
        <emx:document> <!-- no port: use primary input port -->
          <doc>One</doc>
        </emx:document>
      </p:inline>
      <p:inline>
        <emx:document port="second-input"> <!-- use "second-input" -->
          <doc>Two</doc>
        </emx:document>
      </p:inline>
      <p:inline> <!-- no wrapper: use primary input port -->
        <doc>Three</doc>
      </p:inline>
    </p:input>

    <p:input port="pipeline">
      <p:inline>
        <p:declare-step version="1.0">
          <p:input port="first-input" sequence="true" primary="true"/>
          <p:input port="second-input"/>
          <p:output port="result"/>
          ...
        </p:declare-step>
      </p:inline>
    </p:input>
  </emx:eval>
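
On the output side, the result for a dynamic pipeline with more than one output port could look roughly like the sequence below. This is only a sketch: it assumes the output wrapper mirrors the input one (the same emx:document element with a port attribute), and the port names "result" and "report" as well as the document content are hypothetical.

  <!-- first document in the sequence: appeared on "result" -->
  <emx:document port="result">
    <doc>One</doc>
  </emx:document>

  <!-- second document in the sequence: appeared on "report" -->
  <emx:document port="report">
    <messages/>
  </emx:document>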

Regards,
Vojtech

--
Vojtech Toman
Principal Software Engineer
EMC Corporation
toman_vojtech@emc.com
http://developer.emc.com/xmltech

Received on Thursday, 3 June 2010 08:54:56 UTC