[ANN] Announcing a Calabash extension for text/binary files

Dear XProc'ers,

I just released an open-source Calabash extension that allows text and binary files to be parsed and manipulated in XProc.

The project contains XProc extension steps that parse data into XML using the emerging Data Format Description Language (DFDL) standard (http://www.ogf.org/dfdl).  DFDL describes a data format using a subset of XML Schema, which allows text and binary data to be parsed into an XML Infoset.  The extension integrates Daffodil, an open-source DFDL implementation, into Calabash.

Here's an example to whet your appetite, from the README.txt file: 

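    <p:declare-step version="1.0" xmlns:p="http://www.w3.org/ns/xproc">
        <!-- The dfdl prefix must also be bound on this element to the
             namespace used by daffodil-library.xpl; see README.txt. -->
        <p:output port="result"/>
        <p:import href="../../etc/daffodil-library.xpl"/>
        <dfdl:parse-file file="simpleCSV" schema="csv.dfdl.xsd"
            root="ex:file" xmlns:ex="http://example.com"/>
    </p:declare-step>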

Here, the CSV file simpleCSV is parsed against a sample CSV DFDL schema (csv.dfdl.xsd), with ex:file as the root element.

Of course, the real power of this extension is in manipulating the XML after the data is parsed.  More examples to come on that front.
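
As a minimal, untested sketch of what such post-processing could look like (this is not taken from the README), the pipeline below follows the parse step with the standard XProc steps p:filter and p:count, so that the result is a c:result element holding the number of child elements under the parsed root.  For a CSV schema those children would presumably be the records.  It assumes that dfdl:parse-file exposes a primary output port, which the implicit binding in the example above suggests.  The dfdl namespace URI shown is a placeholder, not the real one; replace it with the namespace declared in daffodil-library.xpl.

    <p:declare-step version="1.0" xmlns:p="http://www.w3.org/ns/xproc"
        xmlns:dfdl="urn:example:PLACEHOLDER-see-daffodil-library.xpl">
        <p:output port="result"/>
        <p:import href="../../etc/daffodil-library.xpl"/>
        <!-- Parse the CSV into XML, as in the README example -->
        <dfdl:parse-file file="simpleCSV" schema="csv.dfdl.xsd"
            root="ex:file" xmlns:ex="http://example.com"/>
        <!-- Emit each child of the root element as its own document -->
        <p:filter select="/*/*"/>
        <!-- Count those documents; the output is <c:result>N</c:result> -->
        <p:count/>
    </p:declare-step>

Selecting /*/* avoids assuming any element names from the sample schema.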

More details are available in the README.txt file at the above URL.  For more about Daffodil, see https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Daffodil%3A+Open+Source+DFDL.

Feedback is welcome and encouraged. 

Many thanks to Florent Georges for his very helpful blog post at

With best regards,

Jonathan W. Cranford 
Senior Information Systems Engineer
The MITRE Corporation (http://www.mitre.org)

P.S. Apologies for any duplicate posts; I had difficulties the first time I tried posting this from my Gmail account.

Received on Saturday, 26 July 2014 19:12:48 UTC