
[ANN] Announcing a Calabash extension for text/binary files

From: Cranford, Jonathan W. <jcranford@mitre.org>
Date: Sat, 26 Jul 2014 19:12:20 +0000
To: "xproc-dev@w3.org" <xproc-dev@w3.org>
CC: "Costello, Roger L." <costello@mitre.org>
Message-ID: <A0879CE1FEEE1242B002E148D34678DF24E0CE33@IMCMBX02.MITRE.ORG>
Dear XProc'ers,

I just released an open-source Calabash extension that allows text/binary files to be parsed and manipulated in XProc. 


This project contains XProc extension steps that parse data into XML using the emerging Data Format Description Language standard (DFDL - http://www.ogf.org/dfdl).  DFDL describes a data format using a subset of XML Schema, allowing text and binary data to be parsed into an XML Infoset. This project integrates Daffodil (an open-source DFDL implementation) into Calabash.

Here's an example to whet your appetite, from the README.txt file: 

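    <p:declare-step version="1.0"
        xmlns:p="http://www.w3.org/ns/xproc">
        <!-- The dfdl: prefix must also be declared here; bind it to the
             step namespace used by daffodil-library.xpl (see README.txt). -->
        <p:output port="result"/>
        <p:import href="../../etc/daffodil-library.xpl"/>
        <dfdl:parse-file file="simpleCSV" schema="csv.dfdl.xsd"
            root="ex:file" xmlns:ex="http://example.com"/>
    </p:declare-step>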

Here, a CSV is parsed using a sample CSV DFDL schema.  

Of course, the real power of this extension is in manipulating the XML after the data is parsed.  More examples to come on that front.
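
As one small illustration of that kind of post-processing, here is a minimal sketch (not taken from the README) of a standard XProc 1.0 pipeline that counts the child elements of ex:file in the parsed output. It assumes the sample CSV schema produces one child element per record. The step name count-records is made up for the example, and only the standard p:count step is used:

    <p:declare-step version="1.0" name="count-records"
        xmlns:p="http://www.w3.org/ns/xproc"
        xmlns:ex="http://example.com">
        <!-- source: the XML produced by dfdl:parse-file (root element ex:file) -->
        <p:input port="source"/>
        <!-- result: a c:result element containing the count -->
        <p:output port="result"/>
        <!-- Count the children of ex:file; assumes one child per CSV record -->
        <p:count>
            <p:input port="source" select="/ex:file/*">
                <p:pipe step="count-records" port="source"/>
            </p:input>
        </p:count>
    </p:declare-step>

Run against the output of the parse step, this should yield a single c:result element whose text is the number of children found.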

More details are available at the above URL in the README.txt file.  More details about Daffodil are available at https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Daffodil%3A+Open+Source+DFDL.

Feedback is welcome and encouraged. 

Many thanks to Florent Georges for his very helpful blog post at

With best regards,

Jonathan W. Cranford 
Senior Information Systems Engineer
The MITRE Corporation (http://www.mitre.org)

P.S. Apologies for any duplicate posts; I had difficulties the first time I tried posting this from my gmail account.
Received on Saturday, 26 July 2014 19:12:48 UTC
