
[ANN] Announcing a Calabash extension for text/binary files

From: Jonathan Cranford <jonathan.w.cranford@gmail.com>
Date: Sat, 26 Jul 2014 12:48:41 -0600
Message-ID: <CAOBDn908a9j6uVxKOMTAeVuO0V7EaR=pctPOTZ_EuVYkbZjEZg@mail.gmail.com>
To: xproc-dev@w3.org
Cc: costello@mitre.org
Dear XProc'ers,

I just released an open-source Calabash extension that allows text/binary
files to be parsed and manipulated in XProc.

https://opensource.ncsa.illinois.edu/stash/projects/DFDL/repos/daffodil-calabash-extension/browse

This project contains XProc extension steps that parse data into XML using
the emerging Data Format Description Language standard (DFDL -
http://www.ogf.org/dfdl).  DFDL describes a data format using a subset of
XML Schema, allowing text and binary data to be parsed into an XML Infoset.
This project integrates Daffodil (an open-source DFDL implementation) into
Calabash.

Here's an example to whet your appetite, from the README.txt file:

> <p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
>                 xmlns:dfdl="urn:daffodil:calabash"
>                 version="1.0">
>     <p:output port="result"/>
>     <p:import href="../../etc/daffodil-library.xpl"/>
>     <dfdl:parse-file file="simpleCSV" schema="csv.dfdl.xsd"
>         root="ex:file" xmlns:ex="http://example.com"/>
> </p:declare-step>

Here, the CSV file simpleCSV is parsed using the sample DFDL schema
csv.dfdl.xsd, producing an XML document rooted at ex:file.

Of course, the real power of this extension is in manipulating the XML
after the data is parsed.  More examples to come on that front.
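
As a rough sketch of what such post-processing might look like, the
pipeline below extends the README example with one standard XProc 1.0
step, p:add-attribute, which stamps the root element of the parsed
document with the name of its source file. It assumes that
dfdl:parse-file exposes a primary output port, so that its result flows
into the next step implicitly; the "source" attribute name is purely
illustrative and not part of the extension.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:dfdl="urn:daffodil:calabash"
                version="1.0">
    <p:output port="result"/>
    <p:import href="../../etc/daffodil-library.xpl"/>

    <!-- Parse the CSV exactly as in the README example. -->
    <dfdl:parse-file file="simpleCSV" schema="csv.dfdl.xsd"
        root="ex:file" xmlns:ex="http://example.com"/>

    <!-- Assumption: dfdl:parse-file has a primary output port, so the
         parsed document is the implicit input of this step. -->
    <!-- Record the source file on the root element; "source" is a
         hypothetical attribute name chosen for illustration. -->
    <p:add-attribute match="/*"
                     attribute-name="source"
                     attribute-value="simpleCSV"/>
</p:declare-step>
```

Matching on "/*" avoids relying on any element names from the DFDL
schema; any other standard step (p:xslt, p:delete, and so on) could be
chained in the same way.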

More details are available at the above URL in the README.txt file.  More
details about Daffodil are available at
https://opensource.ncsa.illinois.edu/confluence/display/DFDL/Daffodil%3A+Open+Source+DFDL.


Feedback is welcome and encouraged.

Many thanks to Florent Georges for his very helpful blog post at
http://fgeorges.blogspot.com/2011/09/writing-extension-step-for-calabash-to.html.

With best regards,
--
Jonathan Cranford
Received on Monday, 28 July 2014 13:19:59 UTC
