binary XML API and scientific use cases [Re: [xml-dev] [ANN] nux-1.0beta2 release]

Wolfgang Hoschek wrote:

> This is to announce the nux-1.0beta2 release (http://dsd.lbl.gov/nux/).
>
> Nux is a small, straightforward, and surprisingly effective 
> open-source extension of the XOM XML library.
>
hi Wolfgang,

the natural question is: how does it compare to XBIS?

can it be divorced from XOM?

in particular LGPL and Apache/BSD are not compatible (it seems nux is 
under BSD and XOM under LGPL ...).

> Features include:
>     •     Seamless W3C XQuery and XPath support for XOM, through Saxon.
>     •     Efficient and flexible pools and factories for XQueries, XSL 
> Transforms, as well as Builders that validate against various schema 
> languages, including W3C XML Schemas, DTDs, RELAX NG, Schematron, etc.
>     •     Serialization and deserialization of XOM XML documents to 
> and from an efficient and compact custom binary XML data format (bnux 
> format), without loss or change of any information.
>     •     For simple and complex continuous queries and/or 
> transformations over very large or infinitely long XML input, a 
> convenient streaming path filter API combines full XQuery support with 
> straightforward filtering.
>     •     Glue for integration with JAXB and for queries over 
> ill-formed HTML.
>     •     Well documented API. Ships in a jar file that weighs just 60 
> KB.
>
> Changelog:
>
> XOM serialization and deserialization performance is more than good 
> enough for most purposes. However, for particularly stringent 
> performance requirements this release adds "bnux", an option for 
> lightning-fast binary XML serialization and deserialization. 

did you compare BNUX and XBIS performance?

> Contrasting bnux with XOM:
>
>     •     Serialization speedup: 2-7 (10-35 MB/s vs. 5 MB/s)
>     •     Deserialization speedup: 4-10 (20-50 MB/s vs. 5 MB/s)
>     •     XML data compression factor: 1.5 - 4
>
> For a detailed discussion and background see 
> http://dsd.lbl.gov/nux/api/nux/xom/binary/BinaryXMLCodec.html
>
XOM is a tree model, so how do you do streaming? is it by streaming 
partial XOM tree construction/deconstruction as you access data 
(overriding |endElement()| in |NodeFactory|), and do you manually keep 
detach()-ing nodes or just let them be GCed?
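
to make the question concrete, here is a minimal sketch of the general 
pattern being asked about (handle one record, then let it become 
garbage). it is not Nux or XOM code: it uses the JDK's StAX reader so 
that it runs without extra jars, and the names RecordStreamer, stream 
and the sample element names are made up for illustration. it also 
assumes each record element contains only text.

```java
import java.io.StringReader;
import java.util.function.Consumer;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class RecordStreamer {

    // Pulls events from the parser and hands the text of each matching
    // element to the handler. Nothing is kept after the handler returns,
    // so memory use does not grow with the length of the input.
    public static int stream(String xml, String recordName, Consumer<String> handler) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            int count = 0;
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && reader.getLocalName().equals(recordName)) {
                    // getElementText() consumes up to the matching end tag;
                    // it throws if the element has child elements.
                    handler.accept(reader.getElementText());
                    count++;
                }
            }
            reader.close();
            return count;
        } catch (XMLStreamException e) {
            throw new IllegalStateException("malformed input", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<records><rec>1.5</rec><rec>2.5</rec><other>x</other></records>";
        int n = stream(xml, "rec", text -> System.out.println("record: " + text));
        System.out.println(n + " records handled");
    }
}
```

a tree-based variant would build a small subtree per record instead of 
a string, and the open question is the same: who drops the subtree once 
it has been handled.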

what are the use cases for nux: what do you plan to use it for?

are they related to the XML Binary Characterization use cases 
<http://www.w3.org/TR/xbc-use-cases/>?

i am a bit disappointed that scientific requirements are completely 
omitted from the XBC use cases - the closest i could find is 
http://www.w3.org/TR/xbc-use-cases/#FPenergy but it skips over the whole 
issue of how to transfer an array of doubles without changing 
endianness ...
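
to illustrate the endianness point, here is a sketch of a "sender 
writes in its own byte order, receiver swaps only if needed" layout for 
an array of doubles. the wire layout (one order-flag byte, a 4-byte 
count, then raw 8-byte doubles) and the names DoubleArrayCodec, encode 
and decode are assumptions made up for this example; they are not taken 
from the XBC documents or from bnux.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

public class DoubleArrayCodec {

    // Hypothetical layout: [flag: 0 = big endian, 1 = little endian]
    // [count: int32][count * float64], count and payload in the flagged order.
    public static byte[] encode(double[] values, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + 8 * values.length);
        buf.put((byte) (order == ByteOrder.BIG_ENDIAN ? 0 : 1));
        buf.order(order);
        buf.putInt(values.length);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.array();
    }

    // The receiver reads the flag first and then interprets the rest in
    // the sender's byte order, whatever its own native order is.
    public static double[] decode(byte[] wire) {
        ByteBuffer buf = ByteBuffer.wrap(wire);
        ByteOrder order = buf.get() == 0 ? ByteOrder.BIG_ENDIAN : ByteOrder.LITTLE_ENDIAN;
        buf.order(order);
        double[] values = new double[buf.getInt()];
        for (int i = 0; i < values.length; i++) {
            values[i] = buf.getDouble();
        }
        return values;
    }

    public static void main(String[] args) {
        double[] data = {1.5, -2.25, 3.0e10};
        byte[] wire = encode(data, ByteOrder.nativeOrder());
        System.out.println(wire.length + " bytes on the wire, round trip ok: "
                + Arrays.equals(data, decode(wire)));
    }
}
```

when both ends share a byte order no swapping is needed at all, and the 
values arrive bit-for-bit identical, whereas text XML has to format and 
re-parse every double.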

we did a lot of work in the past related to XML performance (at Indiana 
University and Binghamton) and are very concerned that whatever binary 
XML is characterized/standardized in W3C will not be of much use for 
scientific computing and grids ...

thanks,

alek

-- 
The best way to predict the future is to invent it - Alan Kay

Received on Monday, 22 November 2004 21:02:29 UTC