
ANN: LDIF - Linked Data Integration Framework Version 0.4 Scale-Out released.

From: Chris Bizer <chris@bizer.de>
Date: Mon, 16 Jan 2012 11:35:40 +0100
To: <public-lod@w3.org>, <semantic-web@w3.org>, <public-semweb-lifesci@w3.org>
Message-ID: <14a001ccd43a$97f8d300$c7ea7900$@bizer.de>
Hi all,

the Web-based Systems Group and our industry partner mes|semantics are
happy to announce the release of the LDIF – Linked Data Integration
Framework Version 0.4 Scale-Out.

 

LDIF can be used within Linked Data applications to translate heterogeneous
data from the Web of Linked Data into a clean local target representation
while keeping track of data provenance. LDIF translates data from the Web
into a consistent target vocabulary. LDIF also includes an identity
resolution component which translates URI aliases into a single target URI.
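
To make the idea of translating URI aliases into a single target URI more
concrete, here is a minimal Python sketch. It is not LDIF's implementation and
does not use LDIF's configuration or API. It assumes that identity links are
already available as pairs of equivalent URIs, and it picks the
lexicographically smallest URI of each cluster as the target, which is an
arbitrary rule chosen for illustration. The function names and example.* URIs
are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Triple = Tuple[str, str, str]


def build_canonical_map(same_as: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map every URI that occurs in an alias pair to one target URI."""
    parent: Dict[str, str] = {}

    def find(uri: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]
            uri = parent[uri]
        return uri

    for a, b in same_as:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Assumption: keep the lexicographically smaller URI as the target.
            keep, drop = sorted((root_a, root_b))
            parent[drop] = keep

    return {uri: find(uri) for uri in list(parent)}


def rewrite(triples: Iterable[Triple], canonical: Dict[str, str]) -> List[Triple]:
    """Replace aliased subjects and objects; other terms pass through unchanged."""
    return [(canonical.get(s, s), p, canonical.get(o, o)) for s, p, o in triples]


# Hypothetical alias pairs and input data.
same_as = [
    ("http://b.example/berlin", "http://a.example/Berlin"),
    ("http://c.example/city/1", "http://b.example/berlin"),
]
triples = [
    ("http://c.example/city/1", "http://example.org/population", "3500000"),
    ("http://b.example/berlin", "http://example.org/country", "http://d.example/Germany"),
]

canonical = build_canonical_map(same_as)
integrated = rewrite(triples, canonical)
```

After this runs, both input triples use http://a.example/Berlin as their
subject, while the URI http://d.example/Germany is left untouched because no
alias pair mentions it.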

 

Until now, LDIF stored data purely in memory, which restricted the amount
of data that could be processed.

 

LDIF Version 0.4 introduces two new implementations of the LDIF runtime
environment which allow LDIF to scale to large data sets: 

 

1. The new triple store backed implementation scales to larger data sets on
a single machine. 

2. The new Hadoop-based implementation provides for processing very large
data sets on a Hadoop cluster, for instance on Amazon EC2. 

 

We have tested LDIF for integrating RDF data sets ranging from 25 million to
3.6 billion triples.
A comparison of the performance of all three implementations is available on
the LDIF benchmark page:

 

http://www.assembla.com/spaces/ldif/wiki/Benchmark

 

LDIF is provided under the terms of the Apache Software License. LDIF can be
downloaded from the project webpage, which also provides detailed information
about the features and the configuration of the framework:

 

http://www4.wiwiss.fu-berlin.de/bizer/ldif/

 

The development of LDIF is supported in part by Vulcan Inc. as part of its
Project Halo and by the EU FP7 project LOD2 (Grant No. 257943).

 

Many thanks to

   + Andreas Schultz and Andrea Matteini, who did most of the implementation
     and benchmarking work, as well as

   + Christian Becker and Robert Isele, who also contributed to the release.

 

Cheers,

 

Chris

 

 

--

Prof. Dr. Christian Bizer

Web-based Systems Group

Freie Universität Berlin

+49 30 838 55509

http://www.bizer.de

chris@bizer.de

 
Received on Monday, 16 January 2012 10:38:17 GMT
