
Re: Announcing OWLIM 5.0 - with new transaction mechanism, performance improvements, SPARQL 1.1 graph store protocol and more

From: Jeen Broekstra <jeen.broekstra@gmail.com>
Date: Sat, 21 Apr 2012 10:51:04 +1200
Message-ID: <CANyF_kGMVVxntZTi52sSsco0fQX4zARAqZpRoXK+3XdCnLDC-w@mail.gmail.com>
To: Barry Bishop <barry.bishop@ontotext.com>
Cc: Sesame discussion list <sesame-general@lists.sourceforge.net>, OWLIM-discussion@ontotext.com, gate-developers@lists.sourceforge.net, public-lod@w3.org, semantic-web@w3.org, soa4all@lists.atosresearch.eu, Ontoteam <onto_team@sirma.bg>, seals-news@listas.fi.upm.es, ict-larkc@lists.sti2.at
That is an impressive list of new features, Barry. Congratulations to the
OWLIM dev team on this new release!

Cheers,

Jeen
this message brought to you by Toronto airport free wifi :)
On Apr 20, 2012 8:47 AM, "Barry Bishop" <barry.bishop@ontotext.com> wrote:

>  Ontotext are pleased to announce the release of OWLIM version 5.0 <http://www.ontotext.com/owlim> featuring a new transaction mechanism, performance improvements, SPARQL 1.1
> graph store protocol, integration with TopBraid Composer/Live <http://www.topquadrant.com/products/TB_Suite.html> and many other improvements. The single most important new feature is the
> new transaction management mechanism which allows for much *more reliable
> and efficient handling of workloads where queries from multiple clients are
> combined with frequent updates* of the data. As benchmark results <http://www.ontotext.com/owlim/benchmark-results/owlim-5> demonstrate, OWLIM 5.0 is
> *43% faster* than v.4.3 on the BSBM Explore and Update <http://www4.wiwiss.fu-berlin.de/bizer/BerlinSPARQLBenchmark/spec/> scenario. As a result of several changes in the index structures, OWLIM now
> requires *between 25% and 70% less storage space*.
>
> Some of the most important improvements are listed below:
>
>    - *Transaction management and isolation mechanisms* have been
>    completely refactored. The previous strategy used lazy writing of modified
>    database pages, such that dirty pages were only flushed to disk when
>    further updates occurred and no more memory was available. While this
>    approach is extremely fast, it incurs a considerable recovery time,
>    because the transaction log must be replayed after an
>    abnormal termination. The new mechanism uses two modes: 'bulk-loading'
>    (fast) with similar behaviour to previous versions and 'normal' (safe)
>    where database modifications are flushed to disk as part of the commit
>    operation. When running in safe mode, *database recovery is instant* and there is a
>    *significant improvement in concurrency between updates and queries*.
>
>
>    - *New context indices* can be used to improve query performance when
>    data is modelled using many named graphs. These are switched on and off
>    using a single configuration parameter, enable-context-index.
>
>
>    - The *SPARQL 1.1 Graph Store HTTP Protocol* is now supported
>    according to the W3C Working Draft <http://www.w3.org/TR/sparql11-http-rdf-update/> from the 12th May 2011. This provides a REST interface for managing
>    collections of graphs, using either directly or indirectly named graphs.
>
>
>    - *Sesame <http://www.openrdf.org>* *2.6.5* with many bug-fixes and
>    updates to bring SPARQL 1.1 Query <http://www.w3.org/TR/2012/WD-sparql11-query-20120105/> support up to the latest W3C Working Draft from the 5th January 2012.
>
>
>    - *Significant reduction in disk-space requirements* is achieved with
>    the following modifications:
>       - *Index compression* can now be used to reduce disk storage
>       requirements by using zip compression on database pages. This feature is
>       off by default, but can be switched on when creating a new repository. The
>       configuration parameter index-compression-ratio can be set to -1
>       (the default value, indicating no compression) or a value in the range
>       10-50, indicating the desired percentage reduction in page sizes. Any pages that
>       cannot be compressed by the specified amount are stored uncompressed.
>       Therefore a compression ratio that is too aggressive will not bring many
>       benefits. Experiments have shown that for large datasets a value of about
>       30% is close to optimal and leads to a total disk space saving of around
>       50%.
>       - *Restructuring of the triple indices* has also led to a reduction
>       in disk-space requirements of around 18% independent of the compression
>       functionality.
>       - *Entity compression* is a modification that reduces the storage
>       requirements for the lookup table that maps between internal identifiers
>       and resources. This is transparent to the user and happens automatically.
>       More disk space reductions are apparent using this version.
>
>
>    - A new *literal index* is created automatically for numeric and
>    date/time data-types. The index is used during query evaluation if a query
>    or a sub-query (e.g. union) has a filter that consists of a conjunction
>    of literal constraints, e.g. FILTER(?x >= 3 && ?y <= 5 && ?start >
>    "2001-01-01"^^xsd:date). Other patterns, including those that use negation,
>    will not use the index in this version of OWLIM.
>
>
>    - Tighter integration with TopQuadrant <http://www.topquadrant.com/>'s TopBraid
>    Composer <http://www.topquadrant.com/products/TB_Composer.html> (a
>    graphical development environment for modelling data) and TopBraid Live <http://www.topquadrant.com/products/TB_Live.html> (an enterprise SOA-capable Semantic Web application platform). Contact the OWLIM
>    team directly <owlim-info@ontotext.com> for details of how to obtain
>    the OWLIM plug-in.
>
>
>    - All *control queries now use SPARQL Update syntax* (used mostly to
>    control the Lucene-based full-text search, RDF Rank and geo-spatial
>    plug-ins). This has a number of advantages, namely:
>       - No special control query pseudo-graph is required by the
>       Replication Cluster master in order to identify control queries that must
>       be pushed to all worker nodes
>       - SPARQL Updates use the corresponding SPARQL update protocol, so
>       they can be automatically processed by load-balancers that examine URL
>       patterns
>       - It is more consistent with the SPARQL language, since these
>       'control queries' cause a change of state in OWLIM
>
>
>    - *Incremental Lucene-based full-text search index* for updating the
>    index for specific resources or all un-indexed resources. Using this
>    technique can avoid the more expensive approach of rebuilding the whole
>    index frequently.
>
>
>    - *Incremental RDF Rank* allows the RDF rank for specific resources to
>    be (re-)computed as directed by the user. This technique can avoid the more
>    expensive approach of rebuilding all RDF Rank values frequently.
>
>
>    - As well as the cache/index statistics, *performance analysis data* is now provided about currently executing queries including: how many
>    results have been returned so far, how long it has been executing, average
>    time to return each result, etc.
>
>
>    - The *getting started* application has been restructured so that it
>    now works with remote repositories.
>
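
The distinction between directly and indirectly named graphs in the Graph Store HTTP Protocol item above can be sketched as follows. This is a rough Python illustration, assuming the "graph" query parameter described in the protocol drafts; the service URL and graph IRIs are made-up examples, not OWLIM defaults.

```python
from urllib.parse import urlencode


def direct_graph_url(graph_iri: str) -> str:
    """Direct identification: the request URL is the graph IRI itself."""
    return graph_iri


def indirect_graph_url(service_url: str, graph_iri: str) -> str:
    """Indirect identification: the graph IRI is passed, percent-encoded,
    in the 'graph' query parameter of the graph store service URL."""
    return service_url + "?" + urlencode({"graph": graph_iri})


if __name__ == "__main__":
    # Hypothetical endpoint and graph IRI, for illustration only.
    service = "http://localhost:8080/rdf-graph-store"
    graph = "http://example.org/g1"
    print(direct_graph_url(graph))
    print(indirect_graph_url(service, graph))
    # second line printed:
    # http://localhost:8080/rdf-graph-store?graph=http%3A%2F%2Fexample.org%2Fg1
```

An HTTP GET, PUT, POST or DELETE against either URL would then read, replace, merge into or drop the graph, as the protocol describes.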
> *Known problems with OWLIM 5.0*
>
>    - The behaviour of the 'include inferred' checkbox in the Sesame
>    Workbench is unpredictable when using OWLIM repositories.
>    - This version of OWLIM is *not backward compatible* with any
>    previous version. This means that images created with OWLIM 4.3 and before
>    will not work correctly with OWLIM 5.0 and must be re-created. There have
>    been a great many modifications to the storage files, indexing structures,
>    etc, and upgrade mechanisms have proven too complex and probably slower
>    than re-loading the database anyway. Please *do not attempt to upgrade
>    to OWLIM 5.0 unless you drop and recreate all databases*. A migration
>    tool, which allows for automated re-loading of data from any
>    Sesame-accessible repository, is provided to ease the transition.
>
> For further technical information and references to resolved technical
> issues, please refer to the Release notes <http://owlim.ontotext.com/display/OWLIMv50/OWLIM-SE+Release+notes> of the corresponding edition of OWLIM. Full documentation for all OWLIM
> editions is available online <http://owlim.ontotext.com> (click on the
> OWLIM 5.0 link on the left hand side).
>
> One can request further information and evaluation licences for OWLIM from
> here <http://www.ontotext.com/owlim#download>.
>
> The OWLIM team
> April 2012
>
>
Received on Friday, 20 April 2012 22:51:39 UTC
