BigOWLIM 3.3 in use by the BBC for the World Cup website from Atanas Kiryakov on 2010-06-22 (semantic-web@w3.org from June 2010)

From: Atanas Kiryakov <naso@sirma.bg>
Date: Tue, 22 Jun 2010 15:22:35 +0300
To: <semantic-web@w3.org>
Message-ID: <97CF923172424E1897B521C13F43BA63@sirma.int>
BigOWLIM 3.3 Handles Millions of Queries per Day, Supports OWL 2 RL 
Reasoning and Provides Unmatched Linked Data Integration, Management and 
Retrieval Capabilities

Ontotext is pleased to announce version 3.3 of BigOWLIM. The development of 
this version was influenced by the requirements of FactForge and 
LinkedLifeData (two of the most advanced linked data portals) and the BBC's 
2010 World Cup website - probably the most challenging real-world use case 
of semantic repositories implemented so far. Within the LarKC project 
(http://larkc.eu), BigOWLIM is used as the data layer in a platform for 
Web-scale reasoning, which features a range of reasoning plug-ins, including the 
WebPIE massively parallel reasoning system [4].

The key characteristics and features of BigOWLIM include:
- *Pure Java* implementation, fully compatible with Sesame 2, which 
brings interoperability benefits and support for all popular RDF syntaxes 
and query languages, including SPARQL;
- *Clustering support* brings resilience, failover and horizontally scalable 
parallel query processing;
- Customisable reasoning, in addition to RDFS, OWL-Horst, and *OWL 2 RL* 
support [2];
- *Optimized owl:sameAs* handling, which delivers dramatic improvements in 
performance and usability when huge volumes of data from multiple sources 
are integrated;
- *Full-text search*, based on either Lucene or proprietary techniques;
- *High performance retraction* of statements and their inferences;
- Powerful and expressive *consistency checking* mechanisms;
- *RDF rank*, similar to Google's PageRank, can be calculated for the nodes 
in an RDF graph and used for ordering query results by relevance or for 
other purposes;
- *RDF Priming*, based upon activation spreading, allows efficient data 
selection and context-aware query answering for handling huge datasets;
- *Notification mechanism*, to allow clients to react to statements in the 
update stream.
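
As an illustration of the RDF rank idea listed above, here is a minimal 
sketch of a PageRank-style score computed over the nodes of a small RDF 
graph. It is not BigOWLIM's implementation: the function name rdf_rank, the 
damping factor, the iteration count and the ex: identifiers are all 
assumptions made for the example.

```python
def rdf_rank(triples, damping=0.85, iterations=50):
    """PageRank-style power iteration over (subject, predicate, object) triples.

    Hypothetical helper for illustration only; predicates are ignored and
    every triple is treated as a link from its subject to its object.
    """
    nodes = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    out_links = {n: [] for n in nodes}
    for s, _, o in triples:
        out_links[s].append(o)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for n in nodes:
            for target in out_links[n]:
                new_rank[target] += damping * rank[n] / len(out_links[n])
        rank = new_rank
    return rank


# Made-up example data (the ex: names are not from any real dataset).
triples = [
    ("ex:Pele", "ex:playedFor", "ex:Brazil"),
    ("ex:Ronaldo", "ex:playedFor", "ex:Brazil"),
    ("ex:Kaka", "ex:playedFor", "ex:Brazil"),
    ("ex:Brazil", "ex:memberOf", "ex:FIFA"),
]
ranks = rdf_rank(triples)

# Order nodes by descending rank, as one might order query results.
ordered = sorted(ranks, key=ranks.get, reverse=True)
```

In this toy graph the node with three incoming links (ex:Brazil) ends up 
ranked above the nodes that nothing points to, which is the kind of signal 
that can be used to order query results.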

These features are already proven at FactForge (http://FactForge.net), where 
BigOWLIM is used to load 8 of the central LOD [3] datasets (DBPedia, 
Geonames, Wordnet, Musicbrainz, Freebase, UMBEL, Lingvoj and the CIA World 
Factbook) in a repository which contains 1.2 billion explicit and 0.8 
billion implicit statements. BigOWLIM's owl:sameAs optimization allows 
FactForge to deal with 'only' 2 billion statements in its indices, while the 
number of distinct statements retrievable from the repository is 10 billion. 
This feature allows FactForge to deliver non-inflated query results, while 
the semantics of owl:sameAs is still fully accounted for during query 
evaluation. FactForge is a public service that allows users to perform RDF 
search, execute SPARQL queries, and to explore this data in real-time, 
adhering to its semantics. FactForge is the only system which provides a 
solution to the Modigliani test, defined at ReadWriteWeb as the tipping 
point of the Semantic Web [1].
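
To make the difference between indexed and retrievable statements concrete, 
the following sketch shows one way such an effect can arise: statements are 
stored once per owl:sameAs equivalence class and expanded to all equivalent 
identifiers only when read. This is an assumption-laden illustration, not a 
description of BigOWLIM's internals; the helper names and the identifiers 
(for example geonames:Brazil) are hypothetical.

```python
from itertools import product


def build_classes(same_as_pairs):
    """Group identifiers into owl:sameAs equivalence classes (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in same_as_pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Use the lexicographically smallest member as the canonical one.
            parent[max(ra, rb)] = min(ra, rb)

    members = {}
    for x in list(parent):
        members.setdefault(find(x), set()).add(x)
    return find, members


def store(triples, find):
    """Index each statement once, using canonical subject and object."""
    return {(find(s), p, find(o)) for s, p, o in triples}


def expand(stored, members):
    """Enumerate every statement implied by the equivalence classes."""
    result = set()
    for s, p, o in stored:
        for s2, o2 in product(members.get(s, {s}), members.get(o, {o})):
            result.add((s2, p, o2))
    return result


# Hypothetical identifiers, for illustration only.
same_as = [
    ("dbpedia:Brazil", "geonames:Brazil"),
    ("dbpedia:Brazil", "freebase:Brazil"),
    ("dbpedia:Brasilia", "freebase:Brasilia"),
]
triples = [
    ("dbpedia:Brazil", "ex:capital", "dbpedia:Brasilia"),
    ("geonames:Brazil", "ex:capital", "dbpedia:Brasilia"),
]

find, members = build_classes(same_as)
stored = store(triples, find)
retrievable = expand(stored, members)
```

Here two input statements collapse into a single indexed statement, while 
3 x 2 = 6 distinct statements remain retrievable. The same kind of 
multiplication, at a much larger scale, is consistent with the 2 billion 
indexed versus 10 billion retrievable statements reported above.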

BigOWLIM is also at the heart of the LinkedLifeData RDF warehouse 
(http://LinkedLifeData.com), which combines 25 of the most popular 
biomedical databases in a repository that contains more than 4 billion 
statements.

The latest version of the BigOWLIM repository has been successfully 
integrated into the high performance Semantic Web publishing stack powering 
the BBC's 2010 World Cup website, performing OWL reasoning with continuously 
changing data and handling millions of page requests per day.

Some popular namespace prefixes come predefined within BigOWLIM 3.3 in order 
to simplify query writing. These include the prefixes for the RDF, RDFS, and 
OWL schemata; all the prefixes for linked data namespaces used in FactForge; 
and prefixes for projects like Good Relations 
(http://www.heppnetz.de/projects/goodrelations/).
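
To show what predefined prefixes save the query author, the sketch below 
prepends the PREFIX declarations that a SPARQL query uses but does not 
declare. It only illustrates the idea on the client side and is not how 
BigOWLIM resolves prefixes; the function name with_prefixes and the small 
prefix table are assumptions, and the simple regular expressions do not 
handle colons inside string literals.

```python
import re

# A small, assumed prefix table; the namespace IRIs are the standard ones.
PREDEFINED = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "gr": "http://purl.org/goodrelations/v1#",
}


def with_prefixes(query, prefixes=PREDEFINED):
    """Prepend PREFIX lines for known prefixes used but not declared."""
    declared = set(
        re.findall(r"PREFIX\s+([A-Za-z][\w-]*):", query, flags=re.IGNORECASE)
    )
    # A prefixed name looks like "rdf:type"; "http://" is skipped because
    # the character after the colon must be a letter or underscore.
    used = set(re.findall(r"\b([A-Za-z][\w-]*):[A-Za-z_]", query))
    missing = [p for p in sorted(used - declared) if p in prefixes]
    header = "".join(f"PREFIX {p}: <{prefixes[p]}>\n" for p in missing)
    return header + query


short_query = "SELECT ?c WHERE { ?c rdf:type owl:Class }"
full_query = with_prefixes(short_query)
```

With predefined prefixes the short form can be sent as it is; without them, 
the client has to supply something like full_query itself.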

Furthermore, the OWLIM website (http://www.ontotext.com/owlim/) has been 
revised to include more relevant details, latest benchmark results, etc. For 
further information, please contact OWLIM-info@ontotext.com - we would be 
very pleased to hear from you.

Sometimes developing OWLIM feels like mountain climbing - each new 
achievement opens up new opportunities and challenges. We often think of 
OWLIM as a track-laying machine that extends the reach of the data railways, 
step by step, changing the data-economy of entire domains by allowing more 
and more complex data to be handled at lower cost.

BigOWLIM 3.3 is as robust and advanced as it is today because of the 
numerous clients who believed in it, used it and provided us with feedback. 
By using OWLIM, you help us make it better and lay the track further!

The OWLIM team, June 2010

------
[1] The Modigliani Test for Linked Data: Results. Richard MacManus, 
ReadWriteWeb, 
http://www.readwriteweb.com/archives/the_modigliani_test_for_linked_data.php
[2] Implementations - OWL. http://www.w3.org/2007/OWL/wiki/Implementations
[3] Linking Open Data. W3C SWEO Community Project. 
http://esw.w3.org/SweoIG/TaskForces/CommunityProjects/LinkingOpenData
[4] OWL reasoning with WebPIE: calculating the closure of 100 billion 
triples. Urbani, J., Kotoulas, S., Maassen, J., van Harmelen, F. & Bal, H. 
In Proceedings of the ESWC '10.
Received on Tuesday, 22 June 2010 12:24:18 UTC