RE: Streaming OWL Parsers from Peter Crowther on 2003-02-14 (www-webont-wg@w3.org from February 2003)

From: Peter Crowther <Peter.Crowther@networkinference.com>
Date: Fri, 14 Feb 2003 23:08:46 -0000
To: "WebOnt WG" <www-webont-wg@w3.org>
Message-ID: <3BE4D3F0FB726240966DEF40418472B5012CBD@ni-lon-server1.ad.networkinference.com>

> From: Sean Bechhofer 
> But what happens when I've got an ontology
> with 10^8 concepts/individuals in it and I want to do some simple
> processing on it, that doesn't necessarily warrant me 
> building the whole data structure?

As I mentioned at the Manchester f2f, this is exactly the problem we
have at NI.  In our case, the simple processing is obtaining the OWL
statements, as distinct from the RDF statements, so that we can load
them into a reasoner.  A 70,000 concept file takes 1.5G of RAM when
loaded into Jena (1.6), about 5-10 times the size of our internal
structures.  We don't want to use a RDB simply to store the triples
during the load, as this would slow us down by orders of magnitude.

This particular customer wanted to give us OWL/RDF dumps with multiple
tens of millions of concepts, exported in arbitrary order.
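
To give a feel for the scale, here is a back-of-envelope sketch that
extrapolates the Jena figure quoted above.  It assumes "1.5G" means GiB
and that memory grows linearly with the number of concepts; both are
assumptions for illustration, not measurements, and the names are made
up for this sketch.

```python
# Figures quoted in the message; the unit of "1.5G" is assumed to be GiB.
RAM_GIB = 1.5
CONCEPTS = 70_000


def estimate_gib(n_concepts):
    """Extrapolate RAM use linearly (an assumption, not a measurement)."""
    return RAM_GIB * n_concepts / CONCEPTS


bytes_per_concept = RAM_GIB * 2**30 / CONCEPTS   # roughly 23,000 bytes
ten_million = estimate_gib(10_000_000)           # a little over 214 GiB
```

Under those assumptions, even the low end of "tens of millions" of
concepts is far beyond what fits in memory with a load-everything
approach.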

We can stream OWL/XML, but not OWL/RDF, because the triples describing a
single concept, nested to an arbitrary depth, may appear in an arbitrary
order.  We cannot find any way round
this given the (intentional) limitations of RDF.  Unfortunately, Sean, I
think you're out of luck.
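
The ordering problem can be illustrated with a minimal sketch.  The
triples, prefixes and the assemble() helper below are hypothetical and
are not taken from any real parser.  The point is only that, when
triples for one concept (and for the blank nodes nested inside it) can
arrive in any order, a consumer cannot know that a concept is complete
until the input ends, so it ends up holding every triple.

```python
from collections import defaultdict


def assemble(triples):
    """Group (subject, predicate, object) triples by subject.

    RDF places no constraint on triple order, so no subject can be
    declared complete before the input is exhausted; nothing is released
    early.  Returns (definitions, peak_number_of_buffered_triples).
    """
    pending = defaultdict(list)
    buffered = 0
    peak = 0
    for subject, predicate, obj in triples:
        pending[subject].append((predicate, obj))
        buffered += 1
        peak = max(peak, buffered)
    return dict(pending), peak


# Hypothetical input: ex:A is a subclass of a restriction (blank node
# _:r1), with the restriction's triples scattered through the stream.
triples = [
    ("_:r1", "owl:someValuesFrom", "ex:B"),
    ("ex:A", "rdf:type", "owl:Class"),
    ("ex:C", "rdf:type", "owl:Class"),
    ("ex:A", "rdfs:subClassOf", "_:r1"),
    ("_:r1", "owl:onProperty", "ex:p"),
    ("_:r1", "rdf:type", "owl:Restriction"),
]

definitions, peak = assemble(triples)
# peak equals len(triples): the whole graph was buffered before any
# single definition could safely be handed to a reasoner.
```

If the serialisation guaranteed that all triples for a concept were
contiguous, the buffer could be flushed each time the subject changed;
RDF gives no such guarantee, which is the limitation described above.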

		- Peter

Received on Friday, 14 February 2003 18:09:19 UTC