
RE: Streaming OWL Parsers

From: Jos De_Roo <jos.deroo@agfa.com>
Date: Sat, 15 Feb 2003 02:40:40 +0100
To: Peter.Crowther@networkinference.com
Cc: "WebOnt WG" <www-webont-wg@w3.org>, www-webont-wg-request@w3.org
Message-ID: <OFCBB195EC.4E9F36FB-ONC1256CCE.00070E4F-C1256CCE.0009434E@agfa.be>


>> From: Sean Bechhofer
>> But what happens when I've got an ontology
>> with 10^8 concepts/individuals in it and I want to do some simple
>> processing on it, that doesn't necessarily warrant me
>> building the whole data structure?
>
>As I mentioned at the Manchester f2f, this is exactly the problem we
>have at NI.  In our case, the simple processing is obtaining the OWL
>statements, as distinct from the RDF statements, so that we can load
>them into a reasoner.  A 70,000 concept file takes 1.5G of RAM when
>loaded into Jena (1.6), about 5-10 times the size of our internal
>structures.  We don't want to use an RDB simply to store the triples
>during the load, as this would slow us down by orders of magnitude.

>This particular customer wanted to give us OWL/RDF dumps with multiple
>tens of millions of concepts, exported in arbitrary order.

We have a testcase with a blob component that can
deliver tens of millions of ***triples*** and I don't
experience any scalability problems with that; on the contrary,
it's much easier/faster to have distributed graph control.
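
A minimal sketch, in Java, of that kind of per-triple callback
processing (the TripleSink interface, the OWL-vocabulary test and the
hard-coded triples are illustrative assumptions, not Jena's actual API):

public class StreamingFilterSketch {
    // Illustrative only: a per-triple callback instead of a full in-memory model.
    interface TripleSink {
        void triple(String subject, String predicate, String object);
    }

    // Example sink: count triples and keep only OWL-vocabulary statements,
    // i.e. the "simple processing" a reasoner loader needs.
    static class OwlStatementFilter implements TripleSink {
        static final String OWL_NS = "http://www.w3.org/2002/07/owl#";
        long seen = 0;
        long kept = 0;

        public void triple(String s, String p, String o) {
            seen++;
            if (p.startsWith(OWL_NS) || o.startsWith(OWL_NS)) {
                kept++;               // hand the statement to the reasoner here
            }
            // everything else is discarded; memory use stays flat no matter
            // how many triples the parser delivers
        }
    }

    public static void main(String[] args) {
        OwlStatementFilter sink = new OwlStatementFilter();
        // In real use a streaming RDF parser would drive the sink;
        // here a couple of hard-coded triples stand in for that.
        sink.triple("#Person", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
                    "http://www.w3.org/2002/07/owl#Class");
        sink.triple("#Person", "http://www.w3.org/2000/01/rdf-schema#label", "\"Person\"");
        System.out.println(sink.kept + " OWL statements out of " + sink.seen);
    }
}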


>We can stream OWL/XML, but not OWL/RDF, due to this annoying problem
>that the triples describing a single concept nested to an arbitrary
>depth may appear in an arbitrary order.

What kind of nesting are you talking about?

>  We cannot find any way round
>this given the (intentional) limitations of RDF.  Unfortunately, Sean, I
>think you're out of luck.
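
For comparison, a minimal sketch of the buffering that arbitrary
statement order forces on a streaming OWL/RDF reader: triples are
parked per subject node until the description is complete enough to
emit (the completeness test and the names below are illustrative
assumptions, not anybody's actual parser):

import java.util.*;

// Sketch of why arbitrary triple order hurts streaming: a restriction such as
//   _:b owl:onProperty :p .   _:b owl:someValuesFrom :C .
// arrives as separate triples, possibly far apart, so a streaming reader has
// to park them per subject until the description can be emitted.
public class RestrictionBufferSketch {
    static final String OWL = "http://www.w3.org/2002/07/owl#";

    // Triples seen so far, grouped by subject (typically a bNode).
    static Map<String, Map<String, String>> pending = new HashMap<>();

    static void triple(String s, String p, String o) {
        pending.computeIfAbsent(s, k -> new HashMap<>()).put(p, o);
        Map<String, String> desc = pending.get(s);
        // Illustrative completeness test: emit once both parts are present.
        if (desc.containsKey(OWL + "onProperty") && desc.containsKey(OWL + "someValuesFrom")) {
            System.out.println("someValuesFrom(" + desc.get(OWL + "onProperty")
                    + ", " + desc.get(OWL + "someValuesFrom") + ")");
            pending.remove(s);   // only now can the buffered triples be released
        }
    }

    public static void main(String[] args) {
        // The two halves of one restriction arrive in the "wrong" order,
        // with an unrelated triple in between.
        triple("_:b1", OWL + "someValuesFrom", "#Wine");
        triple("#Meal", "http://www.w3.org/2000/01/rdf-schema#label", "\"Meal\"");
        triple("_:b1", OWL + "onProperty", "#hasDrink");
    }
}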


--
Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/
Received on Friday, 14 February 2003 20:41:18 GMT
