Re: Survey of RDF data on the Web from Seth Russell on 2002-08-19 (www-rdf-interest@w3.org from August 2002)

From: Seth Russell <seth@robustai.net>
Date: Mon, 19 Aug 2002 11:01:21 -0700
To: "Andreas Eberhart" <andreas.eberhart@i-u.de>, "Dan Brickley" <danbri@w3.org>
Cc: <www-rdf-interest@w3.org>
Message-ID: <009301c247aa$6d751fc0$657ba8c0@c1457248a.sttls1.wa.home.com>

From: "Andreas Eberhart" <andreas.eberhart@i-u.de>

> I'm trying to export all the facts as one large RDF file. I used Jena ARP
> and Sergey Melnik's RDF API, but with both I'm running out of main memory
> while filling the model (i.e. before I can serialize it as RDF). Is there a
> possibility where not the entire data has to be held in main memory? Maybe a
> two-pass approach, where the predicate namespaces are collected in the first
> pass and the data is serialized during the second pass.
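
The two-pass idea quoted above could look roughly like the sketch below. It is written in Python rather than against Jena or Sergey Melnik's API. The function names, the tuple layout, and the generated prefixes (ns0, ns1, ...) are all assumptions made for illustration. It also assumes the facts can be iterated twice (for example by re-running a database query) and that the part of each predicate URI after the last '#' or '/' is a valid XML name.

```python
from xml.sax.saxutils import escape, quoteattr

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def split_uri(uri):
    """Split a predicate URI just after its last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/")) + 1
    return uri[:cut], uri[cut:]

def write_rdf_two_pass(triples, out):
    """Stream facts to `out` as RDF/XML without building a model.

    `triples` is a zero-argument callable (hypothetical interface) that
    returns a fresh iterator of (subject_uri, predicate_uri, obj, obj_is_uri)
    tuples each time it is called.
    """
    # Pass 1: collect only the predicate namespaces. Memory use grows with
    # the number of distinct namespaces, not with the number of facts.
    prefixes = {}
    for _s, p, _o, _is_uri in triples():
        ns, _local = split_uri(p)
        if ns not in prefixes:
            prefixes[ns] = "ns%d" % len(prefixes)

    # Pass 2: write the namespace declarations, then one rdf:Description
    # per fact, so nothing is held in memory beyond the current fact.
    out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
    out.write('<rdf:RDF xmlns:rdf=%s' % quoteattr(RDF_NS))
    for ns, prefix in prefixes.items():
        out.write('\n         xmlns:%s=%s' % (prefix, quoteattr(ns)))
    out.write('>\n')
    for s, p, o, is_uri in triples():
        ns, local = split_uri(p)
        tag = "%s:%s" % (prefixes[ns], local)
        out.write('  <rdf:Description rdf:about=%s>\n' % quoteattr(s))
        if is_uri:
            out.write('    <%s rdf:resource=%s/>\n' % (tag, quoteattr(o)))
        else:
            out.write('    <%s>%s</%s>\n' % (tag, escape(o), tag))
        out.write('  </rdf:Description>\n')
    out.write('</rdf:RDF>\n')
```

Emitting one rdf:Description per fact makes the file larger than grouping by subject would, but it avoids any need to sort or buffer the data.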

Processing large RDF files is problematic not only when writing but also
when reading. Perhaps the solution is not to use large files at all. Instead,
use many small files in one directory, along with an RDF index file. We
could adopt a convention that the index for a directory of RDF files lives at
http://host/path/index.rdf and then define a simple schema for these index
files.
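
As a rough illustration of what such an index might look like, here is a minimal Python sketch. No schema for index files has been defined, so as a stand-in it simply lists each small file with rdfs:seeAlso from the index document itself. The function names and the part file names are hypothetical.

```python
import os
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS_NS = "http://www.w3.org/2000/01/rdf-schema#"

def write_index(directory, part_names):
    """Write index.rdf in `directory`, listing each part file by relative URI.

    Using rdfs:seeAlso here is an assumption, not an agreed convention.
    """
    path = os.path.join(directory, "index.rdf")
    with open(path, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write('<rdf:RDF xmlns:rdf=%s\n         xmlns:rdfs=%s>\n'
                  % (quoteattr(RDF_NS), quoteattr(RDFS_NS)))
        # rdf:about="" refers to the index document itself.
        out.write('  <rdf:Description rdf:about="">\n')
        for name in part_names:
            out.write('    <rdfs:seeAlso rdf:resource=%s/>\n' % quoteattr(name))
        out.write('  </rdf:Description>\n')
        out.write('</rdf:RDF>\n')
    return path

def iter_parts(index_path):
    """Yield the local path of each part file named in the index."""
    base = os.path.dirname(index_path)
    root = ET.parse(index_path).getroot()
    for ref in root.iter("{%s}seeAlso" % RDFS_NS):
        yield os.path.join(base, ref.get("{%s}resource" % RDF_NS))
```

A reader would then parse the files yielded by iter_parts one at a time, so memory use is bounded by the largest small file rather than by the whole data set.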

Seth Russell
http://robustai.net/sailor/

Received on Monday, 19 August 2002 14:01:58 UTC