RE: Tools for 20 million triples?

From: Bradley P. Allen <ballen@siderean.com>
Date: Thu, 25 Mar 2004 10:08:46 -0800
To: "RDF Interest Group" <www-rdf-interest@w3.org>
Message-ID: <BLECLCINHHDCFGOPLGCPKEOLEBAA.ballen@siderean.com>

We've managed up to 160 million triples by clustering Seamark servers across
several Linux boxes on a LAN. This was in the context of a faceted
navigation system built for a collection of about 8 million resources
described primarily in Dublin Core, providing sub-second query responses
under heavy loads. This kind of approach should scale to the gigatriple
level. - cheers, BPA

Bradley P. Allen
Siderean Software LLC
5155 West Rosecrans Avenue, Suite 1078
Los Angeles, CA 90250
Phone +1 310 491-3424
Fax +1 310 379-0231
Web www.siderean.com

> -----Original Message-----
> From: www-rdf-interest-request@w3.org
> [mailto:www-rdf-interest-request@w3.org]On Behalf Of Jeen Broekstra
> Sent: Thursday, March 25, 2004 4:46 AM
> To: Charles McCathieNevile
> Cc: RDF Interest Group
> Subject: Re: Tools for 20 million triples?
>
>
>
> Charles McCathieNevile wrote:
>
> > on another list someone asked what tools would be good for handling
> > an OWL ontology of about 25,000 terms, with around 20 million
> > triples. There were a handful of ideas about how to build
> > specialised SQL systems or similar, but Danny Ayers pointed out
> > that there are systems capable of handling RDF and a lot of triples
> > (which by lucky chance happens to be a way of storing OWL).
> >
> > So I wondered if anyone on this list had experience of tools
> > working with this size dataset. (I will read Dave Beckett's report
> > done for SWAD-Europe on the topic, but I suspect that there is
> > already new information available, and would like to be up to
> > date).
>
> It depends on your hardware of course. Given a reasonably fast server,
> Sesame in combination with a MySQL DB (or even in-memory, given
> enough RAM) should be able to handle this without too much of a problem.
>
> To be honest with you though, the largest set I've personally worked with
> in Sesame was about 5 million, and that could be slow at times (though
> that may have been because I was running it on my notebook).
>
> Jeen
> --
> Jeen Broekstra          Aduna BV
> Knowledge Engineer      Julianaplein 14b, 3817 CS Amersfoort
> http://aduna.biz        The Netherlands
> tel. +31(0)33 46599877  fax. +31(0)33 46599877
>
>
Received on Thursday, 25 March 2004 13:09:56 UTC