Re: resources for network-based/hierarchical RDF store

On 30 Apr 2007, at 20:34, Bradley Allen wrote:
> The benchmark we did with Elsevier was performed on a
> hierarchically-clustered grid of 32 commodity Linux boxes, each running an
> instance of Seamark Navigator. The RDF represented the bibliographical
> information describing 40 million articles plus 10 million descriptions of
> authors. The application was an end-user relational navigation interface
> over the collection of articles and authors.
> A secondary difference, in contrast to the Garlik store, is that this is a
> commercially-supported software product as opposed to a hosted Web service,
> although we do provide hosting for applications like the Oracle
> Technology Network Semantic Web (

For comparison the Garlik store (JXT) in our production system stores  
just over 2 gigaquads on 8 commodity Linux boxes. Typical query  
response time for our application is 2-3ms per query - using the  
SPARQL language, but not the protocol. However, one chunk of RDF is  
not like another, so I don't want to draw direct comparisons. The
queries are fairly unexciting SPARQL: 8-9 triple patterns with 2 or 3
OPTIONAL clauses, and some have simple FILTER expressions.
Each report generated for a user runs a few hundred SPARQL queries of  
that type, and it happens in around a second.
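
A rough sketch of that query shape and of the timing arithmetic, for
illustration only. This is not an actual Garlik query: the ex: vocabulary,
the variable names, and the helper function are all made up, and the
estimate assumes the queries run one after another with no other overhead.

```python
# Hypothetical query with the shape described above: 8 triple patterns,
# 2 of them inside OPTIONAL clauses, plus one simple FILTER.
# The ex: vocabulary is invented for this example.
QUERY = """
PREFIX ex: <http://example.org/ns#>
SELECT ?name ?provLabel ?bal ?email ?closed
WHERE {
  ?p    a           ex:Person .
  ?p    ex:name     ?name .
  ?p    ex:account  ?acct .
  ?acct ex:provider ?prov .
  ?prov ex:label    ?provLabel .
  ?acct ex:balance  ?bal .
  OPTIONAL { ?p    ex:email  ?email }
  OPTIONAL { ?acct ex:closed ?closed }
  FILTER (?bal > 0)
}
"""


def estimate_report_seconds(n_queries: int, ms_per_query: float) -> float:
    """Estimate report time, assuming strictly sequential queries."""
    return n_queries * ms_per_query / 1000.0


if __name__ == "__main__":
    # "A few hundred" queries at 2-3 ms each, taken here as 300 at 2.5 ms.
    print(estimate_report_seconds(300, 2.5))
```

With those assumed figures the estimate comes out under a second, which is
consistent with "around a second" per report.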

The store has ACID transactions and N-way failover redundancy to support the
high uptime needed to run a sizeable business off an RDF store.

I'm not sure what you mean by "hosted Web service", but the Garlik
store is currently only for internal use. It supports our
commercial data management service, but access to the store is not
available directly to customers.

- Steve

Received on Monday, 30 April 2007 22:09:01 UTC