- From: Steve Harris <S.W.Harris@ecs.soton.ac.uk>
- Date: Tue, 1 May 2007 08:01:17 +0100
- To: Bradley Allen <ballen@siderean.com>
- Cc: Danny Ayers <danny.ayers@gmail.com>, <semantic-web@w3.org>
On 1 May 2007, at 01:47, Bradley Allen wrote:

> Steve- To clarify, by "hosted Web service," I was addressing the application
> you have built, and didn't mean to suggest that store access was available
> as a service directly.

I see. I thought you meant something along the lines of the Amazon storage
services.

> As you say, it's a bit apples and oranges to draw direct comparisons; we're
> consuming a lot of RAM for the kind of indexing necessary to support fast
> aggregate operators, and when you add up the SPARQL queries with the queries
> using aggregate operators for the facet counts (which don't yet have SPARQL
> equivalents) we're looking at roughly the same number of queries, on the
> order of several hundred per rendered navigation page, which is returned to
> the browser in less than a second. This time includes the overhead of
> inference to support transitive closure of hierarchical facets and
> user-role-based filtering of results and result metadata for security and
> entitlement purposes. - BPA

Yes, the lack of aggregate operators is one of the reasons we're getting
through so many SPARQL queries. I'm working on a SPARQL-inspired language
that has support for aggregates at the moment.

We're also doing entitlement based filtering (essential given the problem
domain), but just using the GRAPH operator, so I suspect it's coarser than
yours.

- Steve

> On 4/30/07 3:08 PM, "Steve Harris" <S.W.Harris@ecs.soton.ac.uk> wrote:
>
>> On 30 Apr 2007, at 20:34, Bradley Allen wrote:
>>>
>>> The benchmark we did with Elsevier was performed on a
>>> hierarchically-clustered grid of 32 commodity Linux boxes, each running an
>>> instance of Seamark Navigator. The RDF represented the bibliographical
>>> information describing 40 million articles plus 10 million descriptions of
>>> authors. The application was an end-user relational navigation interface
>>> over the collection of articles and authors.
>> ...
>>> A secondary difference, in contrast to the Garlik store, is that this is a
>>> commercially-supported software product as opposed to a hosted Web service,
>>> although we do provide hosting for applications like the Oracle
>>> Technology Network Semantic Web (http://otnsemanticweb.oracle.com).
>>
>> For comparison the Garlik store (JXT) in our production system stores
>> just over 2 gigaquads on 8 commodity Linux boxes. Typical query
>> response time for our application is 2-3ms per query - using the
>> SPARQL language, but not the protocol. However, one chunk of RDF is
>> not like another, so I don't want to draw direct comparisons. The
>> queries are fairly unexciting SPARQL queries, 8-9 triple patterns
>> with 2 or 3 OPTIONAL clauses, some have simple FILTER expressions.
>> Each report generated for a user runs a few hundred SPARQL queries of
>> that type, and it happens in around a second.
>>
>> It has ACID transactions and N-way failover redundancy to support the
>> high uptime needed to run a sizeable business off an RDF store.
>>
>> I'm not sure what you mean by "hosted Web service", but the Garlik
>> store is currently only for internal use. It supports our
>> commercial data management service, but access to the store is not
>> available directly to customers.
>>
>> - Steve
Received on Tuesday, 1 May 2007 07:01:24 UTC