Re: Triples storage

From: Renato Golin <renato@ebi.ac.uk>
Date: Wed, 26 Sep 2007 10:32:00 +0100
Message-ID: <46FA2710.9050602@ebi.ac.uk>
To: "Emanuele D'Arrigo" <manu3d@gmail.com>
CC: Semantic Web Interest Group <semantic-web@w3.org>, "public-owl-dev@w3.org" <public-owl-dev@w3.org>

Emanuele D'Arrigo wrote:
> Hi everybody!
> Has any consensus been reached on the architecture for efficient storage
> and retrieval of an ontology's triples?

Hi Manu,

I'm quite interested in triple stores, but what I found is that there
is no consensus or standard for anything in that area. There are
several storage engines, but each one does things its own way. Also,
support for query languages is quite inconsistent.

> I've read an interesting paper about a triple store based on hashed tables
> that intuitively sounded more efficient than a straightforward
> one-table approach.

Given the amount of data you can have, the hash table might not fit in
any computer, and even if it fits, I/O will become a huge problem.

It is a common misconception that hash tables are always faster than
lists. That's not true, especially when the hash table is bigger than
your memory can hold (which is not that difficult to reach).

The only way to have an efficient and still powerful storage engine is
to mix approaches. For very local queries, hashes can be a good solution.
For locally distributed queries, lists and binary indexes might perform
better. But for truly distributed queries (outside of your domain) you
need an adaptive indexing system.
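
To make the mix concrete, here is a minimal in-memory sketch, not a
description of any real engine. It assumes triples are plain string
tuples and uses hypothetical names (TinyTripleStore, lookup_subject,
scan_subject_prefix). A hash index serves exact subject lookups, while a
sorted list searched by binary search serves prefix scans:

```python
import bisect
from collections import defaultdict


class TinyTripleStore:
    """Hypothetical store mixing a hash index with a sorted list."""

    def __init__(self):
        self.by_subject = defaultdict(list)  # hash index: subject -> triples
        self.sorted_triples = []             # (s, p, o) tuples kept in order

    def add(self, s, p, o):
        triple = (s, p, o)
        self.by_subject[s].append(triple)
        bisect.insort(self.sorted_triples, triple)

    def lookup_subject(self, s):
        """Exact match through the hash index."""
        return list(self.by_subject.get(s, []))

    def scan_subject_prefix(self, prefix):
        """Range scan: binary search for the start, then walk forward."""
        start = bisect.bisect_left(self.sorted_triples, (prefix,))
        result = []
        for triple in self.sorted_triples[start:]:
            if not triple[0].startswith(prefix):
                break  # sorted order: no later subject can match
            result.append(triple)
        return result
```

The hash index cannot answer the prefix scan without visiting every key,
which is where the ordered structure earns its place. Nothing here
addresses the distributed case or data larger than memory.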

The more distributed you go, the slower it gets, but that's acceptable when
you reckon the quality of your data will be higher that way.

> Are there alternatives? Where can I learn more about this specific aspect?

More alternatives than standards... see the Wiki pages to learn more:
http://esw.w3.org/topic/Semantic_Bioinformatics (storage at the end)

Received on Wednesday, 26 September 2007 09:33:54 UTC