Re: RDF Thrift : A binary format for RDF data from Michel Dumontier on 2014-08-18 (semantic-web@w3.org from August 2014)

From: Michel Dumontier <michel.dumontier@gmail.com>
Date: Mon, 18 Aug 2014 10:06:31 -0700
To: Ruben Verborgh <ruben.verborgh@ugent.be>
Cc: Andy Seaborne <andy@seaborne.org>, SWIG Web <semantic-web@w3.org>
Message-ID: <CALcEXf5GgtHhP9wFwmD7-LUJG532rU3ZrJpjT1Na620Dr8h0SA@mail.gmail.com>

On Mon, Aug 18, 2014 at 2:16 AM, Ruben Verborgh <ruben.verborgh@ugent.be> wrote:
> Hi Andy,
>
>> How much is HDT used for real?
>
> We use it to enable client-side SPARQL query execution with 99.9% availability.
> Here is an online demo: http://client.linkeddatafragments.org/.
>
> The HDT files are used to run the server at http://data.linkeddatafragments.org/.
> Details on why HDT is a good format for this are here [1].
>
>> By whom?
>
> We (Ghent University – iMinds) use it to host high-availability queryable datasets.
> The software that enables this is available as open source [2],
> so anybody else can use it to do the same.
>
>> I couldn't find HDT files.
>
> For the same reason you won't find Virtuoso db files: we use it on the server.
Actually, you can! The Bio2RDF project makes its indexed Virtuoso
databases available.

http://download.bio2rdf.org/release/3/

We also provide gzipped N-Quads, and we'd be interested in providing an
alternative binary, indexed format.

m.

> As you said, Thrift and HDT have different design goals.
> Thrift files are meant to be “found”, HDT files not necessarily.
>
> BTW you can find HDT files here: http://www.rdfhdt.org/datasets/
> And the tools to make them yourself: http://www.rdfhdt.org/download/
>
> Ruben
>
> PS I might be interested to look at a JavaScript/Node.js implementation of Thrift.
> Are there any plans (or code) in that direction already? Pointers to start?
>
> [1] http://linkeddatafragments.org/publications/iswc2014.pdf
> [2] https://github.com/LinkedDataFragments/

Received on Monday, 18 August 2014 17:07:19 UTC