Re: RDF Thrift : A binary format for RDF data from Andy Seaborne on 2014-08-18 (semantic-web@w3.org from August 2014)

From: Andy Seaborne <andy@seaborne.org>
Date: Mon, 18 Aug 2014 09:45:41 +0100
To: semantic-web@w3.org
Message-ID: <53F1BD35.4060703@seaborne.org>

On 18/08/14 07:53, Axel Polleres wrote:
> On 18 Aug 2014, at 08:50, Axel Polleres <axel@polleres.net> wrote:
>
>> Hi Andy,
>>
>> This looks very interesting! Any thoughts already on how this relates to/could be combined with HDT [1,2]?

Summary: they have the word "binary" in common.

That's about it.  They are just different things for different usages.

How much is HDT used for real?  By whom?  I couldn't find HDT files.

> Sorry, overlooked the later mail in this thread... will check. The p.s. below still holds, though. ;-)

W3C has strengths and weaknesses.

It is good as a place to combine different/related/competing work; it is 
good when there is a need and no answer on the table.

It is bad at formalising a single existing (small) piece of technology.
WGs add features as compromises.  If the focus is small, a lot of extra
cruft can get added and a lot of other agendas piggyback on the WG (see the
shapes WG as a good example of this effect).

I think it is better for the community to refine something as simple as 
RDF Thrift.  Implementation is of the order of 100 lines of code + a
Thrift compiler (that someone else wrote - Thrift is big in Big Data).

(I happened to also rewire bits of Jena to be streaming - I'm interested in
scalable processing ATM.  Most of the Jena "implementation" is nothing
to do with RDF Thrift - it's a bunch of reworking of internals for stream
processing that has been on the cards for a while now.)

	Andy

>
>>
>> best regards,
>> Axel
>>
>> p.s.: BTW, personally, I think it would be great to think about whether binary, efficient encodings
>> for RDF and similar formats would have a space for standardization in W3C.
>>
>>
>> 1. http://www.w3.org/Submission/2011/03/
>> 2. http://www.w3.org/Submission/2011/SUBM-HDT-20110330/
>>
>> --
>> Prof. Dr. Axel Polleres
>> Institute for Information Business, WU Vienna
>> url: http://www.polleres.net/  twitter: @AxelPolleres
>>
>> On 15 Aug 2014, at 16:19, Andy Seaborne <andy@apache.org> wrote:
>>
>>> RDF Binary using Apache Thrift
>>>
>>> This is a binary format for RDF graphs, datasets and SPARQL result sets that is fast to process. [1]
>>>
>>>   http://afs.github.io/rdf-thrift/
>>>
>>> includes the on-the-wire description as well as an implementation.
>>>
>>> Using Apache Thrift makes it considerably less work to integrate into existing systems and toolkits, or to build custom processing. [2]
>>>
>>> 	Comments and feedback welcome,
>>> 	Andy
>>>
>>> [1] The largest gain is on reading data, with rates x3 faster than parsing N-Triples.
>>>
>>> [2] Apache Thrift has a large number of implementations across a range of languages: http://wiki.apache.org/thrift/LibraryFeatures
>>>
>>
>
>

Received on Monday, 18 August 2014 08:46:11 UTC