Re: [TTL] Standardizing N-Triples

From: Steve Harris <steve.harris@garlik.com>
Subject: Re: [TTL] Standardizing N-Triples
Date: Sun, 3 Apr 2011 05:24:18 -0500

> On 2011-04-02, at 21:28, Eric Prud'hommeaux wrote:
>>> The relative IRI thing can be achieved by using and serving up
>>> Turtle. We could therefore keep N-triples with a design centered on
>>> a dump format, and sticking to only absolute IRIs makes sense there
>>> to "freeze" the data.
>> 
>> It's actually this use case which motivated me to consider the value
>> of transportability. Well, that, plus simple generator scripts (for
>> e.g. dumping a database) which are portable between systems if they
>> don't embed a base IRI. I'm not sure this matters a lot one way or the
>> other; just trying to guess the discriminators which will cause folks
>> to use N-Triples.
>> 
>> I'm not actually convinced that it's worth foisting another
>> sublanguage (or profile, if you prefer) on the world. I understand
>> that the principal motivation is the efficiency of dumping and
>> reloading, but I expect that far more clock cycles get introduced
>> by responsibly lexing IRIs and Unicode literals than by all the rest
>> of the productions which distinguish Turtle from N-Triples.
> 
> I have to disagree. I've not built a full Turtle parser myself, but
> I've built an N-Triples one, and a good portion of a Turtle one (both
> by hand, not with a compiler-compiler), and the N-Triples one is
> significantly more efficient, per triple.
> 
> As further evidence, the raptor N-Triples parser is also significantly
> faster per triple than the Turtle one.
> 
> The fact that people are using N-Triples in preference to Turtle for
> large dumps currently seems like good evidence that it is useful for
> some cases.
> 
> - Steve

I would like to see this sort of argument backed up with numbers
including all costs, such as I/O.  Ideally, such arguments should come
with code, so that the quality of the implementation can be checked. 
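
A minimal sketch of the kind of measurement meant here, in Python. The
regular expression, parse_ntriples, and benchmark are illustrative
assumptions, not code from raptor or any parser mentioned in this
thread, and the pattern covers only a simplified subset of N-Triples.
It times file reading and parsing separately, so that I/O cost is
reported alongside parse cost.

```python
import io
import re
import time

# Simplified term pattern (an assumption, not the full N-Triples grammar):
# an IRI in angle brackets, a blank node, or a quoted literal with an
# optional datatype or language tag. It does not restrict which kind of
# term may appear in which position.
TERM = r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
LINE = re.compile(r'\s*' + TERM + r'\s+' + TERM + r'\s+' + TERM + r'\s*\.\s*$')


def parse_ntriples(lines):
    """Yield (subject, predicate, object) strings, skipping blanks and comments."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith('#'):
            continue
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"cannot parse line: {line!r}")
        yield match.groups()


def benchmark(path):
    """Return (triple_count, read_seconds, parse_seconds) for one file."""
    t0 = time.perf_counter()
    with open(path, encoding='utf-8') as f:
        text = f.read()          # I/O cost, measured on its own
    t1 = time.perf_counter()
    count = sum(1 for _ in parse_ntriples(io.StringIO(text)))
    t2 = time.perf_counter()     # parse cost, excluding the read above
    return count, t1 - t0, t2 - t1
```

Dividing parse_seconds by triple_count gives a per-triple figure. To
compare formats, the same harness could be pointed at a Turtle parser
and a Turtle serialization of the same data.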

peter

Received on Monday, 4 April 2011 12:18:45 UTC