Re: [TTL] Standardizing N-Triples from Peter Frederick Patel-Schneider on 2011-04-04 (public-rdf-wg@w3.org from April 2011)

From: Peter Frederick Patel-Schneider <pfps@research.bell-labs.com>
Date: Mon, 4 Apr 2011 08:17:10 -0400
To: <steve.harris@garlik.com>
CC: <eric@w3.org>, <andy.seaborne@epimorphics.com>, <nathan@webr3.org>, <alexhall@revelytix.com>, <richard@cyganiak.de>, <public-rdf-wg@w3.org>
Message-ID: <20110404.081710.1168528961674657838.pfps@research.bell-labs.com>

From: Steve Harris <steve.harris@garlik.com>
Subject: Re: [TTL] Standardizing N-Triples
Date: Sun, 3 Apr 2011 05:24:18 -0500

> On 2011-04-02, at 21:28, Eric Prud'hommeaux wrote:
>>> The relative IRI thing can be achieved by using and serving up
>>> Turtle. We could therefore keep N-triples with a design centered on
>>> a dump format, and sticking to only absolute IRIs makes sense there
>>> to "freeze" the data.
>> 
>> It's actually this use case which motivated me to consider the value
>> of transportability. Well, that, plus simple generator scripts (for
>> e.g. dumping a database) which are portable between systems if they
>> don't embed a base IRI. I'm not sure this matters a lot one way or the
>> other; just trying to guess the discriminators which will cause folks
>> to use NTriples.
>> 
>> I'm not actually convinced that it's worth foisting another
>> sublanguage (or profile, if you prefer) on the world. I understand
>> that the principal motivation is the efficiency of dumping and
>> reloading, but I expect that far more clock cycles get introduced
>> responsibly lexing IRIs and unicode literals than by all the rest of
>> productions which distinguish turtle from ntriples.
> 
> I have to disagree. I've not built a full Turtle parser myself, but
> I've built an N-Triples one, and a good portion of a Turtle one (both
> by hand, not with a compiler-compiler), and the N-Triples one is
> significantly more efficient, per triple.
> 
> As further evidence the raptor N-Triples parser is also significantly
> faster per triple than the Turtle one.
> 
> The fact that people are using N-Triples in preference to Turtle for
> large dumps currently seems like good evidence that it is useful for
> some cases.
> 
> - Steve

I would like to see this sort of argument backed up with numbers
including all costs, such as I/O.  Ideally, such arguments should come
with code, so that the quality of the implementation can be checked. 

peter
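
[Editorial sketch, not part of the original message.] The kind of
measurement asked for above, with I/O cost reported separately from
parsing cost, might look roughly like the following. Everything here is
an assumption made for illustration: the names parse_line and benchmark
are hypothetical, and the regular expression covers only a simplified
subset of N-Triples (IRIs, blank nodes, and plain, typed, or
language-tagged literals), not the full grammar. It says nothing about
how any real parser such as raptor performs.

```python
import io
import re
import time

# Simplified term pattern (assumption: a subset of N-Triples, not the full grammar).
TERM = r'(<[^>]*>|_:[A-Za-z0-9]+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z-]+)?)'
LINE = re.compile(r'\s*' + TERM + r'\s+' + TERM + r'\s+' + TERM + r'\s*\.\s*$')


def parse_line(line):
    """Return (subject, predicate, object) for a triple line, or None for blank/comment lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith('#'):
        return None
    match = LINE.match(stripped)
    if match is None:
        raise ValueError("unparseable line: %r" % line)
    return match.groups()


def benchmark(stream):
    """Time reading and parsing separately; return (triple_count, io_seconds, parse_seconds)."""
    t0 = time.perf_counter()
    lines = stream.readlines()  # I/O cost: pull every line from the stream
    t1 = time.perf_counter()
    triples = [t for t in map(parse_line, lines) if t is not None]  # parse cost
    t2 = time.perf_counter()
    return len(triples), t1 - t0, t2 - t1


SAMPLE = (
    '<http://example.org/s> <http://example.org/p> "hello"@en .\n'
    '# a comment line\n'
    '_:b1 <http://example.org/p> <http://example.org/o> .\n'
)

if __name__ == "__main__":
    count, io_secs, parse_secs = benchmark(io.StringIO(SAMPLE))
    print("triples=%d io=%.6fs parse=%.6fs" % (count, io_secs, parse_secs))
```

For a meaningful comparison, the same harness would have to be run
against a real dump file opened from disk, and against a Turtle parser
of comparable implementation quality; the in-memory sample above only
shows the shape of the measurement.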

Received on Monday, 4 April 2011 12:18:45 UTC