
Re: Turtle Tuples: Turtle-based query result format

From: Arjohn Kampman <arjohn.kampman@aduna.biz>
Date: Wed, 23 Mar 2005 16:57:26 +0100
Message-ID: <424191E6.1010701@aduna.biz>
To: Dave Beckett <dave.beckett@bristol.ac.uk>
Cc: Dan Connolly <connolly@w3.org>, public-rdf-dawg-comments@w3.org

Dave Beckett wrote:
> My comments on the points in the thread.
> 
> Less overhead - yeah.  Although Arjohn's later reply says it isn't
> always lower in speed.

The performance improvements were a bit disappointing, at least in
combination with Sesame. In a client-server setting, a Sesame server
automatically applies gzip compression to any query results. Apparently
the Sesame XML format compresses better than the Turtle Tuples format,
because the compressed files were comparable in size.
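
For anyone who wants to check the compression effect on their own data,
here is a minimal sketch that reports raw and gzip-compressed sizes for
two renderings of the same rows. Both renderings (verbose_rows and
compact_rows) are made-up stand-ins, not the actual Sesame XML or Turtle
Tuples syntax, and the example.org URLs are invented.

```python
import gzip


def sizes(text):
    """Return (raw_bytes, gzipped_bytes) for a piece of text."""
    raw = text.encode("utf-8")
    return len(raw), len(gzip.compress(raw))


def verbose_rows(n):
    # Hypothetical tag-heavy rendering; NOT the real Sesame XML format.
    return "".join(
        '<result><binding name="url"><uri>'
        f"http://example.org/page{i}/index.html"
        "</uri></binding></result>\n"
        for i in range(n)
    )


def compact_rows(n):
    # Hypothetical terse rendering; NOT the real Turtle Tuples format.
    return "".join(
        f"<http://example.org/page{i}/index.html> .\n" for i in range(n)
    )


if __name__ == "__main__":
    for name, make in (("verbose", verbose_rows), ("compact", compact_rows)):
        raw, packed = sizes(make(1000))
        print(f"{name}: {raw} bytes raw, {packed} bytes gzipped")
```

The numbers depend entirely on the data, so the only sizes that matter
are the ones measured on real query results.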

One factor that influenced the results was the fact that we performed
tests on meta-data for URLs. Most URLs tend to end in a file name that
includes a file extension (e.g. "index.html"). Because of the dot for
the file extension, these URLs had to be written as full URIs in the
result format, as Turtle doesn't allow dots in qnames.
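
To illustrate the fallback, here is a minimal sketch of the decision a
writer has to make for each URI. The abbreviate function, the prefix map
and the local-name pattern are all assumptions made for illustration;
the pattern is a simplified approximation of the name rule, not the full
Turtle grammar.

```python
import re

# Simplified local-name rule (assumption): letters, digits, '_' and '-',
# starting with a letter or '_'. A dot is deliberately not allowed.
_LOCAL_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_\-]*$")


def abbreviate(uri, prefixes):
    """Return 'prefix:local' when the local part is a legal name,
    otherwise the full URI in angle brackets."""
    for prefix, namespace in prefixes.items():
        if uri.startswith(namespace):
            local = uri[len(namespace):]
            if _LOCAL_NAME.match(local):
                return f"{prefix}:{local}"
    return f"<{uri}>"


if __name__ == "__main__":
    prefixes = {"ex": "http://example.org/docs/"}
    print(abbreviate("http://example.org/docs/index", prefixes))
    print(abbreviate("http://example.org/docs/index.html", prefixes))
```

With this rule, "index" is shortened to ex:index, while "index.html"
is written out in full because of the dot.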

As a result of the disappointing performance results, I decided to
implement a binary format. This binary format gave much better results,
improving performance by roughly a factor of 2 for our application
(a combination of Aduna Spectacle and Aduna Metadata Server). This
format is documented at [1] if you're interested.

[...]
> Easy to write - although this may be true for those familiar with
> N3/Turtle style languages, this is a query result format and that's
> either being written by query processors (so easy to write isn't
> critical) or by query engine developers and people working on the SPARQL
> language and tests - a small group!

The point being made is that the format would be easier for these query
processors to write. An XML format requires one to specify any namespace
prefixes at the start of the document, which makes it harder to write in
a streaming fashion.
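
As a rough sketch of what streaming output could look like, the
generator below emits one line per result row and declares each prefix
only when it is first needed. It assumes the result format, like Turtle,
accepts @prefix directives between rows. The stream_rows function, the
row syntax and the isalnum() name check are hypothetical simplifications,
not the actual Turtle Tuples syntax.

```python
def stream_rows(rows, namespaces):
    """Yield output lines row by row, declaring each prefix on first use.

    rows       -- iterable of rows, each a list of URI strings
    namespaces -- dict mapping prefix -> namespace URI
    """
    declared = set()
    for row in rows:
        terms = []
        for uri in row:
            for prefix, ns in namespaces.items():
                local = uri[len(ns):]
                # Simplified name check (assumption): alphanumeric only.
                if uri.startswith(ns) and local.isalnum():
                    if prefix not in declared:
                        declared.add(prefix)
                        yield f"@prefix {prefix}: <{ns}> ."
                    terms.append(f"{prefix}:{local}")
                    break
            else:
                # No usable prefix: write the full URI.
                terms.append(f"<{uri}>")
        yield " ".join(terms) + " ."


if __name__ == "__main__":
    ns = {"a": "http://a.org/", "b": "http://b.org/"}
    rows = [["http://a.org/x"], ["http://a.org/index.html", "http://b.org/y"]]
    for line in stream_rows(rows, ns):
        print(line)
```

Nothing has to be known about later rows before the first row is
written, which is the property the streaming argument relies on.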

[...]
> I'm not sure this is something I'd prioritise now over, say, getting the
> XML format more polished after feedback.

I agree, but please 'fix' this XML format (I don't like the "variables
as tags" thing very much, as I pointed out in an earlier mail to this
list:-) ).

Cheers,

Arjohn

[1] http://www.openrdf.org/doc/api/sesame/org/openrdf/sesame/query/BinaryTableResultConstants.html
Received on Wednesday, 23 March 2005 15:57:29 GMT
