Re: while we are rechartering.... (csv) from Andy Seaborne on 2011-06-01 (public-rdf-dawg@w3.org from April to June 2011)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Wed, 01 Jun 2011 15:09:00 +0100
To: public-rdf-dawg@w3.org
Message-ID: <4DE647FC.2050106@epimorphics.com>

On 01/06/11 14:38, Bijan Parsia wrote:
> On 1 Jun 2011, at 14:32, Sandro Hawke wrote:
>
>> TimBL suggests we also add a SPARQL CSV Results Format.
>> (comma-separated-values, RFC 4180).
>>
>> I think this is a good idea, probably worthwhile.  Particularly with
>> government data, I see CSV is surprisingly popular.
>>
>> (The use case is: some people want to publish CSV data, because they
>> have a consumer community who wants it.  Rather than write any code, if
>> we do this, they can just construct a SPARQL query to provide the data,
>> give people a long URL (including the query) for getting that data as
>> CSV.  Or they could hide the long URL behind a short one, via a proxy of
>> some sort.  I find this use case rather compelling.)
>>
>> Reactions?   like it, don't care, formally object, ...?
>
>
> It's hugely useful. Lots of things consume CSV best (e.g., lots of Google stuff), so on many of our LOD projects we end up hacking some XSLT or other processor in the middle.
>
> It would really have to be "ready to go" as CSV rather than a CSV encoding of, e.g., the XML formats, or it's quite pointless.
>
> Cheers,
> Bijan.
>
>

Sandro: +1

Bijan: Is this CSV "ready to go" enough?

4store [1] supports TSV; Jena/ARQ [2] and Redland [3] support both CSV
and TSV formats.

The CSV format is pragmatic and lossy: terms are printed without their
RDF syntax, so URIs don't have <>, literals appear as the bare lexical
form without quotes, and any quoting is purely for CSV reasons.

CSV:
One row of variable names, without the "?"
Then rows of strings and numbers.
No lang tags or datatypes on literals, no markers to tell strings and
URIs apart.
End of line is \r\n as required by RFC 4180
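
As a rough illustration of the CSV rules above, here is a minimal Python
sketch. The shape of a binding (a dict with "type" and "value" keys), the
function name to_csv, and the choice to write an unbound variable as an
empty field are all assumptions made for the example; they are not taken
from any of the implementations mentioned here.

```python
import csv
import io


def to_csv(variables, rows):
    """Serialize a result set in the lossy CSV style described above.

    variables: list of variable names without the leading "?".
    rows: list of dicts mapping variable name -> term, where a term is a
          dict such as {"type": "uri", "value": "http://example/"}
          (hypothetical shape, chosen only for this sketch).
    """
    out = io.StringIO()
    # RFC 4180 asks for CRLF line endings; the csv module quotes a field
    # only when it contains a comma, a quote or a line break.
    writer = csv.writer(out, lineterminator="\r\n")
    writer.writerow(variables)
    for row in rows:
        # Only the lexical form / URI string is kept: no <>, no quotes,
        # no language tag, no datatype.
        # Assumption: an unbound variable becomes an empty field.
        writer.writerow([row[v]["value"] if v in row else "" for v in variables])
    return out.getvalue()


example_rows = [
    {"s": {"type": "uri", "value": "http://example/"},
     "label": {"type": "literal", "value": "foo, bar", "lang": "en"}},
    {"s": {"type": "uri", "value": "http://example/b"}},
]
print(to_csv(["s", "label"], example_rows), end="")
```

Note that the "en" language tag is dropped and the only quoting in the
output is the CSV quoting around the field that contains a comma.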

The TSV format is lossless, so you do get <http://example/>, "foo"@en,
123, etc. It can be read back in as a result set without loss.

The first row is variable names, each with a leading "?".
Then rows of RDF terms in SPARQL/Turtle format.
Literals have quotes, and lang tags/datatypes are added.
URIs have <> round them.
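
The TSV rules could be sketched along the same lines. Again this is only
an illustration under assumptions: the term dicts (with "type", "value",
"lang" and "datatype" keys), the function names, the "\n" line ending,
the empty field for an unbound variable, and the exact escaping are
choices made for the example rather than a description of what 4store,
ARQ or Redland actually do. Blank nodes are left out.

```python
import re

XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"


def term_to_tsv(term):
    """Write one RDF term in SPARQL/Turtle-like syntax (hypothetical term shape)."""
    if term["type"] == "uri":
        return "<" + term["value"] + ">"
    if term["type"] == "literal":
        lex = term["value"]
        datatype = term.get("datatype")
        # Integers are printed bare, e.g. 123, as in the description above.
        if datatype == XSD_INTEGER and re.fullmatch(r"[+-]?[0-9]+", lex):
            return lex
        # Escape characters that would break the quoting or the TSV layout.
        escaped = (lex.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\t", "\\t").replace("\n", "\\n").replace("\r", "\\r"))
        quoted = '"' + escaped + '"'
        if "lang" in term:
            return quoted + "@" + term["lang"]
        if datatype is not None:
            return quoted + "^^<" + datatype + ">"
        return quoted
    raise ValueError("unsupported term type: " + repr(term["type"]))


def to_tsv(variables, rows):
    """First row: variable names with "?"; then one row of terms per solution."""
    lines = ["\t".join("?" + v for v in variables)]
    for row in rows:
        # Assumption: an unbound variable becomes an empty field.
        lines.append("\t".join(term_to_tsv(row[v]) if v in row else ""
                               for v in variables))
    return "".join(line + "\n" for line in lines)


tsv_rows = [
    {"s": {"type": "uri", "value": "http://example/"},
     "label": {"type": "literal", "value": "foo", "lang": "en"},
     "n": {"type": "literal", "value": "123", "datatype": XSD_INTEGER}},
]
print(to_tsv(["s", "label", "n"], tsv_rows), end="")
```

Because each term keeps its <>, quotes, language tag or datatype, a
reader can tell URIs from strings and rebuild the original result set,
which is what the CSV form gives up.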

Both formats have proved useful and easy to implement.  The CSV form is 
quite easy to consume for non-RDF applications.

	Andy

[1] http://lists.w3.org/Archives/Public/semantic-web/2010Jan/0300.html
[2] http://lists.w3.org/Archives/Public/semantic-web/2010Jan/0302.html
[3] http://lists.w3.org/Archives/Public/semantic-web/2010Jan/0309.html

Received on Wednesday, 1 June 2011 14:09:40 UTC