- From: Andy Seaborne <andy.seaborne@epimorphics.com>
- Date: Thu, 28 Jul 2011 09:35:58 +0100
- To: public-rdf-dawg@w3.org
On 27/07/11 20:40, Steve Harris wrote:
> Conversely, we are big fan of the TSV format, as written.
>
> We've used a very similar format inside Garlik for 4-5 years, as it's
> very efficient for Javascript/Perl/Python to process, without losing
> any typing information, and also easy for humans to read.
>
> The format has been supported in 4store since its public release, and
> it's reasonably widely used.
>
> The way I look at it is: CSV is for loading into spreadsheets, TSV is
> for processing by bespoke software.
>
> - Steve
>
> On 2011-07-27, at 19:22, Lee Feigenbaum wrote:
>
>> Danny Kahn, a colleague of mine at Cambridge Semantics,

My thanks to Danny for the time spent reviewing the document.

>> looked over
>> http://www.w3.org/2009/sparql/docs/csv-tsv-results/results-csv-tsv.html
>> . He compared it with how we currently implement CSV and TSV
>> results to SPARQL in Anzo.
>>
>> Here are the differences:
>>
>> 1. Both our CSV and TSV formats do not serialize the details of RDF
>> terms.
>>
>> 2. Our implementation optionally includes headers for CSV. We don't
>> use the header=absent content type parameter to indicate this.

How is it optionally controlled? Simply by whatever the sending code
decides?

Just to be clear: "header=absent" is part of RFC 4180, sec 3, "MIME
Type Registration of text/csv", not a feature of the SPARQL CSV result
format. The only thing SPARQL CSV adds is that if the field row is
absent, then "header=absent" must be present, which is not required by
text/csv.

>> 3. Our TSV implementation makes the header line optional, just as
>> with CSV.

http://www.iana.org/assignments/media-types/text/tab-separated-values

says:
"""
The first line of this encoding is special, it contains the name of
each field, separated by tabs.
"""
which I read as it not being optional.

That said, general compliance with TSV and (more so) CSV "specs" is
fairly loose in the wild.
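
To make the text/csv point concrete, here is a minimal Python sketch of how
a client might honour the "header" parameter from RFC 4180, sec 3. The
function name parse_csv_results is made up for illustration, and treating a
missing parameter as "present" is an assumption of this sketch: RFC 4180
leaves that decision to the implementor.

```python
import csv
import io
from email.message import Message


def parse_csv_results(content_type, body):
    """Return (field_names, rows); field_names is None if header=absent.

    Hypothetical helper, not part of any SPARQL library.
    """
    # Reuse the stdlib MIME header parser to read media type parameters.
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/csv":
        raise ValueError("expected text/csv, got " + msg.get_content_type())

    # Assumption: no "header" parameter means the field row is present.
    header = msg.get_param("header", "present")

    rows = list(csv.reader(io.StringIO(body, newline="")))
    if header == "absent":
        # No field row: just a table of strings, no variable names.
        return None, rows
    if not rows:
        raise ValueError("header row expected but body is empty")
    return rows[0], rows[1:]
```

With "text/csv; header=absent" every line is returned as data; otherwise the
first line is split off as the field names.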
>> I have not been that engaged in this discussion yet, but I'm
>> surprised to see these significant differences between CSV and TSV,
>> whereas I normally view these as basically the same format but with
>> a different separating character. I'm not a big fan of the TSV
>> format as currently specified.
>>
>> Looking briefly over the document, I think the section on
>> serializing CSV needs a bit of work -- it seems to specify the
>> order that solution bindings should emitted in terms of the header
>> row, but the header row is optional.

The CSV format without a header line just presents a table of values,
with no variable bindings.

Steve and Greg have argued that it should be mandatory and, absent
further comments, I plan to change the doc to make the header field
line mandatory. A mandatory header line strengthens the
table-of-variable-bindings view.

I've now made this change so as to reflect

http://www.w3.org/2009/sparql/docs/csv-tsv-results/results-csv-tsv.html#csv-table

Let me know if you have any comments on the revised text.

"needs a bit of work" --> do you have other comments in this area?

>> Even in cases where the header
>> row is omitted, rows needs to emit variables in a consistent order,
>> right?

In CSV, if there is no field row, then it is just a table of strings to
be processed by the client application. There's no required
relationship to variables and, in particular, no relationship to the
query SELECT line (SELECT *).

So without a header line, the results are just "some stuff" with no
real constraints, although a client application / query processor pair
can agree on further constraints.

>>
>> Lee

 Andy
Received on Thursday, 28 July 2011 08:36:31 UTC