Re: sparql11-results-csv-tsv

From: Steve Harris <steve.harris@garlik.com>
Date: Tue, 14 Jun 2011 19:45:29 +0100
Cc: public-rdf-dawg@w3.org
Message-Id: <E988E113-DB52-40AB-9128-19A514CC7E3C@garlik.com>
To: Andy Seaborne <andy.seaborne@epimorphics.com>
On 2011-06-14, at 18:20, Andy Seaborne wrote:
> 
> On 14/06/11 18:03, Steve Harris wrote:
>> On 2011-06-14, at 12:38, Andy Seaborne wrote:
>>> 
>>> On 14/06/11 12:04, Steve Harris wrote:
>>>> Quick mini-review:
>>>> 
>>>> Abstract
>>>> 
>>>> 4store answers ASK in TSV as "true", but it should probably be
>>>> "ask\ntrue" or similar.
>>> 
>>> Yes, it would have to be if it's TSV compatible: the header line is needed. Whether a first empty record is safe (in practical terms), I don't know.
>> 
>> I think it would be friendlier on libraries to include a header.
> 
> I agree, it is generally better but there are cases where an implementation may wish not to.  Hence the language is:
> 
> """
> The SPARQL CSV Results Format SHOULD use of a header row. If the header row is not present, this MUST be indicated by content type parameter header=absent.
> """

I don't feel that's really in the spirit of RFC 4180:

     “The "header" parameter indicates the presence or absence of the
      header line.  Valid values are "present" or "absent".
      Implementors choosing not to use this parameter must make their
      own decisions as to whether the header line is present or absent.”

If we're going to make the header row optional then we should recommend that the header= parameter is included, for maximum interoperability with existing CSV consumers.
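
To make the consumer's side of that concrete, here is a rough Python sketch. The function names (header_present, read_results) are made up for illustration, and the default of assuming a header when the parameter is missing is my assumption, which is exactly the guess RFC 4180 leaves to the implementor:

```python
import csv
import io
from email.message import Message


def header_present(content_type, default=True):
    """Return True if the Content-Type says the first row is a header.

    `default` is the consumer's own guess when the parameter is missing
    (an assumption of this sketch, not something any spec mandates).
    """
    msg = Message()
    msg["Content-Type"] = content_type
    value = msg.get_param("header")  # None when the parameter is not given
    if value is None:
        return default
    return value.lower() != "absent"


def read_results(body, content_type):
    """Split a CSV body into (column names or None, data rows)."""
    rows = list(csv.reader(io.StringIO(body)))
    if rows and header_present(content_type):
        return rows[0], rows[1:]
    return None, rows
```

With header=absent in the content type the first row is treated as data; without the parameter the consumer is back to guessing, which is the interoperability problem.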

Are there existing SPARQL implementations that omit the header? If so, under what circumstances? If it's genuinely useful we should provide a mechanism for clients to request that it be omitted, otherwise it's just pot luck.

> SHOULD (RFC 2119) is quite strong
> [[
> there may exist valid reasons in particular circumstances to ignore a
> particular item
> ]]
> 
> and MUST is included so the default is defined.
> 
> Because CSV allows it, I think we should aim for maximum interoperability - We are not registering text/sparql-results+csv.

True, but this is a document explaining how to express SPARQL results in TSV, so any well-defined subset is fair game, in my opinion. If I generate Turtle, but only use . to end triples, it's still text/turtle.

I could see the argument if the aim was to make any valid CSV document a valid SPARQL results document, but I just don't see the value in that.

> A column of numbers for use in a larger spreadsheet table of results (so the column header is meaningless) is such a particular circumstance.

It's not a use case I've come across, and I regularly feed SPARQL results into Excel (as TSV, but the principle is the same). I often replace the header row with something more human-friendly, but I've never wished it was absent. Many useful things in Excel (e.g. LOOKUP/MATCH, pivot tables) expect a header anyway.
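
The same dependence on a header shows up outside Excel. A minimal Python sketch of a LOOKUP-style query, with a made-up function name (lookup) and made-up column names, that only works because the first row names the columns:

```python
import csv
import io


def lookup(tsv_text, key_column, key, value_column):
    """Return value_column from the first row whose key_column equals key.

    Relies on the first line of tsv_text being a header row; without it,
    DictReader would treat the first data row as the column names.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        if row[key_column] == key:
            return row[value_column]
    return None


# Illustrative data only, not real query output.
results = "name\tage\nalice\t30\nbob\t25\n"
print(lookup(results, "name", "bob", "age"))
```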

- Steve

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Tuesday, 14 June 2011 18:46:00 GMT
