- From: Christopher Gutteridge <cjg@ecs.soton.ac.uk>
- Date: Thu, 22 May 2014 14:33:07 +0100
- To: Andy Seaborne <andy@apache.org>, Gregg Kellogg <gregg@greggkellogg.net>
- CC: Ivan Herman <ivan@w3.org>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
On 22/05/2014 12:53, Andy Seaborne wrote:
> On 22/05/14 12:13, Christopher Gutteridge wrote:
>> There's also the issue of a repeated heading. I've encountered that. eg.
>>
>> ID, Title, Contact 1, Email, Contact 2, Email
>
> And indeed no headings and RTL with repeated headings.
>
> Do you think more needs to be said in, say,
>
> http://w3c.github.io/csvw/syntax/index.html#core-tabular-data-model
> or
> http://w3c.github.io/csvw/syntax/index.html#headers
>
> ?
>
> (Looking at that, I was expecting #core-tabular-data-model to say that
> "columns MAY have titles")
>

The idea of specifying headers or not in the mimetype is all very well, but that rather assumes we have a mimetype and that our data started as CSV (most of mine starts life as Excel or SharePoint exports).

I think that repeated headers are an edge case, but it would be helpful to define a default behaviour.

- One option would be to fall back to treating it as an unheaded column.
- Another would be to append -2, -3, etc. based on repeats. This could cause an issue if someone maliciously made headings X, X, X-2, as then you'd end up with X, X-2, X-2 and still have a clash.
- Another would be to append the column number to the repeated heading.
- To ignore data from that column entirely.
- To throw an error and refuse to process it.

This should also include a way to address empty headings, e.g. if there was a heading row, but column 7 didn't have a heading but did contain data.

-- 
Christopher Gutteridge -- http://users.ecs.soton.ac.uk/cjg

University of Southampton Open Data Service: http://data.southampton.ac.uk/
You should read the ECS Web Team blog: http://blogs.ecs.soton.ac.uk/webteam/
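
For illustration only, here is a minimal Python sketch of the "append -2, -3" option, with one assumption added to sidestep the X, X, X-2 clash: every heading that literally appears in the row is reserved before any suffixed name is generated. The function name `dedupe_headings` and the `col<N>` placeholder for empty headings are hypothetical choices for this sketch, not anything taken from the CSVW drafts.

```python
def dedupe_headings(headings):
    """Return one unique name per column (hypothetical helper).

    Repeated headings get -2, -3, ... appended; empty headings get a
    placeholder based on the 1-based column number (an assumption).
    """
    cleaned = [h.strip() for h in headings]
    # Reserve every literal heading first, so a generated "X-2" can never
    # collide with a real "X-2" that appears later in the row.
    taken = {h for h in cleaned if h}
    seen = set()
    result = []
    for position, name in enumerate(cleaned, start=1):
        if name and name not in seen:
            # First occurrence of a literal heading is kept unchanged.
            seen.add(name)
            result.append(name)
            continue
        # Either a repeat of an earlier heading or an empty heading.
        base = name or f"col{position}"
        candidate = base
        suffix = 1
        while candidate in taken:
            suffix += 1
            candidate = f"{base}-{suffix}"
        taken.add(candidate)
        result.append(candidate)
    return result


print(dedupe_headings(["ID", "Title", "Contact 1", "Email", "Contact 2", "Email"]))
print(dedupe_headings(["X", "X", "X-2"]))
```

With this approach the second `Email` becomes `Email-2`, and in the malicious case the repeated `X` skips past the reserved `X-2` and becomes `X-3`. The other options in the list (ignore the column, or throw an error) would replace the renaming branch rather than extend it.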
Received on Thursday, 22 May 2014 13:34:14 UTC