
Re: CSV test cases

From: Alfredo Serafini <seralf@gmail.com>
Date: Wed, 12 Feb 2014 14:06:31 +0100
Message-ID: <CADawF4PwbuQNZ+uOW0YwLTO1JXzROZCJX4PQVjpXGZqCOEHaig@mail.gmail.com>
To: Alf Eaton <eaton.alf@gmail.com>
Cc: "public-csv-wg@w3.org" <public-csv-wg@w3.org>
Great collection of examples!

Alfredo


2014-02-12 13:54 GMT+01:00 Alf Eaton <eaton.alf@gmail.com>:

> One problem with sampling existing CSV files is that they’re quite
> likely to already be well structured, and limited in what they do by
> the existing constraints of the CSV format.
>
> What’s arguably more useful is to sample the range of Excel files that
> have been published, to see if there's more that needs to be
> supported. To start with, I’ve produced a list of URLs of Excel files
> that have been published as supporting information for articles on
> nature.com: https://gist.github.com/hubgit/8954821/
>
> (Run `wget -i https://gist.github.com/hubgit/8954821/raw/nature-xls.txt`
> to fetch them all).
>
> These files show the wide range of structures that authors actually add
> to tabular data, many of which are possible in HTML tables but not in
> CSV files. Perhaps a JSON file accompanying a CSV file could cover
> some of these features?
>
> Examples of features found in Excel spreadsheets published as
> supporting data for journal articles:
>
> * Table description and comment rows (sometimes starting with #) at
> the start of the sheet
> * Multiple tables in the same sheet, with a title row for each table
> * Merged cells, spanning multiple rows or columns
> * Text formatting (bold, italic), e.g. species names, or to show
> significance
> * Cell formatting (background colours), to highlight grouping or patterns
> * Caption (description), footer, footnotes
> * Subheadings/subsections within a single table, often with indented
> headings
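
As a minimal sketch of the JSON-sidecar idea raised above, the Python below pairs a small CSV with a JSON description that records leading comment rows, merged cells and text formatting. Everything in it is an assumption made for illustration: the key names (`comment_prefix`, `merged_cells`, `formatting`), the `load_table` helper and the sample data are hypothetical and are not taken from any proposal on this list or from the nature.com files.

```python
import csv
import io
import json

# Hypothetical CSV exported from a spreadsheet: two "#" comment rows,
# then a header row and data. The blank species cell stands in for a
# cell that was merged with the one above it in the original sheet.
CSV_TEXT = """\
# Supplementary Table 1
# Expression levels by species
species,gene,level
Homo sapiens,BRCA1,12.5
,TP53,8.1
"""

# Hypothetical JSON sidecar; every key name here is an assumption.
SIDECAR_JSON = """
{
  "comment_prefix": "#",
  "merged_cells": [{"column": "species", "first_row": 0, "last_row": 1}],
  "formatting": [{"column": "species", "style": "italic"}]
}
"""


def load_table(csv_text, sidecar_json):
    """Return (comments, rows, formatting) using hints from the sidecar."""
    meta = json.loads(sidecar_json)
    prefix = meta.get("comment_prefix")
    comments, data_lines = [], []
    for line in csv_text.splitlines():
        # Only rows before the first data line count as table description.
        if prefix and not data_lines and line.startswith(prefix):
            comments.append(line[len(prefix):].strip())
        else:
            data_lines.append(line)
    rows = list(csv.DictReader(io.StringIO("\n".join(data_lines))))
    # Expand merged cells by copying the first row's value down the span.
    for span in meta.get("merged_cells", []):
        value = rows[span["first_row"]][span["column"]]
        for i in range(span["first_row"], span["last_row"] + 1):
            rows[i][span["column"]] = value
    return comments, rows, meta.get("formatting", [])


comments, rows, formatting = load_table(CSV_TEXT, SIDECAR_JSON)
```

The CSV stays readable by any ordinary parser, while the sidecar carries the structure that CSV alone cannot express. Multiple tables per sheet, footnotes and subheadings would need further keys that this sketch does not attempt.
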
>
> Alf
>
> On 11 February 2014 16:21, Dan Brickley <danbri@google.com> wrote:
> > On 11 February 2014 16:03, Jeni Tennison <jeni@jenitennison.com> wrote:
> >> Of interest to this group, this work from Max Ogden on putting together
> >> a set of test cases for CSV parsers:
> >>
> >>   https://github.com/maxogden/csv-spectrum
> >
> > Oh, that's great. I went through the Open Office source tree last week
> > looking for similar, but didn't find anything suitable.
> >
> > Dan
> >
> >> Jeni
> >> --
> >> Jeni Tennison
> >> http://www.jenitennison.com/
> >>
> >
Received on Wednesday, 12 February 2014 13:06:59 UTC
