- From: Alfredo Serafini <seralf@gmail.com>
- Date: Wed, 12 Feb 2014 14:06:31 +0100
- To: Alf Eaton <eaton.alf@gmail.com>
- Cc: "public-csv-wg@w3.org" <public-csv-wg@w3.org>
- Message-ID: <CADawF4PwbuQNZ+uOW0YwLTO1JXzROZCJX4PQVjpXGZqCOEHaig@mail.gmail.com>
great collection of examples!

Alfredo

2014-02-12 13:54 GMT+01:00 Alf Eaton <eaton.alf@gmail.com>:

> One problem with sampling existing CSV files is that they’re quite
> likely to already be well structured, and limited in what they do by
> the existing constraints of the CSV format.
>
> What’s arguably more useful is to sample the range of Excel files that
> have been published, to see if there's more that needs to be
> supported. To start with, I’ve produced a list of URLs of Excel files
> that have been published as supporting information for articles on
> nature.com: https://gist.github.com/hubgit/8954821/
>
> (Run `wget -i https://gist.github.com/hubgit/8954821/raw/nature-xls.txt`
> to fetch them all).
>
> These files show a wide range of structure that authors actually add
> to tabular data, many of which are possible in HTML tables but not in
> CSV files. Perhaps a JSON file accompanying a CSV file may be able to
> cover some of these features?
>
> Examples of features found in Excel spreadsheets published as
> supporting data for journal articles:
>
> * Table description and comment rows (sometimes starting with #) at
>   the start of the sheet
> * Multiple tables in the same sheet, with a title row for each table
> * Merged cells, spanning multiple rows or columns
> * Text formatting (bold, italic), e.g. species names, or to show
>   significance
> * Cell formatting (background colours), to highlight grouping or patterns
> * Caption (description), footer, footnotes
> * Subheadings/subsections within a single table, often with indented
>   headings
>
> Alf
>
> On 11 February 2014 16:21, Dan Brickley <danbri@google.com> wrote:
> > On 11 February 2014 16:03, Jeni Tennison <jeni@jenitennison.com> wrote:
> >> Of interest to this group, this work from Max Ogden on putting together
> >> a set of test cases for CSV parsers:
> >>
> >> https://github.com/maxogden/csv-spectrum
> >
> > Oh, that's great. I went through the Open Office source tree last week
> > looking for similar, but didn't find anything suitable.
> >
> > Dan
> >
> >> Jeni
> >> --
> >> Jeni Tennison
> >> http://www.jenitennison.com/
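
To make the "JSON file accompanying a CSV file" idea from the quoted message a little more concrete, here is a minimal sketch in Python. It only covers the first feature in the list (comment rows starting with # at the top of the sheet) plus a caption. The function names and the metadata keys (`caption`, `comments`, `columns`, `row_count`) are hypothetical, chosen for illustration; they do not come from any spec or from the thread.

```python
import csv
import io
import json


def split_comments(text, prefix="#"):
    """Separate leading comment rows from the tabular part of a CSV text."""
    lines = text.splitlines()
    comments = []
    i = 0
    # Only rows at the very start of the file are treated as comments.
    while i < len(lines) and lines[i].startswith(prefix):
        comments.append(lines[i][len(prefix):].strip())
        i += 1
    return comments, "\n".join(lines[i:])


def describe(text, caption=None):
    """Build a hypothetical JSON-serialisable description of a CSV text."""
    comments, body = split_comments(text)
    rows = list(csv.reader(io.StringIO(body)))
    header = rows[0] if rows else []
    return {
        "caption": caption,                    # assumed key, not standardised
        "comments": comments,                  # leading '#' rows, prefix stripped
        "columns": header,                     # first non-comment row
        "row_count": max(len(rows) - 1, 0),    # data rows, excluding the header
    }


sample = (
    "# Supplementary Table 1\n"
    "# Values are means\n"
    "species,count\n"
    "Homo sapiens,3\n"
    "Mus musculus,5\n"
)

metadata = describe(sample, caption="Example table")
print(json.dumps(metadata, indent=2))
```

Features such as merged cells, text formatting or multiple tables per sheet would need a richer structure than this; the sketch only shows that information dropped by plain CSV could be carried in a sidecar file.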
Received on Wednesday, 12 February 2014 13:06:59 UTC