
Re: Use Case: Multiple Data Sets in Single File

From: Eric <ericphb@gmail.com>
Date: Sat, 22 Feb 2014 15:17:37 -0800
Cc: Craig Russell <craig@craig-russell.co.uk>, Alf Eaton <eaton.alf@gmail.com>, "public-csv-wg@w3.org" <public-csv-wg@w3.org>
Message-Id: <8FAEE242-AB31-4759-94B0-6DF1F86C4FE3@gmail.com>
To: Chris Metcalf <chris.metcalf@socrata.com>
Good point, Chris. Welcome to the group. I agree. In practice, archives commonly use tarballs or zip files to bundle collections of many smaller files. This reduces the inode count in the file system.

Eric
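
To make the zip-plus-manifest idea from Chris's message concrete, here is a minimal Python sketch. The thread does not specify a manifest format, so the file name manifest.json, its "datasets" / "path" / "rows" keys, and the helper names write_bundle and read_bundle are all assumptions made for illustration only.

```python
import csv
import io
import json
import zipfile

# Hypothetical manifest name and layout; the thread does not define one.
MANIFEST_NAME = "manifest.json"


def write_bundle(datasets):
    """Pack {filename: rows} into an in-memory zip with a JSON manifest."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        entries = []
        for name, rows in datasets.items():
            # Serialise each dataset as its own .csv member.
            text = io.StringIO(newline="")
            csv.writer(text).writerows(rows)
            zf.writestr(name, text.getvalue())
            # "rows" counts every row, including any header row.
            entries.append({"path": name, "rows": len(rows)})
        # The manifest indexes the contents of the archive.
        zf.writestr(MANIFEST_NAME, json.dumps({"datasets": entries}))
    return buf.getvalue()


def read_bundle(data):
    """Return {filename: rows} for every dataset listed in the manifest."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        manifest = json.loads(zf.read(MANIFEST_NAME))
        result = {}
        for entry in manifest["datasets"]:
            text = zf.read(entry["path"]).decode("utf-8")
            reader = csv.reader(io.StringIO(text, newline=""))
            result[entry["path"]] = list(reader)
        return result
```

A consumer reads the manifest first and then opens only the members it lists, so each data set stays an ordinary, separately parseable CSV file.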

> On Feb 21, 2014, at 11:17 AM, Chris Metcalf <chris.metcalf@socrata.com> wrote:
> 
> I haven't formally introduced myself to the group yet, and I'm probably not
> fully caught up, but why not a zip file (or tarball) containing each
> dataset as a distinct .csv file along with a manifest file indexing the
> contents? That's a format that has become fairly common across the
> industry, and is easily machine unpackable and readable.
> 
> Thanks,
> 
> Chris Metcalf
> Director of Platform
> 
> chris.metcalf@socrata.com
> http://www.socrata.com
> 
> 
> 
>> On Fri, Feb 21, 2014 at 5:48 AM, Craig Russell <craig@craig-russell.co.uk> wrote:
>> It's simple, yes, but is it the most appropriate for all contexts? I think the question of how best to represent multiple data sets (and metadata) in a single file is worth thinking about.
>> 
>>> On 20 February 2014 17:16, Alf Eaton <eaton.alf@gmail.com> wrote:
>>> On 20 February 2014 10:39, Craig Russell <craig@craig-russell.co.uk> wrote:
>>> 
>>> > The downloaded CSV file may include multiple data sets, which are separated by a couple of line breaks (example attached).
>>> 
>>> > There is, at present, no clear machine readable way of differentiating these two data sets within a single file.
>>> 
>>> I like the simplicity of using 2 (or more) line breaks to imply a
>>> separation between tables (as well as to imply the separation between
>>> descriptive header text and the actual table).
>>> 
>>> Alf
>> 
>> 
>> 
>> -- 
>> Craig Russell
>> e: craig@craig-russell.co.uk
>> w: craig-russell.co.uk
>> t: @craig552uk
> 
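
For comparison, the blank-line convention Alf describes in the quoted thread could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the function name split_tables is hypothetical, and the sketch assumes no quoted field contains an empty line, which is one context where the simple approach would not hold.

```python
import csv
import io
import re


def split_tables(text):
    """Split CSV text into tables at runs of two or more line breaks."""
    # Normalise line endings so the separator test only deals with "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Two or more consecutive line breaks mark a boundary between blocks.
    blocks = re.split(r"\n{2,}", text.strip("\n"))
    tables = []
    for block in blocks:
        if not block.strip():
            continue  # skip blocks that hold only whitespace
        reader = csv.reader(io.StringIO(block, newline=""))
        tables.append(list(reader))
    return tables
```

Descriptive header text above a table comes back as its own block, so a caller still has to decide which blocks are data and which are commentary.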
Received on Saturday, 22 February 2014 23:18:07 UTC
