Re: Use Case: Multiple Data Sets in Single File

Good point, Chris.  Welcome to the group.  I agree.  In fact, archives commonly use tarballs or zip files as a practical way to bundle collections of smaller files.  This reduces the inode count in the file system.
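
For illustration, here is a minimal sketch of such a bundle, using only Python's standard zipfile, csv, and json modules. The manifest.json file name, its fields, and the write_bundle/read_bundle helpers are hypothetical, not an agreed format, and the row count assumes each data set starts with a header row.

```python
import csv
import io
import json
import zipfile


def write_bundle(datasets):
    """Pack {name: rows} into an in-memory zip: one CSV per data set plus a manifest."""
    buf = io.BytesIO()
    manifest = {"datasets": []}  # hypothetical manifest layout
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, rows in datasets.items():
            text = io.StringIO(newline="")
            csv.writer(text).writerows(rows)
            path = f"{name}.csv"
            zf.writestr(path, text.getvalue())
            manifest["datasets"].append(
                # "rows" counts data rows, assuming the first row is a header
                {"name": name, "path": path, "rows": len(rows) - 1}
            )
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))
    return buf.getvalue()


def read_bundle(data):
    """Use the manifest to locate each CSV and return {name: rows}."""
    out = {}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        for entry in manifest["datasets"]:
            text = zf.read(entry["path"]).decode("utf-8")
            out[entry["name"]] = list(csv.reader(io.StringIO(text, newline="")))
    return out


bundle = write_bundle({
    "sales": [["id", "amount"], ["1", "10"]],
    "staff": [["name"], ["Ann"], ["Bob"]],
})
tables = read_bundle(bundle)
```

A reader only needs to open the manifest to learn how many data sets there are and where each one lives, so no guessing about blank-line separators is required.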

Eric

> On Feb 21, 2014, at 11:17 AM, Chris Metcalf <chris.metcalf@socrata.com> wrote:
> 
> I haven't formally introduced myself to the group yet, and I'm probably not
> fully caught up, but why not a zip file (or tarball) containing each
> dataset as a distinct .csv file along with a manifest file indexing the
> contents? That's a format that has become fairly common across the
> industry, and is easily machine unpackable and readable.
> 
> Thanks,
> 
> Chris Metcalf
> Director of Platform
> 
> chris.metcalf@socrata.com
> http://www.socrata.com
> 
> 
> 
>> On Fri, Feb 21, 2014 at 5:48 AM, Craig Russell <craig@craig-russell.co.uk> wrote:
>> It's simple, yes, but is it the most appropriate for all contexts? I think the question of how best to represent multiple data sets (and metadata) in a single file is worth thinking about.
>> 
>>> On 20 February 2014 17:16, Alf Eaton <eaton.alf@gmail.com> wrote:
>>> On 20 February 2014 10:39, Craig Russell <craig@craig-russell.co.uk> wrote:
>>> 
>>> > The downloaded CSV file may include multiple data sets, which are separated by a couple of line breaks (example attached).
>>> 
>>> > There is, at present, no clear machine readable way of differentiating these two data sets within a single file.
>>> 
>>> I like the simplicity of using 2 (or more) line breaks to imply a
>>> separation between tables (as well as to imply the separation between
>>> descriptive header text and the actual table).
>>> 
>>> Alf
>> 
>> 
>> 
>> -- 
>> Craig Russell
>> e: craig@craig-russell.co.uk
>> w: craig-russell.co.uk
>> t: @craig552uk
> 

Received on Saturday, 22 February 2014 23:18:07 UTC