Re: DataStore, Layers and legacy files from David Booth on 2015-03-04 (public-csv-wg@w3.org from March 2015)

From: David Booth <david@dbooth.org>
Date: Tue, 03 Mar 2015 19:42:23 -0500
To: Paul Klink <paul@klink.id.au>, public-csv-wg@w3.org
Message-ID: <54F654EF.2020004@dbooth.org>

Hi Paul,

If I've understood correctly, it sounds like you are suggesting that the 
W3C work define or adopt interface layers (perhaps like the ones defined 
by Microsoft), so that different kinds of data sources (including CSV 
files) could be accessed via a common interface.

That certainly sounds very useful to programmers, but I think it goes 
significantly beyond what this W3C working group is chartered to 
achieve.  I think the goal of this group is to define standard 
conventions that enable the *meaning* of a CSV file to be determined 
automatically, independent of any API.  A library implementing an API 
like you describe would probably build on the work of this group, by 
implementing the standard that it defines.

With that said, it sounds like the work of this group may at least 
partially address the transport use case that you mention, because part 
of the issue in transport involves somehow communicating the meaning of 
the data to the recipient, and the work of this group does address that.

FYI, here's the working group charter:
http://www.w3.org/2013/05/lcsv-charter.html

Thanks!
David Booth

On 02/25/2015 03:40 AM, Paul Klink wrote:
> Hi all,
>
> I am the guy working on the "Fielded Text" standard mentioned in
> December.  I just did an update to the standard
> (http://www.fieldedtext.org/Standard) and while doing so, gave some
> thought regarding the aims of Fielded Text compared to the aims of "CSV
> on the Web".
>
> Fielded Text focuses on standardising the encoding and decoding of
> tabular data into text for transport purposes.  It aims to support as
> wide a range of text formats as possible (delimited and fixed width) and
> to provide as much compatibility with existing text files as possible.
> The Meta in Fielded Text is limited to only that which is needed for
> encoding and decoding. It recognises attributes or behaviour that are
> implicit in text files with tabular data (e.g. headings, comments,
> null values) and only adds a few considered essential to encoding (e.g.
> typed fields, field names and Ids).  It also adds some attributes to
> support round tripping (e.g. write formats).
>
> A good analogy to Fielded Text is string encoding/decoding.  If you want
> to move text from one system to a different system, you will encode it
> to one of the well known formats (say UTF-8 or an MBCS). The person at
> the other end will then be able to decode it using standardised methods to
> import the text into their system.
>
> As I see it, "CSV on the Web" is more focused on publishing (as opposed
> to transport).  The Meta data for "CSV on the Web" assigns a far greater
> number of attributes to the tabular data. The aim with this seems to be
> to provide more information about the data within the files, describe
> linkages between files, assist with transformations and control access
> to them.  In my view it seems like it's aiming to be a Text Database
> focused on publishing, using CSV as the data store.
>
> After I considered the above, I realised that Fielded Text covers a
> subset of "CSV on the Web": specifically, access to tabular data in the
> data store.
>
> In .NET Microsoft defined a couple of interfaces which could be
> construed as providing layers to data store access.  These are
> IDataRecord and IDataReader.
>
> These are documented at:
> - https://msdn.microsoft.com/en-us/library/system.data.idatarecord%28v=vs.110%29.aspx
> - https://msdn.microsoft.com/en-us/library/system.data.idatareader%28v=vs.110%29.aspx
>
> It was surprisingly easy to implement these interfaces in my
> implementation of FieldedText:
> http://sourceforge.net/p/tfieldedtext/code/ci/default/tree/delphi/2/Xilytix.FieldedText.DotNetDataReader.pas
>
>
> After having said all of the above, here is a suggestion.
>
> If "CSV on the Web" defined layers similar to the above for
> accessing the Data Store, other standards such as Fielded Text could be
> used to specify the implementation of the Data Store.
>
> For example, "CSV on the Web" Meta would define a field's name, data
> type and headings and then Fielded Text's Meta would define how that
> field is actually stored (Delimited or Fixed Width, delimiter character,
> format picture strings).
>
> The upside would be access to different types of data stores,
> potentially providing access to a large number of 'legacy' text files.
> The downside is that the standard is less constrained, and
> implementations are harder to write or may not provide
> complete coverage.
>
> Anyway, I am just floating it as an idea.  Hopefully you consider it
> relevant.
>
> Regards
> Paul
>
Received on Wednesday, 4 March 2015 00:42:52 UTC