Re: CSVs and provenance from Alfredo Serafini on 2014-02-28 (public-csv-wg@w3.org from February 2014)

From: Alfredo Serafini <seralf@gmail.com>
Date: Fri, 28 Feb 2014 10:23:08 +0100
To: Yakov Shafranovich <yakov-ietf@shaftek.org>
Cc: "Ceolin, D." <d.ceolin@vu.nl>, Eric Stephan <ericphb@gmail.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <CADawF4MSU1fj0ipJ7p7PbZOZ=XbBpXQq3evqX7Efk0Tgezu+fA@mail.gmail.com>

a "processing agent" or something similar is a great idea!
I think it could even be useful for RDF serialization formats, which often
have strange framing or conventions!
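
A rough sketch of how such a "processing agent" record might look, assuming
it is kept in a JSON sidecar next to the CSV rather than inside it. The
sidecar layout, the "processing_agent" key, and the function names are all
hypothetical; nothing here comes from a CSV or W3C specification.

```python
import csv
import io
import json


def write_csv_with_agent(rows, agent_name, agent_version):
    """Return (csv_text, sidecar_json) for the given rows.

    The sidecar layout and its "processing_agent" key are hypothetical,
    chosen only to illustrate the idea discussed in this thread.
    """
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    sidecar = json.dumps(
        {"processing_agent": {"name": agent_name, "version": agent_version}},
        sort_keys=True,
    )
    return buf.getvalue(), sidecar


def read_agent(sidecar_json):
    """Return 'name/version', in the style of an HTTP User-Agent token."""
    agent = json.loads(sidecar_json)["processing_agent"]
    return f"{agent['name']}/{agent['version']}"


# "ExampleSheet" is a made-up tool name used only for this example.
csv_text, sidecar = write_csv_with_agent(
    [["id", "total"], ["1", "42"]], "ExampleSheet", "1.0"
)
print(read_agent(sidecar))
```

A consumer could then branch on the reported tool and version when deciding
how to interpret quoting, encodings, or other conventions in the file.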

Alfredo


2014-02-28 5:23 GMT+01:00 Yakov Shafranovich <yakov-ietf@shaftek.org>:

> I think something similar to the concept of "User-Agent" in HTTP or
> email would be helpful. Knowing what software and version generated a
> given CSV file would help to interpret it.
>
> Not sure if this fits within the concept of provenance.
>
> Yakov
>
> On Thu, Feb 27, 2014 at 2:39 PM, Ceolin, D. <d.ceolin@vu.nl> wrote:
> > Hi Eric,
> >
> > I should have something, but not much. So yes please, that would be
> > very helpful.
> > Thanks,
> >
> > Davide
> >
> > On 27 Feb 2014, at 15:48, Eric Stephan wrote:
> >
> >> Davide,
> >>
> >> Great idea, I feel this is very important and a huge problem for
> >> anyone who has to maintain a CSV and track changes.  I'd love to see a
> >> use case on this.  If you need any help with a real world use case let
> >> me know, there are plenty in the science arena.
> >>
> >>
> >> Eric
> >>
> >> On Thu, Feb 27, 2014 at 1:01 AM, Ceolin, D. <d.ceolin@vu.nl> wrote:
> >>> Hi all,
> >>>
> >>> I've seen some hints of provenance around, but I'd like to tackle
> >>> the problem a little more deeply.
> >>> I believe that there are at least two provenance issues that are
> >>> related to each other and that probably need standardized handling:
> >>> - if a CSV file is obtained from a spreadsheet, it's likely that one
> >>> or more 'cells' result from formulas applied to other cells in the
> >>> same CSV. Probably (a simplified version of) PROV is a good candidate
> >>> to represent such relations? If I'm not wrong, there was some related
> >>> discussion floating around in the chat two telcos ago (about "sum"
> >>> cells?).
> >>> - also, the whole CSV file may be the result of a specific process,
> >>> especially if it represents a DB dump and/or the result of a
> >>> computation. It would be useful to be able to annotate these files
> >>> with their provenance.
> >>>
> >>> I'm not sure if this is in the scope of the working group, but I
> >>> believe that at least part of it is.
> >>> Cheers,
> >>>
> >>> Davide
> >>>
> >>>
> >
> >
>
>

Received on Friday, 28 February 2014 09:23:36 UTC