Re: CSVs and provenance from Yakov Shafranovich on 2014-02-28 (public-csv-wg@w3.org from February 2014)

From: Yakov Shafranovich <yakov-ietf@shaftek.org>
Date: Fri, 28 Feb 2014 10:03:13 -0500
To: "Ceolin, D." <d.ceolin@vu.nl>
Cc: Eric Stephan <ericphb@gmail.com>, W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <CAPQd5oRaBf_YjAu+kzxzx39evZbvHOpuh4ziS57f8U-_K9GTEw@mail.gmail.com>

I think the second one is more like the meta element "generator" in
HTML5 and the first would be the user agent in HTTP:

http://www.w3.org/TR/html5/document-metadata.html#meta
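
A minimal sketch of how the two activities from Davide's message quoted below (generation and delivery) might be written down, assuming PROV-N-style statements held as plain strings. The function name, the "ex:" identifiers and the product strings are hypothetical and chosen only for illustration; nothing here is a format the group has agreed on.

```python
# Hypothetical identifiers throughout: the "ex:" names and the product
# strings are made up for illustration, not taken from the thread.

def prov_statements(csv_id, generator, user_agent):
    """Return PROV-N-style statements describing one CSV file.

    generator  -- software that produced the file (like HTML5 meta "generator")
    user_agent -- software that delivered it (like the HTTP User-Agent header)
    """
    return [
        f"entity({csv_id})",
        # Generation: determines the values contained in the file.
        "activity(ex:generation)",
        f'agent(ex:generator, [prov:label="{generator}"])',
        "wasAssociatedWith(ex:generation, ex:generator)",
        f"wasGeneratedBy({csv_id}, ex:generation)",
        # Delivery: handles request and transfer, and may affect rendering.
        "activity(ex:delivery)",
        f'agent(ex:deliverer, [prov:label="{user_agent}"])',
        "wasAssociatedWith(ex:delivery, ex:deliverer)",
        f"used(ex:delivery, {csv_id})",
    ]


statements = prov_statements("ex:dataset", "ExampleSheet/1.2", "ExampleClient/0.9")
print("\n".join(statements))
```

Keeping the two agents separate reflects the point that they may coincide but do not have to; passing the same product string for both covers the case where one tool does both jobs.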

Yakov

On Fri, Feb 28, 2014 at 9:07 AM, Ceolin, D. <d.ceolin@vu.nl> wrote:
> I think that an HTTP user agent handles the request and delivery, but not necessarily the creation of the CSV.
> If I understand correctly, we can identify at least two activities (prov:Activity):
>
> - delivery: the activity of delivering the CSV, attributed to an agent (the HTTP user agent by default?). It can affect the rendering, etc.
> - generation: the activity of generating the CSV. It determines the values contained in the file, etc.
>
> The two may share one or more elements (e.g. the agent controlling them), but this is not mandatory.
> Did I miss anything?
>
> Davide
>
>
> On 28 Feb 2014, at 14:21, Yakov Shafranovich wrote:
>
>> Here is HTTP's definition of User-Agent in RFC 2616, section 14.43:
>>
>> https://www.ietf.org/rfc/rfc2616.txt
>>
>> Yakov
>>
>> On Fri, Feb 28, 2014 at 1:07 AM, Eric Stephan <ericphb@gmail.com> wrote:
>>> Yakov,
>>>
>>> Yes it does fit within the concept of provenance, and yes I think it
>>> would be good to capture.
>>>
>>> Cheers,
>>>
>>> Eric
>>>
>>> On Thu, Feb 27, 2014 at 8:23 PM, Yakov Shafranovich
>>> <yakov-ietf@shaftek.org> wrote:
>>>> I think something similar to the concept of "User-Agent" in HTTP or
>>>> email would be helpful. Knowing what software and version generated a
>>>> given CSV file would help to interpret it.
>>>>
>>>> Not sure if this fits within the concept of provenance.
>>>>
>>>> Yakov
>>>>
>>>> On Thu, Feb 27, 2014 at 2:39 PM, Ceolin, D. <d.ceolin@vu.nl> wrote:
>>>>> Hi Eric,
>>>>>
>>>>> I should have something, but not much. So yes please, that would be very helpful.
>>>>> Thanks,
>>>>>
>>>>> Davide
>>>>>
>>>>> On 27 Feb 2014, at 15:48, Eric Stephan wrote:
>>>>>
>>>>>> Davide,
>>>>>>
>>>>>> Great idea, I feel this is very important and a huge problem for
>>>>>> anyone who has to maintain a CSV and track changes.  I'd love to see a
>>>>>> use case on this.  If you need any help with a real world use case let
>>>>>> me know, there are plenty in the science arena.
>>>>>>
>>>>>>
>>>>>> Eric
>>>>>>
>>>>>> On Thu, Feb 27, 2014 at 1:01 AM, Ceolin, D. <d.ceolin@vu.nl> wrote:
>>>>>>> Hi all,
>>>>>>>
>>>>>>> I've seen some hints of provenance around, but I'd like to dig a little deeper into the problem.
>>>>>>> I believe that there are at least two provenance issues that are related to each other and that probably need standardized handling:
>>>>>>> - if a CSV file is obtained from a spreadsheet, it's likely that one or more 'cells' result from formulas applied to other cells in the same CSV. (A simplified version of) PROV is probably a good candidate to represent such relations. If I'm not mistaken, there was some related discussion in the chat two telecons ago (about "sum" cells?).
>>>>>>> - also, the whole CSV file may be the result of a specific process, especially if it represents a DB dump and/or the result of a computation. It would be useful to be able to annotate these files with their provenance.
>>>>>>>
>>>>>>> I'm not sure if this is in the scope of the working group, but I believe that at least part of it is.
>>>>>>> Cheers,
>>>>>>>
>>>>>>> Davide
>>>>>>>
>>>>>>>
>>>>>
>>>>>
>

Received on Friday, 28 February 2014 15:04:12 UTC