
Re: dwbp-ISSUE-94 (Git for data): Dataset versioning and dataset replication [Use Cases & Requirements Document]

From: Annette Greiner <amgreiner@lbl.gov>
Date: Tue, 18 Nov 2014 11:27:38 -0800
Message-Id: <8CE0D584-8E19-4881-9C87-D0BB98F48A55@lbl.gov>
To: DWBP Public List <public-dwbp-wg@w3.org>
+1
--
Annette Greiner
NERSC Data and Analytics Services
Lawrence Berkeley National Laboratory
510-495-2935

On Nov 18, 2014, at 10:48 AM, Yaso <yaso@nic.br> wrote:

> Hi all
> 
> I think we can address ISSUE-94.
> 
> Hosting datasets in a Git repository can be one (good) way to provide
> provenance to track data, but the same is true of wiki pages, for
> example. (It's not usual, but it can be done in specific cases.)
> 
> I propose that we agree that the working group cannot recommend any
> tool, although this does not exclude using the use cases that were
> raised in the discussions to derive requirements.
> 
> Does anyone have any comments?
> 
> yaso
> 
> 
> 
> 
> 
> 
> On 11/13/14 11:49 AM, Augusto Herrmann wrote:
>> Hi.
>> 
>> Another good example of using git for data is the directory of public
>> bodies of governments all over the world that OKFn has been curating [1][2].
>> 
>> I agree with Annette's argument that tools in this field are rapidly
>> evolving, and the WG should probably not recommend a particular tool as a
>> BP at this point.
>> 
>> Also relevant to this discussion is Max Ogden's `dat` tool, which intends
>> to be a 'git for data' [3][4]. Looks promising.
>> 
>> [1] http://publicbodies.org/
>> [2] https://github.com/okfn/publicbodies
>> [3] http://www.wired.com/2014/08/dat/
>> [4] https://github.com/maxogden/dat
>> 
>> Best regards,
>> Augusto Herrmann
>> 
>> On Wed, Nov 12, 2014 at 10:38 AM, Yaso <yaso@nic.br> wrote:
>> 
>>> 
>>> On 11/11/14, 7:29 PM, Annette Greiner wrote:
>>>> we need to draw a line between data management practices and data
>>>> publishing practices.
>>> 
>>> Agree!
>>> 
>>> But it's a thin line. We can achieve these (possible) best practices
>>> either with a vocab or with a versioning document system (Git, Hg, or even
>>> a wiki with YAML). I'm wondering if these items are not data management
>>> practices AND publishing practices...
>>> 
>>> - track changes in data
>>> - provide the possibility to review the history of changes
>>> - provide an audit trail
>>> - get access to any previous version of the data, not only to the most
>>>   recent version
>>> 
>>> Agreed about "get dataset updates more efficiently" being a
>>> management practice only. For now, at least :-)
>>> 
>>> yaso
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> Brazilian Internet Steering Committee - CGI.br
>>> W3C Brazil Office
>>> @yaso - yaso.eu
>>> 
>>> 55 11 5509-3537 (4025)
>>> skype: yasocordova
>>> 
>>> 
>> 
> 
> 
> -- 
> Brazilian Internet Steering Committee - CGI.br
> W3C Brazil Office
> @yaso - yaso.eu
> 
> 55 11 5509-3537 (4025)
> skype: yasocordova
Received on Tuesday, 18 November 2014 19:28:28 UTC
