- From: Karen Coyle <kcoyle@kcoyle.net>
- Date: Tue, 26 Sep 2017 13:18:13 -0700
- To: public-dxwg-wg@w3.org
Thanks to Andrea's email, ... see below

On 9/26/17 10:39 AM, Karen Coyle wrote:
> Here's a (much) more coherent statement of something I started to say
> during the meeting yesterday but didn't have my thoughts together.
>
> I created use case 47[1] because I felt that there is an unspoken
> assumption behind the discussion of "versions" - which is that each
> version is a complete replacement for the previous one(s). That is how I
> read the statement about the version delta: "indicating the "type" of
> change (addition/removal/update of data etc.)"[2] The implied subject of
> that is a single dataset that has been changed. If that is the case,
> then we can use "version" in that way. However, there are other
> situations that are not captured by that definition but that will arise
> in practice.
>
> The example I gave in use case 47 is one in which there is a master
> dataset, and additions and changes to that dataset are issued in
> transaction files. A transaction file will have a newer date (or some
> other sequential numbering), but it is not a "version" of the master
> file; instead, it must be applied to the master file to create a new
> master file.
>
> This is only one kind of update. There are also sequential datasets that
> may or may not be stand-alone. That is analogous to the issues of a
> serial publication. This may include periodic datasets like census
> information - each new census provides new information, but would we
> call a later census file a version of an earlier one?

DWBP says: "In general, multiple datasets that represent time series or
spatial series, e.g. the same kind of data for different regions or for
different years, are not considered multiple versions of the same
dataset. In this case, each dataset covers a different set of
observations about the world and should be treated as a new dataset."

This solves one aspect of the definition, which is that it treats
datasets in series as separate datasets, and versions apply to "the same
dataset" but presumably with changes that do not result in a new,
separate dataset. There remains the question of whole-part
relationships, and relationships intended to produce an updated master
file, as well as perhaps defining a concept like "stand-alone" to manage
dependencies. (And perhaps other conditions I'm not aware of.)

kc
[1] https://www.w3.org/TR/dwbp/#dataVersioning

>
> Use case 44 [3] (Identification of versioned datasets and subsets) is
> also related to this question because it addresses the part/whole
> relationship between datasets. Use case 32 [4] (Relationships between
> datasets) has elements of this question as well, although it emphasizes
> the type of derivation or part/whole relationship.
>
> It may be best to make a clear separation between versions of a dataset
> and related datasets that are not one-to-one replacements for another.
> If nothing else, our definition of versions needs to make clear what
> types of relationships are included in the declaration that one dataset
> is a version of another. This is what I mainly find to be missing.
>
> kc
> [1] https://w3c.github.io/dxwg/ucr/#ID47
> [2] https://lists.w3.org/Archives/Public/public-dxwg-wg/2017Sep/0051.html
> [3] https://w3c.github.io/dxwg/ucr/#ID44
> [4] https://w3c.github.io/dxwg/ucr/#ID32
>

--
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234 (Signal)
skype: kcoylenet/+1-510-984-3600

Received on Tuesday, 26 September 2017 20:18:38 UTC