- From: Rob Atkinson <rob@metalinkage.com.au>
- Date: Tue, 10 Oct 2017 22:02:55 +0000
- To: kcoyle@kcoyle.net, andrea.perego@ec.europa.eu, rob@metalinkage.com.au, Simon.Cox@csiro.au
- Cc: public-dxwg-wg@w3.org, makx@makxdekkers.com
- Message-ID: <CACfF9LzD3QZGmpM6ZaY2wcGwZtWFp0ufgadM6YJmb5n+ACaQQQ@mail.gmail.com>
Another way to look at versioning is via profiles, i.e. a profile can define a versioning approach for the profile. Of course, there are many domains that might want to share a versioning approach.

Given these requirements, we can dive into the potential solution space for the purposes of comparison with existing requirements for profiles:

So if we have profiles being basically constraint sets, and each DCAT profile potentially inheriting many such constraint sets, then either we declare all the constraints per DCAT object (painful) or we have transitive inheritance (declare a profile which in turn declares its inherited constraints).

These approaches are not mutually exclusive, but I think if we meet the requirement that a profile is available in a fully expressed form (all transitive properties declared), then dereferencing a profile identifier should be enough to understand what type of versioning is being used.

So, profiles can define a smallish number of common versioning models and re-use these in different application domains.

So I think we have the "rich model" of versioning delegated to domains covered, and we can focus on those few simple cases. Having a requirement for lexical ordering and identity comparison of version identifiers might be one implication.

Rob

On Wed, 11 Oct 2017 at 02:48 Karen Coyle <kcoyle@kcoyle.net> wrote:

> All, thanks for this discussion. It is very useful. The DCAT-AP link is
> well worth a look. We haven't discussed what documentation might
> accompany DCAT 1.1 yet, but perhaps it would be helpful to consider
> that. It would give us a "parking lot" where we could place complex
> questions that we are stumbling on. Further on we could look at "parked"
> issues and see if they have a place in one of our deliverables or need
> some separate documentation.
>
> It appears that although "versioning" was the obvious lack in DCAT 1.0,
> it also is a particularly difficult space.
>
> 1. Does it make sense to continue this work now, or should we tackle
> some other requirements?
>
> 2. Would a small "version sub-group" help? The latter could come back
> to the full group with a proposal. Does anyone want to volunteer for that?
>
> kc
>
> On 10/10/17 1:14 AM, andrea.perego@ec.europa.eu wrote:
> > I also agree with Simon and Rob that we cannot be prescriptive about
> > what a "version" is and how it is identified.
> >
> > Restating Simon's point, I think we are dealing with a notion – as the
> > one of "dataset" – which is used with different meanings by different
> > communities - and they know exactly what a "version" is. Moreover, what
> > a "version" is is also very much related to the data management policy /
> > workflow in place. And this affects how different versions of a dataset
> > are modelled.
> >
> > It might be useful to have a look at the discussion on this topic
> > carried out in the DCAT-AP WG, which highlighted quite a few different
> > perspectives – and coming up with an agreement turned out to be quite
> > problematic. This issue was further discussed during the work on the
> > implementation guidelines of DCAT-AP, and the result was not to define
> > what is or is not a version, but rather an explanation of different
> > possible ways of modelling it, based on implementation evidence. The
> > summary is available here:
> >
> > https://joinup.ec.europa.eu/release/dcat-ap-how-model-dataset-series
> >
> > As you can read there, we have examples where different versions of a
> > dataset are modelled with distributions, or as different datasets in a
> > series, possibly in combination with a statement saying which is the
> > previous / next version (by using dct:hasVersion / dct:isVersionOf,
> > respectively).
> > And we also have to consider cases when datasets are
> > updated (on a regular or irregular basis) but the old versions are not
> > maintained (this frequently happens, e.g., for datasets updated daily).
> >
> > I think the lesson learnt in DCAT-AP is that what users are looking for
> > is:
> >
> > 1. Having guidance on how to model dataset versions (i.e., with
> > different datasets, different distributions, etc.), based on evidence
> > from similar use cases / domains. This requirement mainly applies to
> > communities where the notion of dataset "version" is not established /
> > clearly defined.
> >
> > 2. Having clear information on which are the relevant terms (classes,
> > properties) in DCAT, and on how to use them. This requirement applies to
> > all users.
> >
> > About point (2), I take this opportunity to add a note here, including
> > some terms that I'm not sure have been mentioned so far in our
> > discussion:
> >
> > - dct:modified [1] and dct:accrualPeriodicity [2]: These properties
> > provide implicit information about a dataset version – especially when
> > combined with the issue and/or creation date –, that can be used also
> > when old versions are not maintained.
> >
> > - About the issue raised by Rob about previous/next/current version,
> > dct:hasVersion [3] and dct:isVersionOf [4] are actually meant to model
> > exactly previous / next versions. Moreover, there are also adms:prev [5]
> > and adms:next [6], plus adms:last [7] for the latest version (@Rob, I'm
> > not sure if with "current" version you actually mean this).
> >
> > Cheers,
> >
> > Andrea
> >
> > ----
> > [1] http://dublincore.org/documents/dcmi-terms/#terms-modified
> > [2] http://dublincore.org/documents/dcmi-terms/#terms-accrualPeriodicity
> > [3] http://dublincore.org/documents/dcmi-terms/#terms-hasVersion
> > [4] http://dublincore.org/documents/dcmi-terms/#terms-isVersionOf
> > [5] https://www.w3.org/TR/vocab-adms/#adms-prev
> > [6] https://www.w3.org/TR/vocab-adms/#adms-next
> > [7] https://www.w3.org/TR/vocab-adms/#adms-last
> >
> > ----
> > Andrea Perego, Ph.D.
> > Scientific / Technical Project Officer
> > European Commission DG JRC
> > Directorate B - Growth and Innovation
> > Unit B6 - Digital Economy
> > Via E. Fermi, 2749 - TP 262
> > 21027 Ispra VA, Italy
> >
> > https://ec.europa.eu/jrc/
> >
> > ----
> > The views expressed are purely those of the writer and may
> > not in any circumstances be regarded as stating an official
> > position of the European Commission.
> >
> > *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> > *Sent:* Tuesday, October 10, 2017 6:19 AM
> > *To:* Simon.Cox@csiro.au; kcoyle@kcoyle.net; public-dxwg-wg@w3.org
> > *Subject:* Re: Relating versions and UC47 (Define update method)
> >
> > +1 We cannot be prescriptive about what constitutes a version, nor how
> > a version identifier is represented.
> >
> > What we can be prescriptive about is how versions are identified - i.e.
> > the names of the DCAT properties that refer to versions of a DCAT Dataset
> > description, the dataset described by this description and versions of a
> > DCAT Distribution.
> >
> > We can also require that identifiers are lexically comparable, so that
> > if A is lexically > B then the version denoted by A is later than the
> > version denoted by B.
> > (and if A = B then the version is the same)
> >
> > If a version designator is a URI, it could dereference to a "model" -
> > however DCAT profiles could use third party vocabularies to define
> > properties for such models, and have a simple string property in DCAT
> > core.
> >
> > We probably need special properties in DCAT to handle
> > "previous/next/current version" problems.
> >
> > Which leaves open whether we need another special property to indicate
> > the type of version, and a set of defined literals for common cases.
> >
> > Any statistics about change should be through a dereferenceable version
> > model, defined by the application domain.
> >
> > <descends into solution space...>
> >
> > IMHO it's important we have one consistent pattern for these types of
> > situations where we promote some special semantics to DCAT properties,
> > but also want to use DCAT classes to act as subjects for discovery of
> > domain-specific properties.
> >
> > The pattern seems to be a combination of simple DataProperties for DCAT
> > core properties, and extension points using defined ObjectProperties
> > whose type is controlled by domain profiles. Such ObjectProperties may
> > be canonically defined in DCAT, or in external vocabularies also defined
> > by domain profiles. Do we want a simple pattern:
> >
> > dcat:prop a owl:DatatypeProperty
> >
> > dcat:propLink a owl:ObjectProperty
> >
> > Rob Atkinson
> >
> > On Tue, 10 Oct 2017 at 14:31 <Simon.Cox@csiro.au> wrote:
> >
> >     I'm trying to not get sucked into the versioning discussion, but
> >     feel the need to draw attention to this work from Research Data
> >     Alliance, who two years ago developed guidelines on a very closely
> >     related topic - citation of dynamic datasets - i.e. how to identify
> >     a particular state of a dataset that is being continuously updated.
> >     The main link is here
> >     https://www.rd-alliance.org/group/data-citation-wg/outcomes/data-citation-recommendation.html
> >     and there is a longer paper here:
> >     https://www.rd-alliance.org/system/files/documents/TCDL-RDA-Guidelines_160411.pdf
> >
> >     Seems to me that the notion of 'version' is usually a publisher's
> >     choice to assign a memorable identifier to a product, which may have
> >     many more intermediate changes from the last 'version'. Version
> >     control systems talk about 'tags' and 'releases' which are usually
> >     along a more-or-less continuous development path. Criteria for
> >     versions will vary depending on the application. There is no way we
> >     can be prescriptive on this, except for the requirement for
> >     transparency from the publisher, so perhaps the focus should be on a
> >     framework for enabling a publisher to describe their criteria, with
> >     the various concerns that apply.
> >
> >     The key concern of the RDA work was to support the retrieval of any
> >     previous state (though not necessarily instantaneously).
> >
> >     Simon
> >
> >     -----Original Message-----
> >     From: Karen Coyle [mailto:kcoyle@kcoyle.net]
> >     Sent: Wednesday, 27 September, 2017 03:39
> >     To: public-dxwg-wg@w3.org
> >     Subject: Relating versions and UC47 (Define update method)
> >
> >     Here's a (much) more coherent statement of something I started to
> >     say during the meeting yesterday but didn't have my thoughts
> >     together.
> >
> >     I created use case 47 [1] because I felt that there is an unspoken
> >     assumption behind the discussion of "versions" - which is that each
> >     version is a complete replacement for the previous one(s). That is
> >     how I read the statement about the version delta: "indicating the
> >     "type" of change (addition/removal/update of data etc.)" [2]. The
> >     implied subject of that is a single dataset that has been changed.
> >     If that is the case, then we can use "version" in that way.
> >     However, there are other situations that are not captured by that
> >     definition but that will arise in practice.
> >
> >     The example I gave in use case 47 is one in which there is a master
> >     dataset, and additions and changes to that dataset are issued
> >     in transaction files. A transaction file will have a newer date (or
> >     some other sequential numbering), but it is not a "version" of the
> >     master file; instead, it must be applied to the master file to
> >     create a new master file.
> >
> >     This is only one kind of update. There are also sequential datasets
> >     that may or may not be stand-alone. That is analogous to the issues
> >     of a serial publication. This may include periodic datasets like
> >     census information - each new census provides new information, but
> >     would we call a later census file a version of an earlier one?
> >
> >     Use case 44 [3] (Identification of versioned datasets and subsets)
> >     is also related to this question because it addresses the part/whole
> >     relationship between datasets. Use case 32 [4] (Relationships between
> >     datasets) has elements of this question as well, although it
> >     emphasizes the type of derivation or part/whole relationship.
> >
> >     It may be best to make a clear separation between versions of a
> >     dataset and related datasets that are not one-to-one replacements
> >     for another.
> >     If nothing else, our definition of versions needs to make clear what
> >     types of relationships are included in the declaration that one
> >     dataset is a version of another. This is what I mainly find to be
> >     missing.
> >
> >     kc
> >     [1] https://w3c.github.io/dxwg/ucr/#ID47
> >     [2] https://lists.w3.org/Archives/Public/public-dxwg-wg/2017Sep/0051.html
> >     [3] https://w3c.github.io/dxwg/ucr/#ID44
> >     [4] https://w3c.github.io/dxwg/ucr/#ID32
> >
> >     --
> >     Karen Coyle
> >     kcoyle@kcoyle.net http://kcoyle.net
> >     m: 1-510-435-8234 (Signal)
> >     skype: kcoylenet/+1-510-984-3600
> >
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234 (Signal)
> skype: kcoylenet/+1-510-984-3600
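
To make the "lexical ordering and identity comparison" idea from this thread concrete, here is a minimal Python sketch. It assumes version identifiers are plain strings; the helper names (is_later, is_same, pad_dotted, latest) are hypothetical and not part of DCAT or of any proposal made here. It also shows one pitfall: dotted numeric identifiers only sort correctly as strings if their components are zero-padded.

```python
from typing import List


def is_later(a: str, b: str) -> bool:
    """True if identifier `a` sorts after `b` under plain string comparison."""
    return a > b


def is_same(a: str, b: str) -> bool:
    """Identity comparison: two identifiers denote the same version."""
    return a == b


def pad_dotted(version: str, width: int = 4) -> str:
    """Zero-pad each dot-separated component (hypothetical convention)
    so that string order matches numeric order."""
    return ".".join(part.zfill(width) for part in version.split("."))


def latest(identifiers: List[str]) -> str:
    """Pick the lexically greatest identifier from a non-empty list."""
    return max(identifiers)


# Fixed-width ISO 8601 dates compare correctly as strings.
assert is_later("2017-10-10", "2017-09-27")
assert is_same("2017-10-10", "2017-10-10")

# Unpadded dotted numbers do not: "1.10" sorts before "1.9".
assert not is_later("1.10", "1.9")

# Padding each component restores the intended order.
assert is_later(pad_dotted("1.10"), pad_dotted("1.9"))

assert latest(["2017-09-27", "2017-10-10", "2017-10-09"]) == "2017-10-10"
```

So a lexical-comparability requirement would in practice also constrain the identifier format (fixed-width dates or padded numbers, for example), which is something a profile could state.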
Received on Tuesday, 10 October 2017 22:04:23 UTC