Re: Relating versions and UC47 (Define update method)

+1  We cannot be prescriptive about what constitutes a version, nor how a
version identifier is represented.

What we can be prescriptive about is how versions are identified - i.e.
the names of the DCAT properties that refer to versions of a DCAT Dataset
description, of the dataset described by that description, and of a DCAT
Distribution.

We can also require that identifiers are lexically comparable, so that if A
is lexically > B then the version denoted by A is later than the version
denoted by B (and if A = B, the version is the same).
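
To make the lexical-comparison requirement concrete, here is a minimal
sketch. The identifiers and the is_later helper are hypothetical, not
anything defined by DCAT; the point is only that plain string ordering
works for some identifier schemes (e.g. ISO 8601 dates) and fails for
others (e.g. unpadded integers) unless the publisher pads them.

```python
# Hypothetical helper: "later" means greater under plain string ordering.
def is_later(a: str, b: str) -> bool:
    """Return True if identifier a sorts lexically after identifier b."""
    return a > b

# ISO 8601 dates sort correctly as plain strings.
iso_dates = ["2017-10-10", "2017-09-27", "2016-04-11"]
assert sorted(iso_dates) == ["2016-04-11", "2017-09-27", "2017-10-10"]

# Unpadded numbers do not: "v10" sorts before "v9" because '1' < '9'.
assert not is_later("v10", "v9")

# Zero-padding to a fixed width restores the intended order.
assert is_later("v010", "v009")
```

So a "lexically comparable" requirement would in practice also constrain
how publishers format their identifiers.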

If a version designator is a URI, it could dereference to a "model".
However, DCAT profiles could use third-party vocabularies to define
properties for such models, and DCAT core could have just a simple string
property.

We probably need special properties in DCAT to handle
"previous/next/current version" problems.

Which leaves open whether we need another special property to indicate the
type of version, and a set of defined literals for common cases.

Any statistics about change should be provided through a dereferenceable
version model, defined by the application domain.

<descends into solution space...>

IMHO it's important we have one consistent pattern for these types of
situations, where we promote some special semantics to DCAT properties but
also want DCAT classes to act as subjects for discovery of
domain-specific properties.

The pattern seems to be a combination of simple DataProperties for DCAT
core properties, and extension points using defined ObjectProperties whose
type is controlled by domain profiles. Such ObjectProperties may be
canonically defined in DCAT, or in external vocabularies also defined by
domain profiles. Do we want a simple pattern:

dcat:prop a owl:DatatypeProperty .

dcat:propLink a owl:ObjectProperty .



Rob Atkinson

On Tue, 10 Oct 2017 at 14:31 <Simon.Cox@csiro.au> wrote:

> I'm trying to not get sucked into the versioning discussion, but feel the
> need to draw attention to this work from Research Data Alliance, who two
> years ago developed guidelines on a very closely related topic - citation
> of dynamic datasets - i.e. how to identify a particular state of a dataset
> that is being continuously updated. The main link is here
>
> https://www.rd-alliance.org/group/data-citation-wg/outcomes/data-citation-recommendation.html
> and there is a longer paper here:
>
> https://www.rd-alliance.org/system/files/documents/TCDL-RDA-Guidelines_160411.pdf
>
> Seems to me that the notion of 'version' is usually a publisher's choice
> to assign a memorable identifier to a product, which may have many more
> intermediate changes from the last 'version'. Version control systems talk
> about 'tags' and 'releases' which are usually along a more-or-less
> continuous development path. Criteria for versions will vary depending on
> the application. There is no way we can be prescriptive on this, except for
> the requirement for transparency from the publisher, so perhaps the focus
> should be on a framework for enabling a publisher to describe their
> criteria, with the various concerns that apply.
>
> The key concern of the RDA work was to support the retrieval of any
> previous state (though not necessarily instantaneously).
>
> Simon
>
> -----Original Message-----
> From: Karen Coyle [mailto:kcoyle@kcoyle.net]
> Sent: Wednesday, 27 September, 2017 03:39
> To: public-dxwg-wg@w3.org
> Subject: Relating versions and UC47 (Define update method)
>
> Here's a (much) more coherent statement of something I started to say
> during the meeting yesterday but didn't have my thoughts together.
>
> I created use case 47[1] because I felt that there is an unspoken
> assumption behind the discussion of "versions" - which is that each version
> is a complete replacement for the previous one(s). That is how I read the
> statement about the version delta: "indicating the "type" of change
> (addition/removal/update of data etc.)"[2] The implied subject of that is a
> single dataset that has been changed. If that is the case, then we can use
> "version" in that way. However, there are other situations that are not
> captured by that definition but that will arise in practice.
>
> The example I gave in use case 47 is one in which there is a master
> dataset, and additions and changes to that dataset are issued in
> transaction files. A transaction file will have a newer date (or some other
> sequential numbering), but it is not a "version" of the master file;
> instead, it must be applied to the master file to create a new master file.
>
> This is only one kind of update. There are also sequential datasets that
> may or may not be stand-alone. That is analogous to the issues of a serial
> publication. This may include periodic datasets like census information -
> each new census provides new information, but would we call a later census
> file a version of an earlier one?
>
> Use case 44 [3] (Identification of versioned datasets and subsets) is also
> related to this question because it addresses the part/whole relationship
> between datasets. Use case 32 [4] (Relationships between
> datasets) has elements of this question as well, although it emphasizes
> the type of derivation or part/whole relationship.
>
> It may be best to make a clear separation between versions of a dataset
> and related datasets that are not one-to-one replacements for another.
> If nothing else, our definition of versions needs to make clear what types
> of relationships are included in the declaration that one dataset is a
> version of another. This is what I mainly find to be missing.
>
> kc
> [1] https://w3c.github.io/dxwg/ucr/#ID47
> [2] https://lists.w3.org/Archives/Public/public-dxwg-wg/2017Sep/0051.html
> [3] https://w3c.github.io/dxwg/ucr/#ID44
> [4] https://w3c.github.io/dxwg/ucr/#ID32
>
>
> --
> Karen Coyle
> kcoyle@kcoyle.net http://kcoyle.net
> m: 1-510-435-8234 (Signal)
> skype: kcoylenet/+1-510-984-3600
>
>

Received on Tuesday, 10 October 2017 04:20:05 UTC