Re: Question for DCAT "experts"

Hi there,

A colleague of mine, Marvin Frommhold, is researching versioning in the
context of RDF and Linked Data. He contributes the following points:

The following two documents provide a basic introduction to versioning
of datasets:

  * Papakonstantinou, Vassilis et al. “Versioning for Linked Data:
    Archiving Systems and Benchmarks.” BLINK@ISWC, 2016.
      o Section 2 of this paper gives an introduction to different
        archiving strategies.
  * Gray, Alasdair J. G. et al. “Dataset Descriptions: HCLS Community
    Profile.” W3C Interest Group Note, May 2015.
      o A W3C Interest Group Note that, among other things, discusses
        requirements for dataset versioning.
      o "The Data Catalog Vocabulary (DCAT) [DCAT] is used to describe
        datasets in catalogs, but does not deal with the issue of
        dataset evolution and versioning."

He agrees that change sets are related to versioning, in that a version
can be described as a set of changes relative to its predecessor. Fully
realized, this allows very granular tracking of dataset evolution.
Makx's point is important here: such changes are granular descriptions
of the evolving content of a dataset, whereas DCAT so far does little
to describe the data itself. If DCAT started to describe the content
and structure of the data, this would be a considerable expansion of
its scope.
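
To make the "version as a set of changes" idea concrete, here is a
minimal sketch in Python. It is not taken from DCAT or from either of
the documents cited above; the names ChangeSet, diff and apply_changes
and the example triples are hypothetical, and a dataset is simplified
to a set of (subject, predicate, object) tuples.

```python
from dataclasses import dataclass

# A triple is a plain (subject, predicate, object) tuple of strings.
# All names and example data below are hypothetical.


@dataclass(frozen=True)
class ChangeSet:
    added: frozenset    # triples present only in the newer version
    removed: frozenset  # triples present only in the older version


def diff(old, new):
    """Compute the change set that turns `old` into `new`."""
    old, new = frozenset(old), frozenset(new)
    return ChangeSet(added=new - old, removed=old - new)


def apply_changes(version, change):
    """Derive the next version from a version and a change set."""
    return (frozenset(version) - change.removed) | change.added


v1 = {
    ("ex:book1", "dct:title", "Moby Dick"),
    ("ex:book2", "dct:title", "Walden"),
}
v2 = {
    ("ex:book1", "dct:title", "Moby-Dick"),  # title corrected
    ("ex:book2", "dct:title", "Walden"),
    ("ex:book3", "dct:title", "Emma"),       # record added
}

delta = diff(v1, v2)

# Replaying the change set on the old version reproduces the new one.
assert apply_changes(v1, delta) == frozenset(v2)
```

In this simplification a changed record shows up as one removal plus
one addition. Nothing of this is visible at the level of a
dcat:Dataset description, which is the scope question raised above.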

The question of whether a set of changes constitutes a new dataset, or
whether a whole database is a dataset, is a complicated one for me,
because I understand instances of dcat:Dataset as conceptual
descriptions of datasets, largely independent of the structure of the
underlying data. In that sense, a database, or a web service regardless
of the query sent to it, can also be a dataset. Limiting the data
retrieved from it by some API call or SQL query could then create a new
dataset that is fully contained in the first one.


On 22/06/17 at 11:00, Makx Dekkers wrote:
> Yes, I agree it is. Updating 'in place' is a case where the publisher decides that a change does not create a new Dataset. 
> I find Karen's suggestion to treat a 'database' as a 'dataset' interesting -- I have always thought of a database as closer to a dcat:Catalog.
> Makx.
> isn't a change set (like a diff) just a special case of versioning?
> As far as I remember from the initial work on DCAT, a Dataset is considered to be a kind of blob. Nothing is said about what goes on 'inside' a Dataset. The only thing you see on the outside is the modification date but you don't know what has changed inside. 
> Makx
> Many of you know DCAT quite well, and I'm new to it, so I'm taking the lazy way and directing this as a question to you.
> I see in DCAT that there are properties that define frequency and update dates. The update date is
> "Most recent date on which the dataset was changed, updated or modified."
> The library world has a number of databases that are updated "in place".
> For anyone receiving updates, the updates do not include the entire file, only those records added, changed, or deleted since some set time.
> Is this covered by DCAT? If not, I will add a use case and we can discuss.
