RE: ISSUE-80: We need a definition of "dataset"

From: Makx Dekkers <mail@makxdekkers.com>
Date: Fri, 14 Nov 2014 14:18:15 +0100
To: "'Ed Staub'" <estaub2@comcast.net>, <public-dwbp-wg@w3.org>
Message-ID: <001301d0000d$73b15240$5b13f6c0$@makxdekkers.com>

In my mind, there is nothing that would prevent people from using DCAT
for a collection of unrelated data, and I don't think we want to tell
them they can't. Also, it would depend on someone's perspective on what
constitutes 'related'.

Again, my position is that the definition of dataset in DCAT is good
enough, and that we should not spend time trying to make it better.


> -----Original Message-----
> From: Ed Staub [mailto:ed.staub@semanterra.org] On Behalf Of Ed Staub
> Sent: Thursday, November 13, 2014 5:11 AM
> To: public-dwbp-wg@w3.org
> Subject: Re: ISSUE-80: We need a definition of "dataset"
> Note that the RDF Data Cube vocabulary has a different definition of
> "dataset" than DCAT:
>
> "Represents a collection of observations, possibly organized into various
> slices, conforming to some common dimensional structure."
>
> Assuming the DCAT definition is used, I think it useful to make clear that a
> "common dimensional structure" is not implied.  FWIW, my prior experience
> led me to assume the "common dimensional structure" meaning for DCAT until I
> dug into the DCAT spec.
>
> On the "too-broad" side, there probably are collections of data published or
> curated by a single agent that are larger than is intended by this
> definition.  In particular, I agree with Bernadette Lóscio in thinking that
> the collection's content should be related - not "a random assortment of
> data".  As an extreme example, imagine the entire content of datahub.io
> described as a single dataset!
>
> So... I'd suggest adding the word "related":
> "A related collection of data, published or curated by a single agent,
>    ^^^^^^^
> and available for access or download in one or more formats."
>
> The addition of "related" deals with both concerns at once; it would be
> strange and tautological to require all the data in a single cube to be
> "related".
>
> -Ed Staub
Received on Friday, 14 November 2014 13:18:46 UTC