- From: Laufer <laufer@globo.com>
- Date: Sun, 16 Nov 2014 14:52:30 -0200
- To: Eric Stephan <ericphb@gmail.com>
- Cc: DWBP WG <public-dwbp-wg@w3.org>
- Message-ID: <CA+pXJiiM6YwSFGu2zY41B7mfdHFOq5T0ojYv4P6cYjuasWAUTA@mail.gmail.com>
Ok, Eric.

Laufer

2014-11-15 13:04 GMT-02:00 Eric Stephan <ericphb@gmail.com>:

> Laufer,
>
> >> I think that this issue is divided in 3 issues:
> >> 1 - the DWBP WG definition of dataset;
> >> 2 - the DCAT definition of dataset;
> >> 3 - the mapping of other data models to DCAT's data model.
>
> Thank you for taking the time to outline your thoughts.
>
> I'm wondering, would it be helpful to resolve this issue if we just
> addressed items 1 and 2 of your concerns? In other words, how do we as a
> DWBP working group define a dataset (item 1), and is the reuse of the DCAT
> definition of a dataset sufficient (item 2)?
>
> Once we come up with the DWBP working group definition of a dataset, then
> I think it is appropriate to discuss how the DWBP dataset needs to map
> to the data models identified in the UCR (item 3).
>
> If we separate the definition of dataset and data model mapping as two
> separate issues it might help us move forward.
>
> Eric S
>
> On Fri, Nov 14, 2014 at 7:00 AM, Laufer <laufer@globo.com> wrote:
>
>> Makx,
>>
>> I agree with you that DCAT's definition is good. The problem I see is
>> whether, with this definition, DCAT could express (map) all other
>> definitions using the current DCAT data model, including the DCAT
>> definition of distribution (we must also define this term), and whether
>> our group should care if DCAT could do these mappings. As you also
>> pointed out, and I agree, the issue of inheritance is also very broad and
>> has different interpretations in different groups, and it would be
>> impossible to define the "best" inheritance schema.
>>
>> When, for example, a user uses a CKAN platform to publish data, the DCAT
>> description instance is invisible to her. The CKAN platform will be
>> responsible for generating a DCAT instance that corresponds to the
>> datasets and distributions published by the user. The same goes for other
>> publishing/distribution platforms. Could CKAN map its data model to
>> DCAT's data model?
>>
>> I think that this issue is divided in 3 issues:
>> 1 - the DWBP WG definition of dataset;
>> 2 - the DCAT definition of dataset;
>> 3 - the mapping of other data models to DCAT's data model.
>>
>> I agree that for our WG the best course would be not to enter into this
>> discussion, to assume DCAT's definition, and not to care about other
>> issues. But I don't know if we can leave this matter without stating in
>> our documents all these issues of the data on the web ecosystem. The
>> fact, for me, is that in this ecosystem we have different definitions of
>> dataset with different implementations related to these definitions.
>>
>> I think that our suggestions/recommendations of best practices should
>> influence the publishing/distribution platforms, in a way that, in some
>> sense, could create a common definition of dataset/distribution, maybe the
>> DCAT one, or an extended version.
>>
>> Best Regards,
>> Laufer
>>
>> 2014-11-14 11:18 GMT-02:00 Makx Dekkers <mail@makxdekkers.com>:
>>
>>> Ed,
>>>
>>> In my mind, there is nothing that would prevent people from using DCAT
>>> for a collection of unrelated data, and I don't think we want to tell
>>> them they can't. Also, it would depend on someone's perspective on what
>>> constitutes 'related'.
>>>
>>> Again, my position is that the definition of dataset in DCAT is good
>>> enough, and that we should not spend time in trying to make it better.
>>> (http://www.brainyquote.com/quotes/quotes/v/voltaire109643.html)
>>>
>>> Makx.
>>>
>>> > -----Original Message-----
>>> > From: Ed Staub [mailto:ed.staub@semanterra.org] On Behalf Of Ed Staub
>>> > Sent: Thursday, November 13, 2014 5:11 AM
>>> > To: public-dwbp-wg@w3.org
>>> > Subject: Re: ISSUE-80: We need a definition of "dataset"
>>> >
>>> > Note that the RDF Data Cube vocabulary has a different definition of
>>> > "dataset" than DCAT:
>>> >
>>> > "Represents a collection of observations, possibly organized into
>>> > various slices, conforming to some common dimensional structure."
>>> >
>>> > Assuming the DCAT definition is used, I think it useful to make clear
>>> > that a "common dimensional structure" is not implied. FWIW, my prior
>>> > experience led me to assume the "common dimensional structure" meaning
>>> > for DCAT until I dug into the DCAT spec.
>>> >
>>> > On the "too-broad" side, there probably are collections of data
>>> > published or curated by a single agent that are larger than is intended
>>> > by this definition. In particular, I agree with Bernadette Lóscio in
>>> > thinking that the collection's content should be related - not "a
>>> > random assortment of data". As an extreme example, imagine the entire
>>> > content of datahub.io described as a single dataset!
>>> >
>>> > So... I'd suggest adding the word "related":
>>> >
>>> > "A related collection of data, published or curated by a single agent,
>>> >    ^^^^^^^
>>> > and available for access or download in one or more formats."
>>> >
>>> > The addition of "related" deals with both concerns at once; it would
>>> > be strange and tautological to require all the data in a single cube
>>> > to be "related".
>>> >
>>> > -Ed Staub
>>>
>>
>> --
>> . . . .. . .
>> . . . ..
>> . .. .
>

--
. . . .. . .
. . . ..
. .. .
Received on Sunday, 16 November 2014 16:52:59 UTC