Re: [dxwg] Dataset series (#868) from matthiaspalmer via GitHub on 2019-09-20 (public-dxwg-wg@w3.org from September 2019)

From: matthiaspalmer via GitHub <sysbot+gh@w3.org>
Date: Fri, 20 Sep 2019 15:55:50 +0000
To: public-dxwg-wg@w3.org
Message-ID: <issue_comment.created-533612114-1568994948-sysbot+gh@w3.org>

@kcoyle yes, I agree with the mirror case. To distinguish from that case you would need to provide additional triples like I described; alternatively, you could just provide some information in the dct:description of the distribution. Not perfect, but it works as long as we have humans looking at the metadata.

@jakubklimek well, as I see it, if I have to choose between:
1. an approach that is close to the RDF information model and requires just a slight comment on the current model of DCAT and only a few extra triples, OR
2. creating separate datasets even though I do not consider them to be separate, maybe creating a top-level "abstract dataset" to connect them all, using a reification construction in DCAT to create relations between them, duplicating certain metadata to make them findable, and overall creating a lot more triples.

I would go for option 1 any day, and I am sure a lot of other people in the linked data world would argue the same way.
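
To make the difference in size concrete, here is a rough sketch of the two options as plain Python tuples for a ledger with three monthly files. Everything in it is an assumption for illustration: the example.org URIs, the `memberOf` role, the choice of dct:title as the "extra triple" on each distribution in option 1, and dcat:qualifiedRelation / dcat:Relationship as the reification construction in option 2. It is not the exact modelling proposed in this thread.

```python
# Hypothetical URIs and property choices, for illustration only.
EX = "https://example.org/"
MONTHS = [f"2019-{m:02d}" for m in range(1, 4)]  # three monthly files


def option_1(months):
    """One dcat:Dataset with one dcat:Distribution per monthly file."""
    ds = EX + "ledger"
    triples = [
        (ds, "rdf:type", "dcat:Dataset"),
        (ds, "dct:title", "Suppliers' ledger"),
        (ds, "dct:publisher", EX + "org"),
    ]
    for m in months:
        dist = f"{ds}/dist/{m}"
        triples += [
            (ds, "dcat:distribution", dist),
            (dist, "rdf:type", "dcat:Distribution"),
            # Assumed "extra triple" that tells the files apart.
            (dist, "dct:title", f"Suppliers' ledger {m}"),
            (dist, "dcat:downloadURL", f"{EX}files/ledger-{m}.csv"),
        ]
    return triples


def option_2(months):
    """One dcat:Dataset per month, tied to an abstract parent dataset."""
    parent = EX + "ledger"
    triples = [
        (parent, "rdf:type", "dcat:Dataset"),
        (parent, "dct:title", "Suppliers' ledger"),
        (parent, "dct:publisher", EX + "org"),
    ]
    for m in months:
        ds = f"{parent}/{m}"
        dist = f"{ds}/dist"
        rel = f"{ds}/rel"
        triples += [
            (ds, "rdf:type", "dcat:Dataset"),
            (ds, "dct:title", f"Suppliers' ledger {m}"),
            (ds, "dct:publisher", EX + "org"),  # duplicated to stay findable
            # Qualified (reified) relation from the monthly dataset to the parent.
            (ds, "dcat:qualifiedRelation", rel),
            (rel, "rdf:type", "dcat:Relationship"),
            (rel, "dct:relation", parent),
            (rel, "dcat:hadRole", EX + "roles/memberOf"),  # hypothetical role
            (ds, "dcat:distribution", dist),
            (dist, "rdf:type", "dcat:Distribution"),
            (dist, "dcat:downloadURL", f"{EX}files/ledger-{m}.csv"),
        ]
    return triples


def count_datasets(triples):
    """Number of resources typed as dcat:Dataset."""
    return sum(1 for _, p, o in triples if p == "rdf:type" and o == "dcat:Dataset")


print(len(option_1(MONTHS)), len(option_2(MONTHS)))  # 15 33
```

Under these assumptions option 1 grows by four triples per new file and stays a single dataset, while option 2 grows by ten triples and one more dataset per file. The exact numbers depend entirely on which metadata you choose to repeat.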

I cannot resist throwing another log on the bonfire here:
The data model is one thing, but I think it is important to consider the perspective of providing a good user experience in a data portal. For instance, we are helping a few organizations that have suppliers' ledgers (and other datasets) where additional files are added every month. If many datasets are each realized as 30+ datasets (growing all the time), how would you find a nice overview or a starting point when you browse or search? Maybe this can be compensated for by smart design in a data portal that understands these relations, but that certainly makes life more complicated for the portal providers.

And if portal providers do not compensate for this, well, it is not going to make developers looking for datasets to use very happy to find an ever-growing list of interconnected datasets rather than a single dataset with a list of files. As a developer myself, I feel that the need to have the "best" information model is making things impractical and awkward, in a way that will just scare away one (or several) of the main target groups for doing this in the first place.

--
GitHub Notification of comment by matthiaspalmer
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/868#issuecomment-533612114 using your GitHub account

Received on Friday, 20 September 2019 15:55:51 UTC