Re: [dxwg] Dataset series (#868) from makxdekkers via GitHub on 2019-09-20 (public-dxwg-wg@w3.org from September 2019)

From: makxdekkers via GitHub <sysbot+gh@w3.org>
Date: Fri, 20 Sep 2019 17:58:56 +0000
To: public-dxwg-wg@w3.org
Message-ID: <issue_comment.created-533653087-1569002334-sysbot+gh@w3.org>

> ... Maybe this can be compensated by smart designs in a data portal that understands these relations, but that is certainly making life more complicated for them.

It seems to me that, if files with different data are modelled as distributions of one dataset, the data portal also has to be smart enough to understand the relationship between the distributions. In both cases, you need the logic to make sense of the structure.
Personally, I find a model that says that all Distributions have the same data easier to understand, and easier to program for, than a model that says that Distributions may have the same data but they also may not -- and the system needs to figure out which is which. But smart programming gets around any and all obstacles, I know!
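
To make the "easier to program for" point a bit more concrete, here is a rough Python sketch of the two situations. Everything in it is an assumption for illustration only: the `Distribution` class, its `content_id` field and both helper functions are hypothetical and not part of DCAT or of any portal's code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Distribution:
    url: str
    media_type: str
    # Hypothetical marker for which data a file holds; DCAT defines no such property.
    content_id: str


def pick_download(distributions, preferred_type):
    """All distributions hold the same data: return one, in the preferred format if present."""
    for dist in distributions:
        if dist.media_type == preferred_type:
            return [dist]
    return distributions[:1]


def pick_downloads_mixed(distributions, preferred_type):
    """Distributions may hold different data: group by content, then pick one per group."""
    groups = defaultdict(list)
    for dist in distributions:
        groups[dist.content_id].append(dist)
    chosen = []
    for same_data in groups.values():
        chosen.extend(pick_download(same_data, preferred_type))
    return chosen


# Made-up example files: two formats of the 2018 data, one format of the 2019 data.
files = [
    Distribution("https://example.org/2018.csv", "text/csv", "2018"),
    Distribution("https://example.org/2018.json", "application/json", "2018"),
    Distribution("https://example.org/2019.csv", "text/csv", "2019"),
]
```

With these example files, `pick_download(files, "application/json")` returns only the 2018 JSON file and silently misses the 2019 data, while `pick_downloads_mixed` returns one file per year. The second function only works if something like `content_id` tells the system which files belong together, which is exactly the extra logic I mean.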
 
> And if portal providers do not compensate for this, well, it is not going to make developers looking for datasets to use very happy to find an ever-growing list of interconnected datasets rather than a single dataset with a list of files. As a developer myself, I feel that the need to have the "best" information model is making things impractical and awkward, which will just scare away one (or several) of the main target groups for doing this in the first place.

I often find the argument that 'developers' can be 'scared away' by complexity a bit strange. Developers usually work for someone and have a job to deliver a product or service, and I am pretty sure that they are smart enough to build systems that work for their customers based on the data that is there. 

In my mind, it is a question of optimisation -- I think that the current DCAT model is optimised, or focused if you will, on the simpler situations, and not on complex cases (it's not the DCAT model that requires complex solutions; the complexity is in the real world). Of course, we could decide at some point that the complex cases are the majority and that the model therefore needs to be changed to optimise it for them, but we need to look at the evidence before deciding that we are in that situation. And there always needs to be a balance -- we don't want to optimise the model for time series and make life harder for people with other types of relationships between data files.


-- 
GitHub Notification of comment by makxdekkers
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/868#issuecomment-533653087 using your GitHub account

Received on Friday, 20 September 2019 17:58:57 UTC