Exposing datasets with DCAT (partitioning, subsets, ...)

Hi all,

There has been a lot of discussion about subsetting data. I'd like to 
give a slightly different perspective, motivated purely by the point of 
view of someone who wants to publish data and, in parallel, someone who 
wants to discover and access that data without much hassle.

Of course it is hard to think about all scenarios, so I picked what I 
think are common ones:
- a bunch of static data files without any API
- an API without static data files
- both

I then looked at some specific variations in how the data is structured 
(yearly data files, daily files, or files split along another dimension, 
such as space).

It is in no way final or complete and may even be wrong, but here is 
what I came up with:
https://github.com/ec-melodies/wp02-dcat/wiki/DCAT-partitioning-ideas

In each case I start by looking at what data exists and how it is 
exposed, and based on those constraints I try to model it as DCAT 
datasets, sometimes with subdatasets. Again, this is motivated purely by 
a machine-access point of view. There may be other things to consider.
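
To make the idea concrete, here is a minimal sketch of one possible 
layout: a parent dataset with one subdataset per yearly file, plus an 
optional API distribution on the parent. It is not taken from the wiki 
page. The use of dct:hasPart to link subdatasets, the example.org URLs, 
the .csv file names and the helper function names are all assumptions 
made for illustration only.

```python
import json

# Namespace prefixes for the JSON-LD output.
DCAT_CONTEXT = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
}


def yearly_subdataset(base_url, year):
    """Describe one yearly static file as its own dcat:Dataset."""
    return {
        "@id": f"{base_url}/datasets/{year}",
        "@type": "dcat:Dataset",
        "dct:title": f"Example data for {year}",
        "dcat:distribution": [{
            "@type": "dcat:Distribution",
            # Hypothetical file location; a static file gets a downloadURL.
            "dcat:downloadURL": {"@id": f"{base_url}/files/{year}.csv"},
        }],
    }


def partitioned_dataset(base_url, years, api_url=None):
    """Build a parent dataset whose parts are the yearly subdatasets.

    Linking parts with dct:hasPart is an assumption of this sketch,
    not something DCAT prescribes for subsetting.
    """
    parent = {
        "@context": DCAT_CONTEXT,
        "@id": f"{base_url}/datasets/all",
        "@type": "dcat:Dataset",
        "dct:title": "Example data, all years",
        "dct:hasPart": [yearly_subdataset(base_url, y) for y in years],
    }
    if api_url is not None:
        # The "both" scenario: an API endpoint exposed on the parent.
        parent["dcat:distribution"] = [{
            "@type": "dcat:Distribution",
            "dcat:accessURL": {"@id": api_url},
        }]
    return parent


if __name__ == "__main__":
    doc = partitioned_dataset(
        "http://example.org", [2014, 2015], api_url="http://example.org/api"
    )
    print(json.dumps(doc, indent=2))
```

A client looking for a single year could then follow the parts of the 
parent dataset and pick the matching download URL, while a client that 
prefers the API would use the access URL on the parent.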

The point of this wiki page is to have something concrete to discuss 
rather than just abstract ideas. It should uncover problems, possible 
solutions, perspectives, etc.

Happy to hear your thoughts,
Maik

Received on Tuesday, 2 February 2016 11:03:30 UTC