- From: Makx Dekkers <mail@makxdekkers.com>
- Date: Fri, 14 Aug 2015 20:55:36 +0200
- To: <public-dwbp-wg@w3.org>
Erik wrote:
> one person's model/reality is another person's data. trying to understand
> where to draw the line is a futile attempt with a long history of trying and
> failing.

So maybe the reason we have never managed to decide what we mean by 'data' is that it is not possible to define it, and therefore our attempts have been futile. Good point.

Maybe we need to look at it from a different angle. Here is what I think could be a way forward.

Someone mentioned the word 'context' in another thread, and maybe that is what we need to look at. One way of looking at context is how DCAT defines 'dataset': "A collection of data, published or curated by a single agent, and available for access or download in one or more formats". So not individual observations, sentences, numbers, but data items that belong together in some sort of 'collection'.

My proposal would be not to try to define limits related to what the data *is* or how it can be used, but just to consider the context in which the data exists or is embedded. If the context puts the data in some sensible perspective, it's in scope; if it is just bits and pieces without a clear context, it's out of scope.

Here are two examples that I imagined:

1. meteorological information

* 31 degrees Celsius is just a temperature;
* The fact that 31 degrees Celsius was the maximum temperature today in the village where I am is a piece of information.

My assumption is that this level is not what we want to be concerned with in this group. I think that we start getting interested if there is a collection of those pieces of information, for example a list of today's maximum temperatures across the whole province or country, or in a bigger context, when this is part of the list of all maximum temperatures across the country for all days of the year. As far as I understand, such lists are what DCAT would call 'datasets'.

2. legal information

* A single sentence is just that;
* A legal article with some sentences is a piece of information.

Again, not the kinds of things that we're concerned with. As soon as the articles are embedded in a complete legal act with definitions and references, the act again becomes "a collection of data, published or curated by a single agent, and available for access or download in one or more formats" (a dataset) and is therefore of interest to us.

Happy to hear people's views on this.

Makx.
Received on Friday, 14 August 2015 18:56:11 UTC