RE: Data on the Web scope issue

Eric,

 

I like your additions. My only concern is that "Scientific
information" should be an addition to, rather than a replacement for,
"Meteorological information". Two reasons for that:

 

1.       Audience: scientific information in general is mostly by and
for scientists (e.g. the datasets that scientific publications are based
on, which other scientists could re-use to check conclusions or to
build on for further research), while meteorological information is
usually for a wider audience (professional users, like airport
operators, and the wider public);

 

2.       Re-use potential: scientific information is mostly re-used in
the scientific community, while meteorological information has
substantial commercial re-use potential (and is already the basis for a
commercial provider market).

 

Of course, these differences are not black and white; there is overlap
between audiences and re-uses, but I think there is sufficient difference
to keep them both.

 

Makx.

 

 

 

From: Eric Stephan [mailto:ericphb@gmail.com] 
Sent: Friday, February 14, 2014 3:35 AM
To: Makx Dekkers
Cc: Public DWBP WG
Subject: Re: Data on the Web scope issue

 

Makx,

 

I remember this question coming up and overall I really like this
breakdown.  Here are some suggested changes.

 

1) ORIGINAL:

*         Meteorological information, e.g. real-time weather information
and forecasts, climate data and models;

 

SUGGESTED REPLACEMENT:

*         Scientific information, e.g. observational, remote sensing,
instrument, informatics, visualization, analytics, and simulation   

 

 

2) SUGGESTED ADD TO QUALITY

Quality:

*         Corrected, Curated

*         Context (quality important to some groups and not others?)

 

3) SUGGESTED ADD TO RATE OF CHANGE:

Rate of change:

E.g.

*         Sensor streams (this is a catch-all)

 

4) ADD: Data lifespan:

*         Forever

*         Data expiration  

 

Eric

 

On Thu, Feb 13, 2014 at 12:18 PM, Makx Dekkers <mail@makxdekkers.com> wrote:

Dear all,

 

In last week's meeting we had some discussion about the scope of the
working group. The issue that I brought up was: what do we mean by
"data"? As far as I understand, the group's scope is not limited a priori,
but I think we may want to think about distinguishing data along several
dimensions.

 

Here is a set of dimensions we may want to look at, and maybe choose from:

 

Domains:

E.g.

*         Base registers, e.g. addresses, vehicles, buildings;

*         Business information, e.g. patent and trademark information,
public tender databases;

*         Cultural heritage information, e.g. library, museum, archive
collections;

*         Geographic information, e.g. maps, aerial photos, geology;

*         Infrastructure information, e.g. electricity grid,
telecommunications, water supply, garbage collection;

*         Legal information, e.g. supranational (e.g. EU) and national
legislation and treaties, court decisions;

*         Meteorological information, e.g. real-time weather information
and forecasts, climate data and models;

*         Political information, e.g. parliamentary proceedings, voting
records, budget data, election results;

*         Social data, e.g. various types of statistics (economic,
employment, health, population, public administration, social); 

*         Tourism information, e.g. events, festivals and guided tours;

*         Transport information, e.g. information on traffic flows, work
on roads and public transport.

 

Obligation:

E.g.

*         Data that must be provided to the public under a legal
obligation, e.g. legislation, parliamentary and local council
proceedings (dependent on specific jurisdiction);

*         Data that is a (by-)product of the public task, e.g. base
registers, crime records.

 

Usage:

E.g.

*         Data that supports democracy and transparency;

*         Data that is the basis for services to the public;

*         Data that has commercial re-use potential.

 

Quality:

E.g.

*         Authoritative, clean data, vetted and guaranteed;

*         Unverified or dirty data.

 

Size (ranging from small CSV files of less than a megabyte to
potentially tera- or petabytes of sensor or image data)

 

Type/format:

E.g.

*         Text, e.g. legislation, public announcements, public
procurement;

*         Image, e.g. aerial photos, satellite images;

*         Video, e.g. traffic and security cameras;

*         Tabular data, e.g. statistics, spending data, sensor data
(such as traffic, weather, air quality);

*         Data streams, e.g. statistics, spending data, sensor
data (such as traffic, weather, air quality).

 

 

Rate of change:

E.g.

*         Fixed data, e.g. laws and regulations, geography, results from
a particular census or election;

*         Low rate of change, e.g. road maps, info on buildings, climate
data;

*         Medium rate of change, e.g. timetables, statistics;

*         High rate of change, e.g. real-time traffic flows and airplane
location, weather data.

 

 

In terms of Best Practices, the last three dimensions (size, type/format
and rate of change) may require different sets of best practices -
publishing real-time traffic flow data may require different processes
and technologies than publishing the results of a census or next year's
public budget. The other dimensions may not need different best
practices but maybe they could serve as topics in use cases?

 

Makx.

 

 


Makx Dekkers

makx@makxdekkers.com

+34 639 26 11 46

 

 

 

Received on Friday, 14 February 2014 10:07:56 UTC