Data on the Web scope issue

From: Makx Dekkers <mail@makxdekkers.com>
Date: Thu, 13 Feb 2014 21:18:26 +0100
To: <public-dwbp-wg@w3.org>
Message-ID: <001001cf28f8$c1971c20$44c55460$@makxdekkers.com>
Dear all,


In last week's meeting we had some discussion about the scope of the
working group. The issue that I brought up was: what do we mean by
"data"? As far as I understand, the group is not a priori limited, but I
think we may want to distinguish data along several dimensions.


Here is a set of dimensions we may want to look at, and maybe choose from:




*         Base registers, e.g. addresses, vehicles, buildings;

*         Business information, e.g. patent and trademark information,
public tender databases;

*         Cultural heritage information, e.g. library, museum, archive

*         Geographic information, e.g. maps, aerial photos, geology;

*         Infrastructure information, e.g. electricity grid,
telecommunications, water supply, garbage collection;

*         Legal information, e.g. supranational (e.g. EU) and national
legislation and treaties, court decisions;

*         Meteorological information, e.g. real-time weather information
and forecasts, climate data and models;

*         Political information, e.g. parliamentary proceedings, voting
records, budget data, election results;

*         Social data, e.g. various types of statistics (economic,
employment, health, population, public administration, social); 

*         Tourism information, e.g. events, festivals and guided tours;

*         Transport information, e.g. information on traffic flows, work
on roads and public transport.




*         Data that must be provided to the public under a legal
obligation, e.g. legislation, parliamentary and local council
proceedings (dependent on specific jurisdiction);

*         Data that is a (by-)product of the public task, e.g. base
registers, crime records.




*         Data that supports democracy and transparency;

*         Data that is the basis for services to the public;

*         Data that has commercial re-use potential.




*         Authoritative, clean data, vetted and guaranteed;

*         Unverified or dirty data.


Size (ranging from small CSV files of less than a megabyte to
potentially tera- or petabytes of sensor or image data)


Type/format:


*         Text, e.g. legislation, public announcements, public

*         Image, e.g. aerial photos, satellite images;

*         Video, e.g. traffic and security cameras;

*         Tabular data, e.g. statistics, spending data, sensor data
(such as traffic, weather, air quality).


Rate of change:


*         Fixed data, e.g. laws and regulations, geography, results from
a particular census or election;

*         Low rate of change, e.g. road maps, info on buildings, climate

*         Medium rate of change, e.g. timetables, statistics;

*         High rate of change, e.g. real-time traffic flows and airplane
location, weather data.


In terms of Best Practices, the last three dimensions (size, type/format
and rate of change) may require different sets of best practices -
publishing real-time traffic flow data may require different processes
and technologies than publishing the results of a census or next year's
public budget. The other dimensions may not need different best
practices but maybe they could serve as topics in use cases?
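
To make the idea of the last three dimensions a bit more concrete, here is a rough sketch of how a dataset could be profiled along size, type/format and rate of change, so that datasets with the same profile can be grouped when discussing best practices. All names (DatasetProfile, practice_group) and the 1 GB cut-off are my own assumptions for illustration only; nothing here is an agreed model.

```python
from dataclasses import dataclass
from enum import Enum


class Format(Enum):
    TEXT = "text"
    IMAGE = "image"
    VIDEO = "video"
    TABULAR = "tabular"


class RateOfChange(Enum):
    FIXED = "fixed"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class DatasetProfile:
    """Hypothetical profile of a dataset along the three dimensions."""
    name: str
    size_bytes: int
    data_format: Format
    rate_of_change: RateOfChange


def practice_group(profile: DatasetProfile) -> tuple:
    """Return a grouping key; datasets sharing a key might share practices."""
    # Assumption: 1 GB is an arbitrary cut-off between "small" and "large".
    is_large = profile.size_bytes >= 10**9
    return (profile.data_format, profile.rate_of_change, is_large)


# Illustrative values only; the sizes are made up.
census = DatasetProfile(
    "census results", 500_000, Format.TABULAR, RateOfChange.FIXED
)
traffic = DatasetProfile(
    "real-time traffic flows", 5 * 10**12, Format.TABULAR, RateOfChange.HIGH
)

# Same format, but different rate of change and size, so different groups.
print(practice_group(census) == practice_group(traffic))
```

The two examples share a format but fall into different groups, which is the point made above: census results and real-time traffic flows may call for different publishing processes.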





Makx Dekkers

makx@makxdekkers.com

+34 639 26 11 46


Received on Thursday, 13 February 2014 20:19:01 UTC
