
Publishing Open Government Data - Response to First Public Working Draft

From: Chris Beer <chris-beer@grapevine.net.au>
Date: Tue, 20 Oct 2009 23:37:00 +1100
Message-ID: <4ADDAEEC.5080703@grapevine.net.au>
To: W3C e-Gov IG <public-egov-ig@w3.org>
Hi all

Well, I've read over this a few times now, and here's the beginning of a 
response from me - some suggestions, maybe some edits. However, I'm not 
comfortable at this point in my very early e-Gov IG "life" in telling 
people exactly what to write, let alone making arbitrary changes to the 
wiki - this isn't wikipedia after all ;-) . I know how I'd do it at any 
given time, but most of you here are far more learned than I at this 
game. So I'll stick to suggestions for the moment unless I'm told to go 
open slather :) So here we go (You'll get used to how I write to the 
list by the way - I write how I talk.)

I have an initial question regarding the intended audience of this 
document. On the wiki version, I note it states:

*Data.gov.* Memo
*From: W3C eGovernment Interest Group <http://www.w3.org/2007/eGov/IG/>
To: Any government wishing to set-up data.gov.*

This combined with the abstract tells me the current intention of the 
document is as a pitch to Government. But that's about it. WHO in 
Government is my question. Are we talking CIO level? Are we talking 
policy advisors without a technical background who have been told to 
find out about possibilities? Are we talking about senior public 
servants with an intrinsic fear of all that is new? Cutting edge young 
bucks fresh out of university, eager to change the world into 2.0 e-Gov 
paradise? Journalists with no IT background writing a story?

This document is essentially a pitch. /"To help governments open and 
share their data, the W3C eGov Interest Group has developed the 
following guidelines. These straightforward steps emphasize standards 
and methodologies to encourage publication of government data, allowing 
the public to use this data in new and innovative ways."/ These 
guidelines need to be concise yet unambiguous, they need to be 
understandable by any level of ICT/IT competence, and they need to 
promote the IG as a source of reliable information on the matter.

Case in point - if we are to refer to our trusty wikipedia for want of a 
better source, only around 18% of the world speaks English. Only 46% of 
the EU speaks English, either natively or otherwise. This in itself 
invites us to write our document at a basic 8 year old's reading level, 
just as most print media is written. And not every country has ICT/IT 
use at the level of the US, Australia, Japan or Western Europe. 
However, of the roughly 200 national Governments, and myriad of State and 
Local ones, not to mention individual agencies, it is the developing and 
third world Nations with low rates of ICT/IT use, that conversely have 
the least amount of red tape, and thus obstacles on a policy level, to 
outrightly implementing guidelines we make here. They also often have a 
limited budget, and open source, open standard solutions on the issue at 
hand are likely to be appealing to them.

So let's start with the big one - Government Data. What does that mean? 
There is no entry against "Data" in the glossary (only entries that 
assume a knowledge of how you are defining "data"). The abstract points 
to data.gov, and from that link and a bit more reading you might start 
to learn that we're talking about datasets, statistics etc, not "data" 
as in non-computing terms - "information on a computer". I think a 
little spiel on what exactly is meant by Government Data could go a long 
way. The document throughout seems to jump from Data to Dataset to 
Information, often seeming to use the terms interchangeably.

Once we've covered that definition off, we move onto the meat of the 
document. Rule number one should of course be acronyms for any technical 
document like this. Always write a long hand version, then the acronym 
in brackets, after which we can use the acronym all we like. Are we to 
assume that Mr. Senior Public servant knows what XML or CSV is? Do we 
run the risk of losing his engagement in the document in the first 
paragraph? Inline links to definitions wouldn't be bad either (eg: link 
XML to http://en.wikipedia.org/wiki/XML rather than the W3C Rec. Those 
who know what XML is won't need the link. Those who don't will want a 
simple explanation.) I know this seems simplistic, but I'd point to the 
charter, noob that I am: Education and Outreach.

It's in the meat of the document that the Wiki version (which is meant to 
be the most up to date version?) and the http://www.w3.org/TR/gov-data/ 
version really diverge. In a sense the wiki version is far more open and 
comfortable in its language than the http://www.w3.org/TR/gov-data/ 
version, but with less useful detail.

The problem I see with the initial claim of "Open Data Gives Real Power 
to the People" or "The real power comes when you put your data on the 
web" is that it raises the question in the mind of Government of "who 
does it give power to?"

Democratic Government by nature is transient - any one regime or party 
is always at the mercy of public opinion and the spin doctoring of the 
opposition. While it would be nice to think that only academia and 
research areas of other agencies are going to take datasets and create 
mashups or conduct analysis for the good of all, Governments are always 
going to be wary of releasing raw data that could be manipulated in such 
a way as to cast them or their policies in an unfavourable light. State 
Governments will be wary of showing data that results in a decrease in 
funding. The old paradigm and mindsets of how Government in a Western 
nation operates still apply (ie: spend all the money, and give us some 
reasonable justification for it and you'll get more. Don't spend it all, 
and you obviously didn't need that much - next time you'll get less. 
Answer to problem: spend any extra money on things we don't really need - 
like expensive ad campaigns, or consultants who charge by the hour, or 
fact finding missions overseas etc). Jaded? Perhaps. Realistic? I think 
there 
is a degree of reality to that point of view.

So I think we need to do one of two things here:

a) Keep the existing format of the document, but we really expand this 
area - Why should Government make its Data open? And I mean that 
literally - Government data could just as easily be published using 
these guidelines and open standards, but with controlled access. 
Explain why Open Data is the way forward, or at the very least give case 
studies where it has resulted in long lasting benefit to Government, not 
just the people.

b) Make this document a second stage one: Remove comments with 
unreferenced bias such as "Open Data Gives Real Power to the People" and 
lead the entire document in with something like "So you've decided to 
make the move to publishing your Government Data openly - What now?" - 
that is, assume that those reading the document have already made the 
decision, or (hint?) read another IG document that covers that topic 
(The whys around publishing Government Data openly) that then would 
logically lead on to this one.

"Choosing What Data To Publish"

I have to return to my opening comments about defining "Data" here. Are 
we only talking Datasets aka Data (including "static" like Laws etc), 
not Information (mission, vision, value, goal, and objective statement 
are technically Information aka "Web Content")? For the document to 
really flow well, we need to make this clear, and make it clear that 
there is a difference between what is real (measurable data - "The 
Organisation has x employees, y locations and z budget this year spent 
on a, b, and c") and what is transient (the Organisation under the d 
Government has this as its current mission, vision etc) and open to 
interpretation, as opposed to analysis. If we are talking Information, 
then the opening statement around "the easiest way to make data 
available on the Internet" should really read something akin to "copy 
and paste a txt file onto a web page using FTP". Hell, a bit more work in 
a WYSIWYG editor, pass it through an HTML validator, and it counts 
as well structured - It has <html> tags, the <head> tags, then <body> 
tags in its simplest form. The second we are in the realm of Info 
instead of Data, we're out of the "Web as fileserver" concept. From that 
point on it's all marketing, UX/UI etc.
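
To make the "simplest form" point concrete, here is a rough sketch of how 
little it takes to turn a txt file's contents into a page with html, head 
and body tags. The function name wrap_text_as_html and the sample text are 
made up for illustration - this is not taken from the draft or from any 
W3C tool, and it says nothing about whether the result is useful as data.

```python
from html import escape


def wrap_text_as_html(text: str, title: str = "Untitled") -> str:
    """Wrap plain text in a bare html/head/body skeleton (hypothetical helper)."""
    # Escape &, <, > and quotes so the text cannot break the markup.
    body = escape(text)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head><title>" + escape(title) + "</title></head>\n"
        "<body><pre>" + body + "</pre></body>\n"
        "</html>\n"
    )


# Sample text is invented purely for the example.
page = wrap_text_as_html("Budget 2009: a < b & c", title="Agency notes")
print(page)
```

That is a page a browser can show, but it is still Information on a 
page, not a dataset anyone can analyse.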

"Learn More" - where is the link to our mailing list? How we can help 
etc? We don't provide any way for people to contact us! Assume that 
people are not intuitive and won't put the address in the "Status of 
this document" area and the comments in "Learn More" together...

That'll do to start with from me - I've already banged on too long I 
fear. I'll take no offence if I've misunderstood the kind of 
comments/feedback the group is after for something like this - it's my 
first time commenting on something W3C after all :)

Cheers

Chris Beer
Canberra, Australia
Received on Tuesday, 20 October 2009 12:37:31 GMT