
RE: VS: generic list of public data sources

From: Peristeras, Vassilios <vassilios.peristeras@deri.org>
Date: Sun, 1 Nov 2009 18:44:14 -0000
Message-ID: <6B017AD2AE2F6F489087FC986588136B08CCE326@EVS1.ac.nuigalway.ie>
To: <chris-beer@grapevine.net.au>
Cc: "Antti Poikola" <antti.poikola@gmail.com>, "Jonathan Gray" <jonathan.gray@okfn.org>, "Li Ding" <dingl@cs.rpi.edu>, "eGov IG" <public-egov-ig@w3.org>, "Rastas Taru" <Taru.Rastas@mintc.fi>
Hello Chris,

 

Very interesting ideas. 

Some comments.

 

> If tags are specifically set by the public sector, there is always the possibility that the tags lose relevance to the user.

We discuss top-down (tags provided by the public sector) versus bottom-up (tags provided by people) tagging. 

The first approach keeps the fundamental idea that producers have the right to arrange, define and classify the data they produce. The second challenges this idea and is aligned with Web 2.0 rhetoric: it brings the clients into the picture and gives them the right to add their own metadata, which can then be used to build folksonomies and so generate classification schemas bottom-up. So yes, we want to include people's voice in classification systems like this, but subjective, noisy and context-biased tags (e.g. "The dole") are shortcomings of this bottom-up (Web 2.0) approach. Cross-border issues also arise, where the "borders" are not only national but can also be cultural, linguistic, etc.

Could we combine both approaches to take the best of each? Actually, this seems to be what you propose. I am not sure how, but it looks like a very interesting perspective.
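
One way such a combination might look, purely as a sketch: the public sector keeps a small official vocabulary (top-down), and tags supplied by the public (bottom-up) are either mapped onto an official term through a synonym list or kept aside as candidates for later review. The vocabulary, the synonym lists and the function names below are illustrative assumptions, not part of any existing system.

```python
from collections import Counter

# Hypothetical official (top-down) vocabulary: official term -> public synonyms.
OFFICIAL_TERMS = {
    "social security": {"social services", "the dole"},
    "transport": {"traffic", "roads"},
}


def normalise(tag: str) -> str:
    """Lower-case and collapse whitespace so 'The  Dole' matches 'the dole'."""
    return " ".join(tag.lower().split())


# Reverse index: every official term or synonym points to its official term.
LOOKUP = {normalise(term): term for term in OFFICIAL_TERMS}
for term, synonyms in OFFICIAL_TERMS.items():
    for synonym in synonyms:
        LOOKUP[normalise(synonym)] = term


def classify(public_tags):
    """Split public tags into counts of official terms and unmapped candidates."""
    mapped = Counter()
    candidates = Counter()
    for raw in public_tags:
        tag = normalise(raw)
        if tag in LOOKUP:
            mapped[LOOKUP[tag]] += 1
        else:
            candidates[tag] += 1
    return mapped, candidates


# Made-up sample input, for illustration only.
mapped, candidates = classify(["The dole", "social security", "benefits", "the  dole"])
print(mapped)
print(candidates)
```

The unmapped candidates are where the bottom-up voice would feed back into the official vocabulary, for example by a periodic review of the most frequent ones.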

 

> it's almost impossible to link data straight to government portfolio and business area

Unless you introduce a simple and rather straightforward guideline: the data producer (or collector) is the owner. Btw, this does not necessarily imply that the owner has the right to set the CRUD policies over this data, especially the "Read" part. E.g. a ministry may produce some data, but who can access it could be decided at a higher level, e.g. at the cabinet/president level.

 

> how all governments at all levels are actually structured at the top level in terms of portfolios, departments, ministries etc and see what patterns, if any, appear?

I would be surprised to see anything different from the classical functional differentiation (e.g. transportation, health, education, security, etc). Actually, for decades now many management theories (NPM, BPR, TQM and now Enterprise/Government 2.0) have advocated a more "horizontal" organization of the public sphere (and of private enterprises), as opposed to the vertical, hierarchical, stovepipe functional division which goes back to the 19th century. To the best of my knowledge, there has been no real adoption of such a radical (re-)organization at a large scale.

 

> I think any public sector information on the internet could benefit in this regard

Agree 100%. I am just a bit worried about the feasibility, given the complexity this would involve. Nevertheless, I am very much interested in assisting if you decide to go this way.

 

Best regards,
Vassilios

 

 

 

-----Original Message-----
From: Chris Beer [mailto:chris-beer@grapevine.net.au] 
Sent: 01 November 2009 13:08
To: Rastas Taru
Cc: Peristeras, Vassilios; Antti Poikola; Jonathan Gray; Li Ding; eGov IG
Subject: Re: VS: generic list of public data sources

 

 Hi all

 

I like where this discussion is going. I agree that tags will certainly offer flexibility, and probably should form the kernel of a system at a more specific level of use. However, tagging in and of itself presents the problem of standards and taxonomies, especially when looking at the problem from a cross-border or e-Government perspective.

 

The direction tagging is taking, as seen in the public eye, is for the public themselves to do the tagging, either explicitly or via search term mining by a hosting organisation. If tags are specifically set by the public sector, there is always the possibility that the tags lose relevance to the user. (An example from the Australian perspective would be Social Security Benefits paid to unemployed people. Standard public sector governance anywhere (I'm guessing) would tag such a dataset as "social security" or "social services". However, 90% of the public in Australia would tag this as "The dole", a commonly accepted nickname for "social security benefits".) The cross-border issues I see as being most immediate in a tag-based system are in the localisation/translation aspect: the flexibility of tagging can easily become a nightmare in terms of defining a namespace and terms for translating tags on the fly.

 

While I accept that tagging is certainly the way of the future for most publicly accessible data, I think that if this taxonomy goes that way, it should work hand in hand at this point with a strict taxonomy of some sort, even basic DC metadata, until a stable distribution of tags (gleaned from public tagging?) forms and a vocabulary for a namespace could be developed with some degree of accuracy.
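
As a rough sketch of what "a stable distribution of tags" could mean in practice: count the public tags applied to each dataset, and only treat the most common tag as a vocabulary candidate once enough tags have been collected and one of them clearly dominates. The thresholds, dataset names and sample events below are made-up assumptions for illustration, not measurements.

```python
from collections import Counter


def stable_tags(tag_events, min_total=20, min_share=0.5):
    """Return {dataset: tag} for datasets whose most common tag dominates.

    tag_events is an iterable of (dataset, tag) pairs. A dataset qualifies
    only if it has at least min_total tags and its top tag accounts for at
    least min_share of them. Both thresholds are arbitrary assumptions.
    """
    per_dataset = {}
    for dataset, tag in tag_events:
        per_dataset.setdefault(dataset, Counter())[tag.strip().lower()] += 1

    result = {}
    for dataset, counts in per_dataset.items():
        total = sum(counts.values())
        tag, n = counts.most_common(1)[0]
        if total >= min_total and n / total >= min_share:
            result[dataset] = tag
    return result


# Made-up tagging events for two hypothetical datasets.
events = (
    [("unemployment-benefits", "The dole")] * 18
    + [("unemployment-benefits", "social security")] * 2
    + [("alcohol-licences", "pubs")] * 3
)
print(stable_tags(events))
```

Until a dataset passes such a test, it would keep relying on the strict taxonomy or basic DC metadata alone.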

 

My own experiences have shown that there is so much cross-over on datasets in terms of category or business "ownership" (even within a single department/ministry) that it's almost impossible to link data straight to government portfolio and business area unless it's very specific data. Machinery of Government changes (the creation of new departments/ministries, or the separation or mashup/restructure of existing ones) will also affect how data is categorised if we're following Vassilios's quick and dirty method, even when adding "life event" or "business episodes". Different types of political systems also create different inherent structures and portfolio areas (e.g. countries where law enforcement is the purview of the defence forces).

 

Thought: Would it be worth doing something akin to, or in concert with, the proposal put forward in the "Group Call Tomorrow / Best Practice Publishing" thread of "seeing what's out there", i.e. a quick collation on how all governments at all levels are actually structured at the top level in terms of portfolios, departments, ministries etc, and see what patterns, if any, appear? It may then give this discussion a good point of reference to start hacking on to come up with some sort of system that works, which I personally think needs to be done, and for more than just datasets: I think any public sector information on the internet could benefit in this regard.

 

Cheers

 

Chris Beer

Invited Expert

W3C e-Gov IG

 

Rastas Taru wrote:

> Hi,

> 

> I need to hop in to your good discussion. I'm a ministerial adviser in the Ministry of Communications, co-operating with Antti on open public data issues here in Finland. The taxonomy aspect is indeed important. I would go with Vassilios's idea that subject-based grouping is probably the most useful from the citizen's point of view (life events and activities like housing, transport, public safety, work, etc.) and from the business point of view (services needed in the everyday business lifecycle). An example is the grouping used in the Suomi.fi portal (www.suomi.fi/suomifi/english/index.html), or any other of the many similar groupings worldwide. This way, available services could also be added for further development: I just figured out that, for example, the "open jobs" online service (www.mol.fi) is basically an open API (?), but it is not accessible, or developers probably don't know about it. Antti's well-thought-out mind map could be arranged by life event too, I guess, or perhaps the issue goes further, and some sort of general "cross-border" taxonomy could be useful from the developers' point of view? Anyhow, an administrative way of grouping is no good, I think, as for users it shouldn't matter. With subject-based grouping, at best, links to "shared services" can be found between different administrations (perhaps even affecting the government's service mindset)?

> 

> Regards,

> Taru

> 

> Taru Rastas

> Ministerial Adviser

> Media and Communications Services

> Ministry of Transport and Communications 

> 

> Tel: +358 9 160 28617

> Mob: +358 40 7155075

> taru.rastas@mintc.fi

> Fax: +358 9 16028588

> 

> Office: Eteläesplanadi 18, Helsinki

> P.Box 31, FI-00023 Government, Finland

> 

> 

> -----Original Message-----

> From: public-egov-ig-request@w3.org [mailto:public-egov-ig-request@w3.org] On Behalf Of Peristeras, Vassilios

> Sent: 29 October 2009 19:37

> To: Antti Poikola; Jonathan Gray

> Cc: Li Ding; eGov IG

> Subject: RE: generic list of public data sources

> 

> Hi Antti,

> 

> This is an interesting discussion.

> I see that you are not looking for data sets but for a taxonomy.  

> The quick (and dirty) way is to follow the administrative structure (more or less ministries). A good example is here [1] from FEA. 

> But then you have the same problems we experienced with the grouping of services: they can be found only if you are aware of the administrative structure. Several approaches have tried to ameliorate this. The most common paradigm has been the "life-event" and "business episode" based service groupings. Can they be used for data? I wouldn't say so. 

> So the question is: is there a better way to organize governmental data than what is presented in [1]-like approaches? I don't have an answer... 

> BTW, Jonathan's idea on using tags gives an interesting perspective. 

> 

> Regards,

> Vassilios

> 

> Taking the opportunity, the current issue of IEEE Intelligent Systems is on eGovernment. You may find it interesting [2].

> 

> [1] http://en.wikipedia.org/wiki/Business_reference_model

> [2] http://www.computer.org/portal/web/intelligent/home

> 

> 

> 

> 

> 

> -----Original Message-----

> From: public-egov-ig-request@w3.org

> [mailto:public-egov-ig-request@w3.org] On Behalf Of Antti Poikola

> Sent: 29 October 2009 18:34

> To: Jonathan Gray

> Cc: Li Ding; Antti Poikola; eGov IG

> Subject: Re: generic list of public data sources

> 

> Thanks Li, Owen and Jonathan

> 

> I'm well aware that there are several sites listing the actual, more or less open data sources, like data.gov and CKAN.

> 

> I am looking for a general topic list that would tell me that in my country there is most probably some organization holding data about this, this and this. Of course I can compile the list by going through the existing data catalogues... Owen's detailed categorization was good for statistical data, but statistical data is just one branch of the overall picture... what about the register of "alcohol licences in a city" or something more weird but useful?

> 

> Just to give you an idea, I drafted a mind map off the top of my head that I would like to develop to cover the full picture.

> 

> http://mind42.com/pub/mindmap?mid=b84b44a0-4636-4de9-9a00-5a4513195ce2

> 

> All resource links are welcome

> 

> BR,

> 

> -Antti

> 

> Jonathan Gray wrote:

>   

>> We've also got over 680 (mostly) open data packages listed on CKAN, an

>> open source registry of open data:

>> 

>>   http://ckan.net/

>> 

>> See, e.g.:

>> 

>>   * Linking Open Data group

>>     - http://ckan.net/group/lod

>>   * Packages as part of EU Open Data Inventory (alpha)

>>     - http://ckan.net/tag/read/eutransparency

>>   * Search for tags including 'country-[...]'

>>     - http://ckan.net/package/search?q=country-&search=Search+Packages+%C2%BB

>> 

>> There are hopefully over 1000 UK government datasets on the way, as

>> data.gov.uk is using CKAN. Regarding categories, we've found a

>> flexible tag based approach quite useful.

>> 

>> It would be great to ensure interoperability between CKAN and other

>> open government data catalogues - so different bits of the 'open data

>> ecosystem' can all talk to each other! We've started talking to Peter

>> about this a bit regarding opengov.se.

>> 

>> Out of interest - would anyone be interested in having an online

>> meeting about this? E.g. next Tuesday (3rd November) evening at 1800

>> GMT?

>> 

>> Best wishes,

>> 

>>   

>>     


 

 
Received on Sunday, 1 November 2009 18:44:55 GMT
