- From: Phil Archer <phila@w3.org>
- Date: Mon, 26 Nov 2012 10:38:29 +0000
- To: "eGov IG (Public)" <public-egov-ig@w3.org>
The minutes of today's meeting featuring presentations from Rufus
Pollock and Serafin Olcoz are at
http://www.w3.org/2012/11/26-egov-minutes.html and repeated below as text.
W3C eGov 26 November 2012 meeting
[2]Agenda
[2]
http://lists.w3.org/Archives/Public/public-egov-ig/2012Nov/0053.html
See also: [3]IRC log
[3] http://www.w3.org/2012/11/26-egov-irc
Attendees
Present
agipap, PhilA2, [IPcaller], +8532871aaaa, Serafin,
martinAlvarez, Tomasz, +8532871aabb, +8532871aacc,
+44.786.783.aadd, Elsa, Rufus
Chair
Tomasz Janowski
Scribe
Florian Henning, PhilA
Contents
* [4]Topics
1. [5]Open data kickoff
2. [6]presentation by open knowledge foundation (rufus
pollock)
3. [7]Serafin Olcoz on Openness and Reuse of Public
Sector Information using Open Data Publishing, Decree
* [8]Summary of Action Items
__________________________________________________________
<scribe> Scribe: Florian Henning
<scribe> ScribeNick: fhenning
<PhilA2> Meeting:eGov Interest Group
<PhilA2> agenda:
[9]http://lists.w3.org/Archives/Public/public-egov-ig/2012Nov/0053.html
[9]
http://lists.w3.org/Archives/Public/public-egov-ig/2012Nov/0053.html
<PhilA2> chair: Tomasz
<DeirdreLee> I'm one of P5, P6 or P7..
<rgrp> hi there
<rgrp> this is rufus pollock ...
<rgrp> i am also on the phone
Open data kickoff
Tomasz: welcomes participants and asks for a round of
introductions
<Olcoz> Hi Martin
[audio problems from Tomasz line. reconnecting]
[audio problems resolved]
yes i'm scribing
<PhilA2> scribe: fhenning
<PhilA2> scribeNick: fhenning
DeirdreLee: from ireland
from unu-iist/merit
Gwyn_Sutherlin: phd candidate in peace studies
mariateresa: from england
martinAlvarez: from spain
<PhilA2> agipap: is Agis Papantoniou from NTUA Greece
Agis: from greece
<PhilA2> rgrp: Is Rufus Pollock
elsa: from unu-iist, macau
<billroberts> Hi All - from UK, particular interest in Linked
Data for public sector
Tomasz: few words about open data topic
<Gwyn_Sutherlin> I joined the call as IP caller, not sure how
to add
Tomasz: OD = data that is free to use/reuse by anyone
... OD has not yet been properly exploited, but can have huge
potential
... impossible to predict how it can produce value
<elsa> I am trying to connect on the phone and type 3468# and I
get the message that the code is incorrect
Tomasz: according to April's OD workshop by IDRC/Berkman
center, there is a range of potential benefits from OD
... but we don't understand many issues about OD
... same workshop also identified strategic tensions relating to
adoption of OD
... eg. contextual differences between developed and developing
countries
... also strat. tension concerning outcomes vs. impact
... another point is a strat. tension between qualitative and
quantitative methods to explore impact of OD
... it will be difficult to quantify many impacts. this affects
how analysis is framed
... any questions at this point?
no questions
Tomasz: introduces speakers
... daniel bennet is not on the call. tomasz will fill in
<rgrp>
[10]http://notebook.okfn.org/2012/11/26/open-data-protocols-presentation-to-w3c-egov-interest-group/
[10]
http://notebook.okfn.org/2012/11/26/open-data-protocols-presentation-to-w3c-egov-interest-group/
presentation by open knowledge foundation (rufus pollock)
<rgrp>
[11]http://notebook.okfn.org/2012/11/26/open-data-protocols-presentation-to-w3c-egov-interest-group/
[11]
http://notebook.okfn.org/2012/11/26/open-data-protocols-presentation-to-w3c-egov-interest-group/
<rgrp> [12]http://bit.ly/dataprotocols-egov-nov-2012
[12] http://bit.ly/dataprotocols-egov-nov-2012
<PhilA2> Slides are now linked from the wiki
<rgrp> the notebook post would probably be the optimum thing to
link to
<rgrp> [13]http://ckan.org/
[13] http://ckan.org/
[audio line from rufus has suboptimal sound quality, scribing
will not be complete, pls refer to slides on wiki for more
complete notes]
<rgrp> [14]http://dataprocotols.org/
[14] http://dataprocotols.org/
phil could you aid with scribing for this presentation if you
receive better audio?
thanks
<DeirdreLee> [15]http://www.dataprotocols.org/
[15] http://www.dataprotocols.org/
<PhilA2> Slide 3 - we want a rich data ecosystem. Easy to share
data, easy to use
<PhilA2> scribe: PhilA2
<rgrp>
[16]http://blog.okfn.org/2011/03/31/building-the-open-data-ecosystem/
[16] http://blog.okfn.org/2011/03/31/building-the-open-data-ecosystem/
rgrp: We're missing quite a lot of this middle piece. We have
the top and bottom but not the intermediary group
... so data tends to be quite low quality
... To give you an example. If you're trying to build something
on a hack day, you spend half your day cleaning up data
... assuming you can find it, it's not in the right form, got
messy terms etc.
... You need country codes that work in a mashup and so on.
... people spend time over and over again cleaning up the same
data
... Rufus shows his age and starts talking about punch cards
... A classic thing you might want to do is grab data and put
it into Postgres
... this is not a one-liner. There's a lot to do. What we want
is a one-liner to get data from a catalogue and put it into a
local tool
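A minimal sketch of the kind of one-step load described above, not Rufus's own tooling. It swaps in Python's standard-library sqlite3 for Postgres so that it runs without a server; the load_csv helper, the table name and the two sample rows are assumptions made for illustration.

```python
import csv
import io
import sqlite3


def quote_ident(name):
    """Quote an SQL identifier so headers with spaces or quotes still work."""
    return '"{}"'.format(name.replace('"', '""'))


def load_csv(conn, table, csv_text):
    """Create `table` from the CSV header and insert every data row as text."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    columns = ", ".join("{} TEXT".format(quote_ident(h)) for h in header)
    conn.execute("CREATE TABLE {} ({})".format(quote_ident(table), columns))
    placeholders = ", ".join("?" for _ in header)
    rows = [row for row in reader if row]  # skip blank lines
    conn.executemany(
        "INSERT INTO {} VALUES ({})".format(quote_ident(table), placeholders),
        rows,
    )
    conn.commit()
    return len(rows)


# Hypothetical sample; a real run would read a file fetched from a catalogue.
SAMPLE = "Code,Description\n01,General public services\n02,Defence\n"

conn = sqlite3.connect(":memory:")
loaded = load_csv(conn, "cofog", SAMPLE)
```

Once a helper like this exists, the call on the last line is the "one line" being asked for; everything above it is the work that a shared packaging format would let tools do once instead of every user doing it by hand.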
<rgrp> [17]http://blog.okfn.org/2010/02/23/introducing-datapkg/
[17] http://blog.okfn.org/2010/02/23/introducing-datapkg/
<Tomasz> q
<Tomasz> phil, how to check the question queue? sorry
rgrp: A lot of software integration doesn't happen
automatically. A lot of it is based on APIs
<Tomasz> thank you
rgrp: We need to look at (digital) packaging
... We need that kind of software packaging ecosystem - how do
we do that with data?
... We want to be better at automating getting data on and off
our machines
... W3C does good work on schemas but it's not lightweight
... We've been doing a thing called dataprotocols.org where
people can hangout and work on specs
... Slide 5
... is a screen grab of what's going on
... these are concrete services that we have built or want to
build
... it's not a formal standardisation process
... but this is a space for more informal, RFC-style
development
... more on slide 6
... Going on to talk about data packages
... been working on it for about 5 years
... originally part of CKAN
... we have software packages, can we have data packages?
... Focus on tabular data. The catalogues I've seen, tabular +
geo is almost all of what gets published
... In terms of original raw data, most of it is tabular
... It has a lot of attractive properties that I could go on
about
... A lot of the data is file based, not API-based
... flat files like CSV are very attractive. It may not be
pretty but it is effective - like a Kalashnikov rifle
... everything supports it
... it streams well, you can have massive files
... a simple schema for describing CSV would be useful
... we need version info for CSVs. Open is important, but if
everyone is collaborating, how do we do version management for
CSV?
... Git or Mercurial are potentially good ones for CSV as they
are line-oriented
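To illustrate why line-oriented tools suit CSV, here is a small sketch using Python's standard difflib rather than Git or Mercurial themselves. The two versions of the file and the version labels are made up for the example.

```python
import difflib

# Two hypothetical revisions of the same CSV file, one row per line.
old = [
    "Code,Description",
    "01,General public services",
    "02,Defense",
]
new = [
    "Code,Description",
    "01,General public services",
    "02,Defence",
    "03,Public order and safety",
]

# A unified diff works line by line, so each changed CSV row shows up
# as one removed and/or one added line.
diff = list(
    difflib.unified_diff(
        old, new, fromfile="cofog.csv@v1", tofile="cofog.csv@v2", lineterm=""
    )
)

# Skip the two file-header lines and keep only added or removed rows.
changed = [line for line in diff[2:] if line.startswith(("+", "-"))]
```

Because one record is one line, the diff reads as "row 02 was edited, row 03 was added", which is the property that makes ordinary version control usable for tabular data.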
... the actual spec is available
... what it boils down to is a bunch of data files, you have
JSON and you can have other stuff
... there's a .json file that includes the metadata
... and then you have a filespec to list your files
... Slide 12 is an example
<rgrp>
[18]https://github.com/datasets/cofog/blob/master/datapackage.json
[18] https://github.com/datasets/cofog/blob/master/datapackage.json
rgrp: You can add more to this. But basically it's a table
schema
<rgrp>
[19]http://www.dataprotocols.org/en/latest/json-table-schema.html
[19] http://www.dataprotocols.org/en/latest/json-table-schema.html
The whole thing follows...
{
  "metadata": {
    "name": "cofog",
    "title": "Classification of the Functions of Government",
    "homepage": "http://unstats.un.org/unsd/class/family/family2.asp?Cl=4",
    "version": "1999",
    "source": "United Nations",
    "licenses": [
      {
        "id": "odc-pddl",
        "name": "Open Data Commons Public Domain Dedication and Licence (PDDL)",
        "url": "http://opendatacommons.org/licenses/pddl/"
      }
    ],
    "description": "Classification of the Functions of Government (COFOG) is a classification defined by the United Nations Statistics Division. Its purpose is to \"classify the purpose of transactions such as outlays on final consumption expenditure, intermediate consumption, gross capital formation and capital and current transfers, by general government\" (from home page).",
    "keywords": [
      "Classification",
      "COFOG",
      "Finances",
      "Government",
      "United Nations"
    ]
  },
  "files": [
    {
      "path": "data/cofog.csv",
      "fields": [
        {
          "id": "Code",
          "type": "string"
        },
        {
          "id": "Description",
          "type": "string"
        },
        {
          "id": "ExplanatoryNote",
          "type": "string"
        },
        {
          "id": "Change_date",
          "type": "date"
        }
      ]
    }
  ]
}
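As a rough sketch of what a tool could do with such a descriptor, the snippet below checks a CSV header against the declared fields. It follows the "files" / "fields" / "id" layout of the example in these minutes, which may differ from later versions of the spec; the check_header helper, the trimmed descriptor and the sample CSV text are hypothetical.

```python
import csv
import io
import json

# A trimmed descriptor in the same layout as the example in the minutes.
DESCRIPTOR = json.loads("""
{
  "files": [
    {
      "path": "data/cofog.csv",
      "fields": [
        {"id": "Code", "type": "string"},
        {"id": "Change_date", "type": "date"}
      ]
    }
  ]
}
""")


def check_header(descriptor, path, csv_text):
    """Return declared field ids for `path` that the CSV header lacks."""
    spec = next(f for f in descriptor["files"] if f["path"] == path)
    header = next(csv.reader(io.StringIO(csv_text)))
    return [fld["id"] for fld in spec["fields"] if fld["id"] not in header]


# Hypothetical CSV content that is missing one declared column.
missing = check_header(
    DESCRIPTOR,
    "data/cofog.csv",
    "Code,Description\n01,General public services\n",
)
```

Here missing lists the declared columns that the file does not provide, which is the sort of cheap consistency check a plain JSON descriptor makes possible without a catalogue.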
rgrp: Interested to hear from the LD community in things like
JSON-LD
... Wrapping up... you could push this to a local Web site and
you have a data package
... Not concerned whether we adopt this kind of package or
something that does the same thing, but it's how to publish
data packages without a data catalogue
... Using JSON as your base schema language
... got to make something that can be used really easily
... Some people have suggested we drop JSON and just use
another Excel worksheet to provide the data
... Every step to making it easy brings more uses of the data
... We want to reduce the friction to getting, using and
sharing data
Tomasz: Thanks Rufus
<rgrp> I have finished
<rgrp> Any questions :-) ?
<Gwyn_Sutherlin> question
<fhenning> PhilA2: w3c is aware of process for schemas. there's
a workshop planned for next year on exactly the issues that
rufus has been addressing.
<scribe> scribe: fhenning
<PhilA2> Gwyn_Sutherlin: Do you do work around unstructured
data - text, audio, video etc?
<PhilA2> rgrp: Yes, we do, It's the Open Knowledge Foundation -
we're format agnostic
<rgrp> open knowledge includes content, data etc
<PhilA2> Gwyn_Sutherlin: Our cases are usually around
transparency and corruption
<rgrp> we do a lot around other topics including corruption :-)
eg. [22]http://okfnlabs.org/events/hackdays/lobbying.html
[22] http://okfnlabs.org/events/hackdays/lobbying.html
[sorry, the audio problems seem to be at our device. it's not
possible for me to do complete scribing at this point. could
you take over phil?]
<PhilA2> DeirdreLee: Thanks Rufus for the presentation. Do you
see things like DCAT as added overhead? What tools do you see
for packaging data?
<PhilA2> rgrp: That's my point. The spec allows you to build
the tool. We have a tool called DPM
<rgrp> [23]http://dpm.readthedocs.org/en/latest/
[23] http://dpm.readthedocs.org/en/latest/
<PhilA2> rgrp: DCAT is in some ways format agnostic but it's an
LD format. For an Excel user, you can tell them in 30 seconds how to
export in CSV. There's no "export in .n3" option
<rgrp> metadata = ini file - xyz: abc
<PhilA2> rgrp: Maybe the metadata file should be a .ini file
for a simple example
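A small sketch of the .ini idea, assuming "key: value" lines as in the "xyz: abc" remark above. Python's standard configparser needs a section header, so the [metadata] section name is an assumption added here, and the values are borrowed from the COFOG example.

```python
import configparser

# Hypothetical metadata file; configparser accepts "key: value" lines
# but requires a section header, so one is assumed here.
INI_TEXT = """
[metadata]
name: cofog
title: Classification of the Functions of Government
version: 1999
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)

# All values come back as plain strings, including the version.
metadata = dict(parser["metadata"])
```

The trade-off is that an .ini file is easy to write by hand but has no nesting, so per-file field lists like those in the JSON example would need some other convention.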
<PhilA2> rgrp: Most formats are very simple, with JSON as the
most complex. Need things people can produce with the tools
they have
<PhilA2> Tomasz: Thanks Rufus again
<rgrp> see python, ruby, debs, nodejs etc - all have super
simple package formats ..
<rgrp> also you need a reason for people to package - you need
something they can then do ... (e.g. get something into
postgres in 30s)
Serafin Olcoz on Openness and Reuse of Public Sector Information
using Open Data Publishing, Decree
yes phil
<PhilA2> I love this line - The public sector is an archipelago
of competences and budgets
[no it drops in and out - better if you do it phil]
<PhilA2> scribe: PhilA2
Olcoz: We decided to share all our code
... Slides
[24]http://dl.dropbox.com/u/49911950/W3c%20-%20Open%20Assets.pdf
... Slide 2 has the key policies
... Memorandum specifies the schema to use etc.
... makes publication of source code etc.
... If you want to develop software, you are obliged to see
what's already available and build on that
... you need to write a report on various aspects. What you're
using, what you're contributing back etc. (under EUPL licence)
... provide a functional description etc.
... Also state what dependencies there are etc.
... You are required to publish at least the dependencies as it
affects everyone, not just you
... The aim is to have a global idea of what is being done
using public money to develop software
... This is formalising the re-use process
[24] http://dl.dropbox.com/u/49911950/W3c%20-%20Open%20Assets.pdf
PhilA2: The obligation applies just to people being paid by the
public sector and not to third party developers?
Olcoz: Yes
... Private sector can take OSS and develop new products and
services based on the OSS directory
... They may then realise the advantage of this and can, if
they want, open their own source code
... Which we hope will create a virtuous cycle
... We're offering a robust service 24/7/365
... slide 5
... We are actively encouraging development.
... Supporting local enterprise and investment without having
to spend public sector money - an important feature in the
current climate
... We'd like others to share our approach of course and would
welcome a European approach
... In order to allow people to use your OSS, you need to have
a portal to make it available and to be able to access other
repositories
... the repository itself is an asset that has value
... you can learn a lot about past and present components
... if something is under development and you can wait for it
to be ready before you use it, then you know to wait. If you
can't wait, you know you need to go your own way.
... all the records are contained in an open data catalogue
... Slide 9
... We're agnostic about formats. It can be data, or text or
code etc.
... We need to be able to federate our repositories
... We defined various vocabularies, including for the re-use
process
... We have the support of CTIC and others
... also of ministry of finance in Spain, evaluating for use
across Spain
<Tomasz> we will reschedule Elsa's presentation for the next
meeting
Olcoz: talks about the schemas in use. Refers to RADion
([25]http://www.w3.org/ns/radion) and the ISA Programme that
created it
[25] http://www.w3.org/ns/radion
<Tomasz> but i would still like some discussion about Serafin's
talk
Olcoz: Model can be used to link different sources of data from
the Web. Important to see repository itself as an asset -
needed to extend RADion
... We find problems with ADMS and DCAT. They don't cover
everything we need
... Three new portals launching in a couple of weeks' time. All
source code is open for re-use
Tomasz: Thanks very much Serafin - very interesting
Olcoz: If you need info about the decree - I've submitted links
to English resources to the IG
Tomasz: Can you give us a sense of the size of the Basque
government involvement?
Olcoz: We're still working on finishing the repositories so
we'll have to wait a few months to be able to report on
experience
Tomasz: Is design for re-use part of the requirement of the new
software project?
Olcoz: Not yet. That's the plan for the future
... People often saw design as being very specific to a use. In
the early 90s, people began to change that view. Now you see a
lot of re-usable software components
... We need to work on the guidelines around this
Tomasz: Any more questions?
Olcoz: I'd like to say I'm making a presentation on this to the
GLD WG this Thursday
... Wants to make a Member Submission
Tomasz: We're at the end of our time
... Apologies to Elsa for moving your presentation to our next
meeting next month
... A reminder that we have an open call for assistance with
developing the group's summary of the various presentations we
have received concerning social media
... Next few meetings will be on open data - and so will
welcome guests and ideas for speakers
<Gwyn_Sutherlin> thanks
Tomasz: Thanks to speakers and scribes
Received on Monday, 26 November 2012 10:38:55 UTC