
Re: White House Roundtable on Open Data

From: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
Date: Thu, 26 Jun 2014 11:48:11 -0300
Message-ID: <CANx1PzxnYb1B3g3jexKitNaWtMdQ40eW1_WVr5viQSQWJRkngw@mail.gmail.com>
To: Mark Harrison <mark.harrison@gs1.org>
Cc: Steven Adler <adler1@us.ibm.com>, Phil Archer <phila@w3.org>, public-dwbp-wg <public-dwbp-wg@w3.org>
Hi all,

I'd like to comment on some of Mark's points:

> 1 & 2) I think your first two points are both related to data provenance
> and also the importance of expressing assumptions and explicit reference
> units (i.e. value per what?) in calculations.  Maybe we (DWBP) need to
> highlight some real existing examples of best practice use of W3C
> Provenance and provide a commentary about why/how that is considered best
> practice.  Guidelines can be very worthwhile, but sometimes complementing
> these with a few real-life existing best practice examples accompanied by a
> commentary can help people to get their data into better shape, if they
> have an annotated 'role model' example of what we think it should look
> like.  It can make it much easier for people to understand the guidelines
> if they see concrete examples.

+1 for the idea of providing examples to help people understand the
guidelines. I think this should be done for every best practice. A best
practice may be implemented in different ways, and an example should be
provided for each of them.

> 3) Regarding visualisations, there are some very nice examples of
> visualisation mash-ups at http://publicdata.eu/ , some of which also
> include details about the source datasets and the frameworks that were
> used.  We should definitely encourage this - and encourage people to
> provide this kind of information to help others.  In a couple of talks I've
> given about Linked Open Data, I've shown some of these visualisations to
> explain what is possible.  One particular favourite of mine is
> http://apps.seme4.com/see-uk/ , which not only provides a very nice
> visualisation of various local socio-economic data (e.g. school provision,
> crime figures) - but also details about the underlying datasets and how it
> was constructed [ http://apps.seme4.com/see-uk/about.html ].  I'm sure
> that there are some other good examples we can highlight, to explain the
> benefits of mashing up open data sets to help people understand the data
> and make more meaningful comparisons.

I also agree that it is important to provide information about how datasets
were constructed or combined. This kind of feedback is really important to
other consumers, but also to the data publisher, who may in turn provide
more general information about the usage of a specific dataset or a set of
datasets. I think this kind of information may be described by the Data
Usage Vocabulary.

> 4) Regarding keeping data close to its source, that works but there needs
> to be co-ordination at a federal or ideally global level on the appropriate
> vocabularies and terms to be used, so that we can do meaningful comparisons
> across these datasets provided by local government.  At the W3C RDF
> Validation workshop last September in Boston, there was a paper along those
> lines (and Linked Data Profiles) by Paul Davidson (CIO Sedgemoor District
> Council, UK, Director of Standards for the UK Local eGovernment Standards
> Body (LeGSB) http://legsb.inetwork.org.uk ):
> http://www.w3.org/2001/sw/wiki/images/1/11/RDFVal_Davidson.pdf      I'm
> sure that Paul can explain the idea better than I can - but from what I
> understood, if each council or state is publishing essentially the same
> types of data, then it's a good thing if they publish the same kind of data
> in the same way, making use of the same vocabulary terms.  Because many
> local councils and US states might not have experts in Linked Open Data,
> it's helpful to develop Linked Data Profiles, to serve as a kind of
> 'template' for publishing specific kinds of data, so that we don't require
> every local council or state to figure it out for themselves.  Again, some
> real before-and-after worked examples (from raw data to best practice
> published data aligned with appropriate vocabularies) will be very helpful
> for many users - and complementary to just publishing guidelines or
> recommendations.

+1 to the use of common vocabularies! In my opinion, the most important
thing is to have common vocabularies to describe data of specific domains.
The data itself may be published in distinct formats, and tools may be used
to perform data transformations when necessary. However, the use of common
vocabularies is fundamental; otherwise there will be a lot of extra work to
identify mappings between distinct vocabularies. This is a classic data
integration problem :)

kind regards,

> 5) Well done for pointing out that data collected / generated at taxpayer
> expense should be made available, without seeking to charge for it a second
> time!
> Best wishes,
> - Mark
> On 24 Jun 2014, at 09:58, Steven Adler <adler1@us.ibm.com>
>  wrote:
> > 1. Will do. Please send the event details so I can put it in my calendar.
> > 2. Excellent. We need more cooperation between standards bodies and this
> is a terrific step forward.
> >
> > Regards,
> >
> > Steve
> >
> >
> > ----- Original Message -----
> > From: Phil Archer [phila@w3.org]
> > Sent: 06/23/2014 07:28 PM CET
> > To: Steven Adler; IBM Open Data Group
> > <IBM_Open_Data_Group%IBMUS@us.ibm.com>; public-dwbp-wg
> > <public-dwbp-wg@w3.org>;
> > <betanyc-ibm-smartercities@googlegroups.com>
> > Subject: Re: White House Roundtable on Open Data
> >
> >
> >
> > Forgive me Steve for only just now responding to this.
> >
> > First off, thank you for flying the W3C DWBP flag for us in this (self
> evidently) crucial forum. A couple of points that immediately come to mind:
> >
> > 1. Please be ready to talk about your model contracts event at the
> Share-PSI workshop in Lisbon, December 3-4th. I'm working on the CfP for
> that this week but the basics are as we've discussed - it's a workshop
> about encouraging more commercial use of public sector data and licensing
> is a key issue.
> >
> > 2. You highlighted GIS. I spent last week in a city I believe you'll
> know well, Aalborg (and I took a little time to go to Skagen too). That was
> with a geospatial information systems crowd where, among other things, I
> was able to make more progress with the draft charter for the joint OGC/W3C
> Spatial data on the Web WG [1]. We need to dot the Is and cross the Ts on
> that yet but I'm hoping that the memberships of both standards bodies will
> be asked to formally approve the (by then finalised) charter next month.
> >
> > Spatial data comes up in open data a lot (we had more than a day
> discussing that crossover) so it's a timely discussion.
> >
> > Cheers for now
> >
> > Phil
> >
> > [1] http://www.w3.org/2014/05/geo-charter
> >
> > On 19/06/2014 16:34, Steven Adler wrote:
> >> Yesterday, I participated in an excellent Open Data Roundtable/Workshop
> at
> >> the White House Conference Center in Washington, DC.  The event was
> >> organized by the NYU GovLab OpenData500 and sponsored by the US Commerce
> >> Department and the White House Office of Science and Technology Policy.
> >>
> >>
> >>
> >> We were welcomed by Mark Doms, Under Secretary of Commerce for Economic
> >> Affairs, who spoke about the importance of Open Data to the economic
> >> growth of the United States.  At a time when the IMF and the Federal
> >> Reserve are forecasting low growth for several years, the US Government is
> >> looking at Open Data as a vast untapped resource to drive innovation and
> >> growth in the economy.  Commerce then presented Open Data plans from
> >> Census, the Bureau of Economic Analysis, and the US Patent Office.
> >>
> >> We had breakout discussions on six topics and I participated in a great
> >> discussion on GIS data.  We were 9 at my table and we were each asked to
> >> identify key issues for Commerce Open Data.
> >>
> >>
> >>
> >>
> >> I laid out these issues:
> >>
> >> 1.  Data Comparability Standards: Open Data is published without
> >> describing how data is derived or calculated and that makes it difficult
> >> to compare factors and figures from even within the same agency.
> >>
> >> 2.  Data Governance Lineage:  The US Government should publish Open Data
> >> with metadata that describes the governance process behind the
> publication
> >> - where the data came from, how long the department had it, how it was
> >> processed, who signed off on publication.
> >>
> >> 3.  Display:  Open Data Catalogs are great for developers but the rest
> of
> >> the nation finds reading catalogs boring.  The Government needs to
> provide
> >> analytical tools that yield insights and make connections between
> >> datasets.  People can relate to data stories, charts, graphs, and
> >> visualization.
> >>
> >> 4.  Data Sources and Aggregation:  Much Federal Data comes from state
> and
> >> local repositories and is aggregated without source file metadata.  The
> >> Census Bureau, for instance, collects housing starts data from
> >> municipalities and aggregates that data to provide tract-level reports for
> >> GIS Maps.  What would be better is if every municipality published their
> >> own housing starts data as Open Data and the Census Bureau provided URI
> >> Data Links to that data, creating massively federated Linked Data
> >> infrastructure that minimized errors and omissions by keeping all data
> at
> >> source.
> >>
> >> 5.  Private Public Partnerships... The government sees constrained
> budgets
> >> for many years to come and is looking for new revenue models to offset
> the
> >> costs of Open Data publishing.  There are many datasets that the
> >> government could publish that would be highly valuable (like NOAA
> >> hurricane forecasts) but are also expensive to publish and are outside
> >> agency missions and goals.  The government was looking for private
> >> enterprise to pay for this data.
> >>
> >> I told the government we are already paying for it through taxes and
> they
> >> should not seek revenue recognition for stimulating economic growth with
> >> Open Data.  It is a national resource that Commerce should provide for
> >> free to make American business more competitive.  Later I had a
> discussion
> >> with the CTOs from NIST, NAO, and Census in which we agreed that Open
> >> Data publishing would transform government IT by shifting investment
> from
> >> a do-it-all strategy that includes infrastructure, data, and application
> >> development to decreasing focus on applications and increasing focus on
> >> just publishing the data and letting private enterprise develop
> >> applications instead.
> >>
> >> 6.  Model Contracts:  I discussed the issue of license term confusion
> with
> >> White House Office of Science and Technology Policy and they are keen to
> >> co-sponsor an event with us in the fall to focus nationwide attention on
> >> the need for Model Contracts that make Open Data really "Open for
> >> Business."
> >>
> >> The day was closed out with some passionate speeches from Bruce Andrews,
> >> Acting Deputy Secretary of Commerce, and Penny Pritzker, Secretary of
> >> Commerce.  Bruce quoted Ginni in his comments, which was wonderful, and
> >> Penny said that Open Data is a key initiative of the Obama
> Administration
> >> and a centerpiece of her mission at the Commerce Department.
> >>
> >>
> >>
> >> A lot of the comments during the day gave one the impression that the US
> >> Government had just discovered that their data is an asset that could
> >> generate economic value.  I asked the Under Secretary if Commerce would
> be
> >> using its economists to calculate the economic value of their data, and
> he
> >> answered that they would be providing a report on that topic over the
> >> Summer.  I spoke to him about it later and he admitted that his
> economists
> >> have never thought about this issue before and don't have any great
> >> insights on how to calculate economic impact of data and that the goal
> of
> >> the report is just to get them to research how to do it.
> >>
> >> So this event was kind of a kickoff.  Commerce feels they have some
> >> valuable data and they want to be seen as leaders within the government.
> >> The Deputy Secretary told us that the President wants Open Data to be
> part
> >> of his legacy - a program that should be so successful at generating
> >> economic value for the nation that it MUST survive beyond his
> >> administration.
> >>
> >>
> >>
> >> Best Regards,
> >>
> >> Steve
> >>
> >> Motto: "Do First, Think, Do it Again"
> >>
> >
> > --
> >
> >
> > Phil Archer
> > W3C Data Activity Lead
> > http://www.w3.org/2013/data/
> >
> > http://philarcher.org
> > +44 (0)7887 767755
> > @philarcher1
> >
> >

Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
Received on Thursday, 26 June 2014 14:49:00 UTC
