
RE: Use Case: BetaNYC 3/5

From: Lee, Deirdre <Deirdre.Lee@deri.org>
Date: Fri, 7 Mar 2014 12:17:00 +0000
To: 'Public DWBP WG' <public-dwbp-wg@w3.org>
Message-ID: <DA0D3B2CE5F5614AA2307A974AFAC39BDD524514@UDSMBX01.uds.nuigalway.ie>
Ig, Makx +1

An issue that data on the web practitioners have is not necessarily that they do not know about certain standards, but that they don't know how best to use/implement them.

This is evident with the INSPIRE directive and with Ghislain's use-case about the ISO 19139 geo standard.

Many standards are designed to be flexible and open to interpretation, which is a positive thing. However, through our use-cases, we have the opportunity to identify situations where more guidance/best-practice is needed and can be offered by our WG, for example the recommendations included in DCAT that Makx outlines below.

While we are technology-agnostic (according to charter*), I would suggest we are not standard/vocabulary-agnostic.


*The BP document will build on and extend the work done in the Government Linked Data Working Group (https://dvcs.w3.org/hg/gld/raw-file/default/bp/index.html), taking a domain and technology-agnostic approach to cover aspects such as:

·         URI design and management for persistence;

·         use of core vocabularies to improve interoperability;

·         guidance on the provision of metadata;

·         publishing and accessing versions of datasets;

·         making controlled vocabularies accessible as URI sets;

·         technical factors for consideration when choosing data sets for publication;

·         technical factors affecting potential use of open data for innovation, efficiency and commercial exploitation;

·         data preservation.

From: Makx Dekkers [mailto:mail@makxdekkers.com]
Sent: 07 March 2014 11:27
To: 'Public DWBP WG'
Subject: RE: Use Case: BetaNYC 3/5

Ig, Steve,

(cutting out the pics to save space)

Part of this issue of trust/reliability/provenance on the Open Data Web is solved by the link between URIs and the organisation that is behind the URIs. In general, people are more likely to use URIs from organisations they trust; in many activities around Europe, people use URIs maintained by the Publications Office of the European Union in their Metadata Registry (http://publications.europa.eu/mdr/authority/). Another example is the recommendation in DCAT to use URIs for languages (http://www.w3.org/TR/vocab-dcat/#Property:dataset_language) maintained by the Library of Congress, the maintenance agency for ISO 639-2. Those are organisations that a lot of people would trust.
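To make the DCAT language recommendation concrete, here is a minimal sketch in Python of how a publisher might turn a language code into a Library of Congress URI and attach it to a dataset. The function names and the example.org dataset URI are hypothetical. The sketch assumes the id.loc.gov URI pattern for the iso639-1 and iso639-2 vocabularies, and it only checks the shape of the code, not whether the code is actually registered.

```python
# Base of the Library of Congress linked-data vocabularies (assumed URI pattern).
LOC_BASE = "http://id.loc.gov/vocabulary"
# Dublin Core Terms property that DCAT uses for the language of a dataset.
DCT_LANGUAGE = "http://purl.org/dc/terms/language"


def language_uri(code: str) -> str:
    """Hypothetical helper: build a Library of Congress URI for a language code.

    Two-letter codes map to the iso639-1 vocabulary, three-letter codes to
    iso639-2. Only the shape of the code is checked here.
    """
    code = code.strip().lower()
    if not code.isalpha():
        raise ValueError(f"not a language code: {code!r}")
    if len(code) == 2:
        return f"{LOC_BASE}/iso639-1/{code}"
    if len(code) == 3:
        return f"{LOC_BASE}/iso639-2/{code}"
    raise ValueError(f"expected a 2- or 3-letter code, got {code!r}")


def dataset_language_triple(dataset_uri: str, code: str) -> str:
    """Hypothetical helper: one N-Triples statement linking dataset and language."""
    return f"<{dataset_uri}> <{DCT_LANGUAGE}> <{language_uri(code)}> ."


# Example with a made-up dataset URI.
print(dataset_language_triple("http://example.org/dataset/1", "en"))
```

The point of the sketch is only that the trusted organisation's URI, rather than a bare string such as "English", ends up in the metadata.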

However, I am not sure whether you can codify trust in absolute terms. It mostly depends on who you are and what you're trying to achieve. As part of the Best Practice that we're working on, I think the best we can do is to outline the aspects or criteria someone may want to consider, such as: who maintains the URIs, do they have a documented governance and change management process, do they have an explicit persistence policy, who else is using those URIs, how much does your application/system rely on the accuracy and 24/7 availability of the data, etc.

Political issues too. Does a government agency in country A trust URIs maintained by a government agency in country B?


From: Ig Ibert Bittencourt [mailto:ig.ibert@gmail.com]
Sent: Friday, March 07, 2014 11:19 AM
To: Steven Adler
Cc: Public DWBP WG
Subject: Re: Use Case: BetaNYC 3/5

Hi Steve,

Thank you for sharing with us about these hackathons.

With regard to the DBpedia data, although WayCount went one step further than Palo Alto on open data, I think the problem is the same. Perhaps they don't know about the vocabs and how to use them.

Don't you think we should create some use cases focused on the usage of PROV-O, QB, DCAT, ORG... ?


2014-03-06 12:51 GMT-03:00 Steven Adler <adler1@us.ibm.com>:
Last night, I attended another BetaNYC Hackathon in Brooklyn, where I met another group of passionate citizens developing, and learning to develop, fascinating apps for Smarter Cities.  This week we were about 15 people in the room, and we started with a lightning round of "what are you working on" descriptions from project leads.  There were only three people in the room who had participated in the hackathon the week prior, and this is pretty normal.  BetaNYC has 1600 developers registered in their network and every week coders rotate in and out of meetups and projects in an endless and unplanned cycle that continuously inspires creativity and motivation by showcasing new projects.

The first project we heard about came from a local nonprofit called Tomorrow Lab<http://tomorrow-lab.com/>, which has designed a device that counts how many bikes travel on a given street.  It uses simple hardware and open source software that connects two sensors with a pneumatic tube, measuring impressions for weight and axle distance to differentiate between bikes and cars.  It's called WayCount.  The text below is from their website.  In the room we discussed how WayCount data could be combined with NYPD crash reports to more accurately identify the spots in NYC with the most bike accidents relative to the number of bikes, and to identify ways to remediate.
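The idea of combining counts with crash reports comes down to simple arithmetic: divide crashes by bike volume at each location instead of comparing raw crash totals. A minimal sketch in Python follows. The function name, the street names, and all numbers are made up for illustration and are not WayCount or NYPD data.

```python
def crashes_per_thousand_bikes(bike_counts, crash_counts):
    """Hypothetical helper: crash rate per 1,000 counted bikes, by location.

    Locations with no counted bikes are skipped because a rate cannot be
    computed for them. Locations missing from crash_counts count as zero crashes.
    """
    rates = {}
    for location, bikes in bike_counts.items():
        if bikes <= 0:
            continue
        crashes = crash_counts.get(location, 0)
        rates[location] = 1000 * crashes / bikes
    return rates


# Made-up figures: Street A has more crashes in total, Street B has far fewer bikes.
bike_counts = {"Street A": 4000, "Street B": 500}
crash_counts = {"Street A": 8, "Street B": 4}

rates = crashes_per_thousand_bikes(bike_counts, crash_counts)
worst = max(rates, key=rates.get)
print(worst, rates[worst])
```

With these invented figures, Street A has twice the crashes but Street B has four times the rate, which is the kind of spot that raw crash totals alone would miss.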

WayCount is a platform for crowd-sourcing massive amounts of near real-time automobile and bicycle traffic data from a nodal network of inexpensive hardware devices.   For the first time ever, you can gather accurate volume, rate, and speed measurements of automobiles and bicycles, then easily upload and map the information to a central online database.  The WayCount device works like other traffic counters, but has two key differences: lower cost and open data. At 1/5th the price of the least expensive comparable product, WayCount is affordable. The WayCount Data Uploader allows you to seamlessly upload and map your latest traffic count data, making it instantly available to anyone online.

Collectively, the WayCount user community has the potential to build a rich repository of traffic count data for bike paths, city alley ways, neighborhood streets, and busy boulevards from around the world. With a better understanding of automobile and bicycle ridership patterns, we can inform the design of better cities and towns.

The WayCount platform is an important addition to the process of measuring the impact of transportation design, and creating livable streets by adding bicycle lanes, public spaces, and developing smart transportation management systems. By creating open-data, we can increase governmental transparency, and provide constituencies with the essential data they need to advocate for rational and necessary improvements to the design, maintenance, and policy of transportation systems.

The hardware and software of the WayCount device and website were designed and engineered by Tomorrow Lab.

WayCount devices are currently for sale on the website, WayCount.com<http://waycount.com/>

We also discussed some ideas to provide policy makers with better sources of Open Data to guide policy discussions, and then broke up into four groups focusing on different projects.  One group discussed how to save the New York Library on 42nd Street from the imminent transformation of its main reading room and function as a lending library.  Another group scraped web pages for NYPD crash data for an app comparing accident rates across the 5 boroughs.  Some people just spent time talking about who they are and what they want to work on, what they want to learn, and how to get more involved.

I spent an hour with a young programmer who had worked on the NYC Property Tax Map I shared with you last week.  He showed me a Chrome Plugin he is working on that provides data about leading politicians whenever their names are mentioned on a webpage.  It is called Data Explorer for US Politics and it provides some nifty data on things like campaign contributions compared to committee assignments.

I asked him where he got his data and he showed me DBpedia<http://dbpedia.org/About>, which "is a crowd-sourced community effort to extract structured information from Wikipedia<http://wikipedia.org/> and make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web to Wikipedia data. We hope that this work will make it easier for the huge amount of information in Wikipedia to be used in some new interesting ways. Furthermore, it might inspire new mechanisms for navigating, linking, and improving the encyclopedia itself."

Then I asked him how he knows that DBpedia data is accurate and reliable and he just looked at me.  "It's on the internet..."  Yeah, and so were weapons of mass destruction in Iraq.  But they were only on the internet and never in Iraq.  And herein lies a huge problem about Open Data on the Web: there is no corroboration of fact, no metadata describing where it came from, how it was derived, calculated, presented.  No one attests to its veracity, yet we all use it on faith, which just ain't good enough.

This is why we have the W3C Data on the Web Best Practices Working Group<https://www.w3.org/2013/dwbp/wiki/Main_Page> - to create new vocabulary and metadata standards that attach citations and lineage, attestations and data quality metrics to Open Data so that everyone can understand where it came from, how much to trust it, and even how to improve it.

At the end of the evening, we also discussed IBM Smarter Cities, the Portland System Dynamics Demo, and the possibility of hosting a BetaNYC meetup at IBM on 590 Madison Avenue.  It was a fascinating evening and I encourage all to check out the links provided in this writeup and get out and join a meetup near you.

Talk to you tomorrow.

Best Regards,


Motto: "Do First, Think, Do it Again"


Ig Ibert Bittencourt
Professor Adjunto III - Universidade Federal de Alagoas (UFAL)
Vice-Coordenador da Comissão Especial de Informática na Educação
Líder do Centro de Excelência em Tecnologias Sociais
Co-fundador da Startup MeuTutor Soluções Educacionais LTDA.
Received on Friday, 7 March 2014 12:17:40 UTC
