dataCommons.org

In May 2018, we introduced datacommons.org <http://datacommons.org/>, an
initiative for the open sharing of data, and released the first fact check
corpus to help academics and practitioners study misinformation.

We are now taking the next step in the evolution of datacommons.org.

Publicly available data from open sources (e.g., census.gov, NOAA, data.gov)
are a vital resource for students and researchers in a variety of
disciplines. Unfortunately, processing these datasets is often tedious and
cumbersome. Organizations follow distinctive practices for codifying
datasets. Combining data from different sources requires mapping common
entities (city, county, etc.) and resolving different types of
keys/identifiers. This process is time-consuming and can increase the
likelihood of methodological errors.
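
To make the key-resolution problem concrete, here is a minimal Python sketch
of the manual work involved. It is only an illustration, not dataCommons
code: the two datasets, their values, the crosswalk table, and the combine()
helper are all hypothetical. One source is assumed to key counties by FIPS
code and the other by a "name, state" string.

```python
# Hypothetical source A: counties keyed by FIPS code (values are placeholders).
population_by_fips = {"06085": 1000, "06081": 500}

# Hypothetical source B: counties keyed by a "name, state" string.
incidents_by_name = {
    "Santa Clara County, CA": 20,
    "San Mateo County, CA": 5,
    "Santa Clara Co., CA": 3,  # spelling variant the crosswalk does not cover
}

# Hand-maintained crosswalk between the two identifier schemes.
name_to_fips = {
    "Santa Clara County, CA": "06085",
    "San Mateo County, CA": "06081",
}


def combine(population, incidents, crosswalk):
    """Join the two sources on a shared key; report names that fail to map."""
    merged = {}
    unmatched = []
    for name, count in incidents.items():
        fips = crosswalk.get(name)
        if fips is None or fips not in population:
            # Silently dropping these rows is one way errors creep in,
            # so collect them for review instead.
            unmatched.append(name)
            continue
        merged[fips] = {
            "name": name,
            "population": population[fips],
            "incidents": count,
        }
    return merged, unmatched


merged, unmatched = combine(population_by_fips, incidents_by_name, name_to_fips)
```

Every new source adds another crosswalk of this kind, and any name variant
missing from it ends up in the unmatched list rather than in the result.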

dataCommons attempts to synthesize a single Knowledge Graph from these
different data sources. It links references to the same entities (such as
cities, counties, organizations, etc.) across different datasets to nodes
on the graph, so that users can access data about a particular entity
aggregated from different sources. Like the Web, the dataCommons graph is
open - any user can contribute datasets or build applications powered by
the graph. In the long term, we hope the data contained within the
dataCommons graph will be useful to students and researchers across
different disciplines. Though we’ve already “jump-started” the graph with
data from publicly available sources (Wikipedia, US Census, FBI, state
election boards, etc.), we encourage you to join and contribute.

dataCommons is currently available to the academic community via Python
Notebooks. You can use the dataCommons Knowledge Graph Browser
<https://browser.datacommons.org/> to browse through the graph. The data
can be programmatically accessed via APIs. Also, check out the
tutorial/examples <https://datacommons.org/colab>.

dataCommons is built on top of Schema.org schemas. When necessary, we have
extended the vocabulary (see schema.datacommons.org). Vocabulary items that
gain traction will be submitted for inclusion into Schema.org.

dataCommons is intended to be a community effort. Get involved
<https://datacommons.org/getinvolved>!

Guha and the dataCommons team

Received on Thursday, 18 October 2018 21:20:50 UTC