- From: Guha <guha@google.com>
- Date: Thu, 18 Oct 2018 14:20:15 -0700
- To: rdf-dev@w3.org, public-schemaorg@w3.org
- Cc: Dan Brickley <danbri@google.com>, Vicki Tardif Holland <vtardif@google.com>
- Message-ID: <CAPAGhv-BJKC0QjdsSPRWzN_sa3NTjiyEBCRmEW0DRizuiOw6OQ@mail.gmail.com>
In May 2018, we introduced datacommons.org <http://datacommons.org/>, an initiative for the open sharing of data, and released the first fact check corpus to help academia and practitioners study misinformation. We are now taking the next step in the evolution of datacommons.org.

Publicly available data from open sources (e.g., census.gov, NOAA, data.gov) are a vital resource for students and researchers in a variety of disciplines. Unfortunately, processing these datasets is often tedious and cumbersome. Organizations follow distinctive practices for codifying datasets. Combining data from different sources requires mapping common entities (city, county, etc.) and resolving different types of keys/identifiers. This process is time-consuming and can increase the likelihood of methodological errors.

dataCommons attempts to synthesize a single Knowledge Graph from these different data sources. It links references to the same entities (such as cities, counties, organizations, etc.) across different datasets to nodes on the graph, so that users can access data about a particular entity aggregated from different sources.

Like the Web, the dataCommons graph is open - any user can contribute datasets or build applications powered by the graph. In the long term, we hope the data contained within the dataCommons graph will be useful to students and researchers across different disciplines. Though we’ve already “jump-started” the graph with data from publicly available sources (Wikipedia, US Census, FBI, State election boards, etc.), we encourage you to join and contribute.

dataCommons is currently available to the academic community via Python Notebooks. You can use the dataCommons Knowledge Graph Browser <https://browser.datacommons.org/> to browse through the graph. The data can be programmatically accessed via APIs. Also, check out the tutorial/examples <https://datacommons.org/colab>.

dataCommons is built on top of Schema.org schemas.
When necessary, we have extended the vocabulary (see schema.datacommons.org). Vocabulary items that gain traction will be submitted for inclusion into Schema.org.

dataCommons is intended to be a community effort. Get involved <https://datacommons.org/getinvolved>!

Guha and the dataCommons team
Received on Thursday, 18 October 2018 21:20:50 UTC