
dataCommons.org: Data Commons Knowledge Graph (DCKG)

From: Elwin Huaman <elwinlhq@gmail.com>
Date: Mon, 19 Nov 2018 20:46:56 +0100
Message-ID: <CABhN3myx0vaYOj0BaBoS+k=6RkfUa1BFw20HYh21jb1J8FnWVA@mail.gmail.com>
To: public-schemaorg@w3.org, guha@google.com, danbri@google.com, support@datacommons.org
Hey all,

Last week I was challenged to provide rough numbers about the Data Commons
Knowledge Graph (DCKG), which was constructed by synthesizing data from many
different sources into a single knowledge graph[1]. In particular, I would
like to know:

   - *How many entities (nodes) does the DCKG have?* As I understand it, a
   *dcid* (DataCommons identifier) is a unique identifier assigned to each
   entity in the knowledge graph, and entities are represented by nodes[2].
   - *How many data sources does the DCKG have?* It currently contains data
   from Wikipedia, the US Census, NOAA, FBI, *etc.*[3]
   - *How many nodes and relations does the DCKG have, and how many
   statements does it have?*
      - For example, the statement "Santa Clara County is contained in the
      State of California" is represented in the graph as two nodes:
      "Santa Clara County" and "California" with an edge labeled
      "containedInPlace" pointing from Santa Clara to California.
   - *What is the current size of the vocabulary used in the DCKG?* I ask
   because dataCommons.org builds on the vocabularies defined by
   Schema.org[4].
   - *These are potential FAQs* for future researchers (of course there are
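
To make the counting questions above concrete, here is a minimal sketch of how nodes, statements, and vocabulary size could be tallied over a set of triples. It is only an illustration: the `Statement` type and `graph_stats` function are hypothetical names, not part of any Data Commons API, and the second triple is an illustrative addition that does not come from the DCKG.

```python
from typing import Dict, Iterable, NamedTuple


class Statement(NamedTuple):
    """One edge of the graph: subject --predicate--> obj (hypothetical type)."""
    subject: str
    predicate: str
    obj: str


def graph_stats(statements: Iterable[Statement]) -> Dict[str, int]:
    """Count distinct nodes, statements, and predicates in a set of triples."""
    unique = set(statements)  # ignore exact duplicate statements
    nodes = {s.subject for s in unique} | {s.obj for s in unique}
    predicates = {s.predicate for s in unique}
    return {
        "nodes": len(nodes),
        "statements": len(unique),
        "predicates": len(predicates),
    }


# Toy data only. The first triple is the example from this message;
# the second is an illustrative addition, not taken from the DCKG.
statements = [
    Statement("Santa Clara County", "containedInPlace", "California"),
    Statement("California", "containedInPlace", "United States"),
]

stats = graph_stats(statements)
print(stats)
```

In this toy graph, "California" appears as both a subject and an object but is counted as a single node, which is why node and statement counts can differ and why I am asking for both.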

Could you help me?

Elwin Huaman

[1] https://browser.datacommons.org/
[3] https://datacommons.org/
[4] https://datacommons.org/faq
Received on Monday, 19 November 2018 19:47:28 UTC
