- From: Elwin Huaman <elwinlhq@gmail.com>
- Date: Mon, 19 Nov 2018 20:46:56 +0100
- To: public-schemaorg@w3.org, guha@google.com, danbri@google.com, support@datacommons.org
- Message-ID: <CABhN3myx0vaYOj0BaBoS+k=6RkfUa1BFw20HYh21jb1J8FnWVA@mail.gmail.com>
Hey all,
I was challenged last week to provide some rough numbers about the Data
Commons Knowledge Graph (DCKG), which was constructed by synthesizing many
different data sources into a single knowledge graph [1]. In particular, I
would like to know:
- *How many entities or nodes does the DCKG have?* As I understand it, a
*dcid* (DataCommons identifier) is a unique identifier assigned to each
entity in the knowledge graph, and entities are represented by nodes [2].
- *How many data sources does the DCKG have?* It currently contains data
from Wikipedia, the US Census, NOAA, the FBI, *etc.* [3].
- *How many nodes and relations does the DCKG have, and how many statements
does it have?*
- For example, the statement "Santa Clara County is contained in the
State of California" is represented in the graph as two nodes, "Santa Clara
County" and "California", with an edge labeled "containedInPlace" pointing
from Santa Clara County to California.
- *What is the current size of the vocabulary used in the DCKG?* Note that
dataCommons.org builds upon the vocabularies defined by Schema.org [4].
- *These are potential FAQs* for future researchers (of course, there are
more).
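To make the counts I am asking about concrete, here is a minimal sketch in
Python, assuming each statement is stored as a (subject, predicate, object)
triple. The names `Triple`, `graph_stats`, and `EXAMPLE` are hypothetical
and are not part of any Data Commons API; the second triple is an
illustrative assumption, not something taken from the DCKG.

```python
from typing import Dict, Iterable, NamedTuple


class Triple(NamedTuple):
    """One statement: an edge labeled `predicate` from `subject` to `obj`."""
    subject: str
    predicate: str
    obj: str


def graph_stats(triples: Iterable[Triple]) -> Dict[str, int]:
    """Count distinct nodes, statements, and edge labels in a set of triples."""
    statements = set(triples)  # identical statements are counted once
    nodes = {t.subject for t in statements} | {t.obj for t in statements}
    predicates = {t.predicate for t in statements}
    return {
        "nodes": len(nodes),
        "statements": len(statements),
        "predicates": len(predicates),
    }


EXAMPLE = [
    # The statement quoted in the list above.
    Triple("Santa Clara County", "containedInPlace", "California"),
    # Hypothetical extra statement, added only to make the counts non-trivial.
    Triple("California", "containedInPlace", "United States"),
]

if __name__ == "__main__":
    print(graph_stats(EXAMPLE))
```

Under these assumptions the two example statements involve three nodes and
one edge label, which is the kind of rough figure I am hoping to get for the
whole graph.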
Could you help me?
cheers,
Elwin Huaman
[1] https://browser.datacommons.org/
[2] https://colab.research.google.com/drive/1vffnWktZyffk7pNfpuXrTsCpp-od5W47
[3] https://datacommons.org/
[4] https://datacommons.org/faq
Received on Monday, 19 November 2018 19:47:28 UTC