- From: Elwin Huaman <elwinlhq@gmail.com>
- Date: Mon, 19 Nov 2018 20:46:56 +0100
- To: public-schemaorg@w3.org, guha@google.com, danbri@google.com, support@datacommons.org
- Message-ID: <CABhN3myx0vaYOj0BaBoS+k=6RkfUa1BFw20HYh21jb1J8FnWVA@mail.gmail.com>
Hey all,

Last week I was challenged to provide information (in rough numbers) about the Data Commons Knowledge Graph (DCKG), which was constructed by synthesizing many different data sources into a single knowledge graph [1]. In particular, I would like to know:

- *How many entities or nodes does the DCKG have?* As I understand it, a *dcid* (DataCommons identifier) is a unique identifier assigned to each entity in the knowledge graph, and entities are represented by nodes [2].
- *How many data sources does the DCKG have?* It currently contains data from Wikipedia, the US Census, NOAA, FBI, *etc.?* [3].
- *How many nodes and relations does the DCKG have?* and *How many statements does it have?*
  - For example, the statement "Santa Clara County is contained in the State of California" is represented in the graph as two nodes, "Santa Clara County" and "California", with an edge labeled "containedInPlace" pointing from Santa Clara to California.
- *What is the current size of the vocabulary used in the DCKG?* Note that dataCommons.org builds on the vocabularies defined by Schema.org [4].

*These are potential FAQs* for future researchers (of course there are more).

Could you help me?

Cheers,
Elwin Huaman

[1] https://browser.datacommons.org/
[2] https://colab.research.google.com/drive/1vffnWktZyffk7pNfpuXrTsCpp-od5W47
[3] https://datacommons.org/
[4] https://datacommons.org/faq
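
P.S. To make the distinction between nodes, relations, and statements concrete, here is a minimal sketch of how I would count them on a toy graph. It is not the Data Commons API and the numbers say nothing about the real DCKG. The function name `graph_counts` and the second triple are my own illustrative assumptions, and I am assuming that "relations" means distinct edge labels while "statements" means distinct (subject, predicate, object) triples.

```python
# Toy graph: each statement is a (subject, predicate, object) triple.
# The first triple is the example from the Data Commons description;
# the second is an assumed extra triple added only for illustration.
statements = [
    ("Santa Clara County", "containedInPlace", "California"),
    ("California", "containedInPlace", "United States"),
]


def graph_counts(triples):
    """Count distinct nodes, distinct edge labels, and distinct statements."""
    nodes = set()
    relations = set()
    for subject, predicate, obj in triples:
        nodes.update((subject, obj))  # both endpoints of the edge are nodes
        relations.add(predicate)      # the edge label is the relation
    return {
        "nodes": len(nodes),
        "relations": len(relations),
        "statements": len(set(triples)),
    }


counts = graph_counts(statements)
print(counts)
```

On this toy input the sketch reports 3 nodes, 1 relation, and 2 statements, which is the kind of rough breakdown I am hoping to get for the full graph.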
Received on Monday, 19 November 2018 19:47:28 UTC