Re: dataCommons.org: Data Commons Knowledge Graph (DCKG)

From: Guha <guha@google.com>
Date: Mon, 19 Nov 2018 13:34:54 -0800
Message-ID: <CAPAGhv9pj5Fn5yQ0eM6j-AN3B6WQ-SbbzJ-Erhx3_ZJzqTisPg@mail.gmail.com>
To: elwinlhq@gmail.com
Cc: public-schemaorg@w3.org, Dan Brickley <danbri@google.com>, support@datacommons.org

Elwin,

 These numbers are changing rapidly and, as we have all learnt, what really
matters is the utility of the data, not the size of the graph.

 So, could you give us some context?

guha

On Mon, Nov 19, 2018 at 11:47 AM Elwin Huaman <elwinlhq@gmail.com> wrote:

> Hey all,
>
> I was challenged last week to provide info (in rough numbers) about the
> Data Commons Knowledge Graph (DCKG), which was constructed by synthesizing
> many different data sources into a single knowledge graph [1]. What I am
> especially looking to know is:
>
>    - *How many entities or nodes does the DCKG have?*, understanding that
>    a *dcid* (DataCommons identifier) is a unique identifier assigned to each
>    entity in the knowledge graph, and that entities are represented by
>    nodes [2].
>    - *How many data sources does the DCKG have?*, since it currently
>    contains data from Wikipedia, the US Census, NOAA, FBI, *etc.* [3].
>    - *How many nodes and relations does the DCKG have?* and *How many
>    statements does it have?*
>       - For example, the statement "Santa Clara County is contained in
>       the State of California" is represented in the graph as two nodes: "Santa
>       Clara County" and "California" with an edge labeled "containedInPlace"
>       pointing from Santa Clara to California.
>    - *What is the current size of the vocabulary used in the DCKG?*,
>    taking into account that dataCommons.org builds upon the vocabularies
>    defined by Schema.org [4].
>    - *These are potential FAQs* for future researchers (of course there
>    are more)
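
For readers of the archive, the statement example quoted above (two nodes joined by a labeled edge) can be sketched as a subject/predicate/object triple, with the requested rough numbers derived by counting. This is only an illustrative sketch: `Statement` and `summarize_graph` are hypothetical names, not part of any Data Commons API, and the single statement is the one from the message, not real DCKG data.

```python
from collections import namedtuple

# Hypothetical triple type for illustration; not a Data Commons API.
Statement = namedtuple("Statement", ["subject", "predicate", "object"])

# The one statement given in the message: an edge labeled
# "containedInPlace" pointing from Santa Clara County to California.
statements = [
    Statement("Santa Clara County", "containedInPlace", "California"),
]


def summarize_graph(statements):
    """Count distinct nodes, distinct edge labels, and statements."""
    nodes = set()
    predicates = set()
    for s in statements:
        nodes.add(s.subject)
        nodes.add(s.object)
        predicates.add(s.predicate)
    return {
        "nodes": len(nodes),
        "predicates": len(predicates),
        "statements": len(statements),
    }


summary = summarize_graph(statements)
print(summary)
```

Under this reading, the node, statement, and vocabulary questions are counts over the same list of triples, which is why they can diverge: many statements may share the same nodes and the same edge labels.
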
>
> Could you help me?
>
> cheers,
> Elwin Huaman
>
>
> [1] https://browser.datacommons.org/
> [2] https://colab.research.google.com/drive/1vffnWktZyffk7pNfpuXrTsCpp-od5W47
> [3] https://datacommons.org/
> [4] https://datacommons.org/faq
>
>
Received on Monday, 19 November 2018 21:35:29 UTC
