Re: What is a Knowledge Graph? CORRECTION from Chris Harding on 2019-06-16 (semantic-web@w3.org from June 2019)

From: Chris Harding <chris@lacibus.net>
Date: Sun, 16 Jun 2019 17:08:23 +0100
To: semantic-web <semantic-web@w3.org>
CC: Paola Di Maio <paoladimaio10@gmail.com>, xyzscy <1047571207@qq.com>
Message-ID: <5D066977.60203@lacibus.net>
Thanks, Pat, Bradwell, Simon, Dieter, Martynas, Marco, and Dave for your 
remarks!

I admit to using some terms rather loosely in my mail. Part of the 
difficulty that we have is that we don't yet have a well-understood 
theory to support use of words like "concept" and "meaning".

We are actually on more solid ground with "graph". Graph theory is 
established in mathematics. I don't claim to be an expert in that 
theory, but am trying to use the term "graph" in accordance with it, and 
am happy to defer to anyone who is an expert and feels that my use of 
the term is incorrect.

Let me try to express my thought without using "concept" or "meaning". 
There is a practice, currently growing in popularity, of creating a 
graph from a set of data (often including elements expressed in human 
language), and using that graph to derive another set of data, again 
often including language elements. The derived set of data is often 
intended to influence human decisions, perhaps by being presented as an 
analysis, perhaps by being presented as an explicit recommendation. A 
knowledge graph is a graph used in this way.

For example, I just visited amazon.co.uk and saw, "We think you'll enjoy 
'Becoming' by Michelle Obama". This may have been produced using a 
knowledge graph derived from data about my previous visits to the 
website. At any rate, this is the kind of thing that proponents of 
knowledge graphs claim to be able to do.

I don't believe there is a single accepted way of creating a knowledge 
graph or of deriving data from a knowledge graph. Rather, there is a 
toolkit from which users of knowledge graphs ("knowledge engineers"?) 
can select the appropriate tools for particular purposes. Taking nouns 
as nodes and verbs as edges is certainly one approach. It can be made 
more sophisticated by using thesauri to create other edges between 
nouns. Another approach that I think is used is to ignore the 
distinction between nouns and verbs and put an edge between two word 
nodes if they occur in the same sentence. Nodes and edges can be given 
attributes. For example, an edge between two words might have an 
attribute that indicates the context in which they are considered similar.
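
As a rough illustration of the same-sentence approach described above, here is a minimal Python sketch. The function name cooccurrence_graph, the naive sentence splitting, and the "count" edge attribute are assumptions made for the example; they are not taken from any particular tool.

```python
import re
from collections import defaultdict
from itertools import combinations


def cooccurrence_graph(text):
    """Map each unordered word pair to an attribute dict for its edge."""
    edges = defaultdict(lambda: {"count": 0})
    # Naive sentence split on '.', '!' and '?'; a real tool would
    # use a proper tokenizer.
    for sentence in re.split(r"[.!?]+", text):
        # Distinct lower-cased words in this sentence become nodes.
        words = sorted(set(re.findall(r"[a-z]+", sentence.lower())))
        # One undirected edge per pair of words sharing the sentence.
        for a, b in combinations(words, 2):
            edges[frozenset((a, b))]["count"] += 1
    return dict(edges)


graph = cooccurrence_graph("Alice reads books. Alice writes books.")
```

Here the edge between "alice" and "books" ends up with a count of 2, because the two words share both sentences, while "reads" and "writes" never share a sentence and so get no edge. The count is one simple example of an edge attribute.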

These graphs can be represented using triples, and I believe that this 
is the way the current tools often do represent them. The tools may 
include some processing based on the propositional or predicate 
calculus. Perhaps they are less sophisticated in this respect than the 
semantic nets of the 70s. They may also include processing that uses 
methods such as statistical analysis and trained neural networks, and I 
stand to be corrected but don't believe these were commonly used in the 
earlier work.
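
To make the triple representation concrete, the following is a minimal sketch, assuming a graph held as a Python set of (subject, predicate, object) tuples. The predicate names and the neighbours and infer_transitive helpers are hypothetical. The single rule applied stands in for the kind of logic-based processing mentioned above, not for how any particular tool works.

```python
# A small graph written as (subject, predicate, object) triples.
# The node and predicate names are illustrative only.
triples = {
    ("becoming", "written_by", "michelle_obama"),
    ("becoming", "is_a", "memoir"),
    ("memoir", "is_a", "book"),
}


def neighbours(triples, node):
    """Outgoing edges of a node, as (predicate, object) pairs."""
    return {(p, o) for s, p, o in triples if s == node}


def infer_transitive(triples, predicate):
    """Apply the rule (a p b), (b p c) => (a p c) until nothing new appears."""
    closed = set(triples)
    while True:
        new = {
            (a, predicate, d)
            for a, p1, b in closed if p1 == predicate
            for c, p2, d in closed if p2 == predicate and b == c
        }
        if new <= closed:  # fixed point reached
            return closed
        closed |= new


inferred = infer_transitive(triples, "is_a")
```

After the call, inferred also contains ("becoming", "is_a", "book"), a triple that was derived from the graph rather than stated in it.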

So it looks as though these tools do, in Dave's words, blend symbolic 
and statistical approaches, and they can also use machine learning. 
Whether "knowledge graph" is the paradigm that he is looking for, I'm 
not sure.

Patrick J Hayes wrote:
> The idea of representing, or at least displaying, knowledge as a 
> graphical diagram (rather than as, say, a set of sentences) has a very 
> old history. In its modern sense it goes back at least to 1885 (C S 
> Peirce “existential graph”) and can probably be traced into medieval 
> writings and earlier. (The Torah version of Genesis refers to a "tree 
> of knowledge".) It has been re-invented or rediscovered many times 
> since, and seems to blossom in public (or at least academic) 
> discussions with a periodicity of roughly 40 years.
>
> The appeal of this idea seems to lie, in part, in the way that it 
> makes vivid the insight that all knowledge is ‘connected’, in a way 
> that thinking of knowledge as made up of separate sentences or 
> ‘propositions’ fails to acknowledge. In the 1970-1980 revival 
> (surrounding the term ‘semantic net’) there was also a widespread 
> notion that graphs or networks were more inherently ‘graphical’ or 
> ‘diagrammatic’ in nature (as contrasted with the purely ‘symbolic’ 
> nature of sentences), so an echo of the left/right-brain idea was 
> added to the intellectual soup. Fortunately, this has now been largely 
> forgotten.
>
> None of this makes any actual sense, of course, since the connectivity 
> of the graph or network happens entirely through the co-occurrence of 
> names in the various sentences that make up the parts of the ‘graph’. 
> So a set of sentences (in RDF and current systems, very simple 
> sentences comprising a single triple) is inherently ‘connected’ via 
> the fact that sentences use the same names as other sentences. But as 
> this is the only kind of connection that the graph/network notations 
> can encode, the graph ‘structure’ does not add anything at all to the 
> expressiveness of the notation. It is simply a decorative way to write 
> a bunch of sentences on a surface.
>
> The RDF standard acknowledged this insight by /defining/ an “RDF 
> graph” to simply be a set of RDF triples, thereby keeping the 
> ‘diagram’ terminology while allowing implementations to happily ignore 
> it and use whatever storage and display techniques they like for large 
> ‘graphs’ (typically, hash stores using quads). So “graph” now, in the 
> post-RDF usage (which includes the term ‘knowledge graph’, which 
> simply refers to Google’s way of using RDF without having to strictly 
> conform to the RDF specifications) has come full circle: it no longer 
> means a graphical diagram or a network, but serves simply as a handy 
> word to refer to a chunk of structured knowledge, represented as 
> triples. It basically restricts the form of sentences, nothing more.
>
> Hope this helps.
>
> Pat
>
>
>
>> On Jun 15, 2019, at 11:06 AM, Bradwell (US), Prachant 
>> <prachant.bradwell@boeing.com <mailto:prachant.bradwell@boeing.com>> 
>> wrote:
>>
>> Through this conversation, it seems to me that the term “graph” is a 
>> confusion point. Might there be a better term to explain this to the 
>> layman?
>>
>> It is entirely possible that I need a history lesson on this too :)
>>
>> On Jun 15, 2019, at 11:02 AM, Patrick J Hayes <phayes@ihmc.us 
>> <mailto:phayes@ihmc.us>> wrote:
>>
>>> Chris, a few remarks.
>>>
>>> 1. Although obviously a node-edge-node is a triple, so any 
>>> (directed, labeled) graph can be treated as a set of triples, not 
>>> all sets of triples can be drawn as a graphical diagram. RDF graphs 
>>> (= sets of RDF triples, by definition) for example can have the same 
>>> label used as both a node and arc label, possibly even in the same 
>>> triple. I would suggest treating the word “graph” here as a handy 
>>> way to describe triple-sets and leave it at that.
>>>
>>> 2. Being ‘thought of as’ something can hardly be used as a 
>>> definition. I can think of a pile of grey rags as an elephant, but 
>>> that doesn't make it actually be anything.
>>>
>>> 3. To speak of ‘concepts’ and ‘nodes representing’ them is getting 
>>> very blurry indeed with semantics, to the point where one loses 
>>> meaning altogether. Most nodes in most K. graphs do what names in 
>>> Krep notations usually do: they /denote/ /things/ (‘entities’ if you 
>>> like). After working in the semantics area for most of my career, I 
>>> have no clear idea what ‘concepts’ are, but if the concept of, say, 
>>> Paris is anything other than a city, then a node with the label 
>>> “Paris”, intended to name the capital of France, does NOT represent 
>>> a concept.
>>>
>>> 4. The notion of higher-dimensional triple is new. (Did you mean 
>>> ‘higher-order’?) And can you illustrate this technique of 
>>> real-valued vectors to encode them?
>>>
>>> 5. The semantic nets of the 1970s were, almost universally, /much/ 
>>> more expressive than knowledge graphs or RDF, or any of the other 
>>> ‘graph’-like modern notations. They typically had ways of encoding 
>>> quantifier scopes, disjunction, negation and sometimes such things 
>>> as modal operators. The grandfather of them all, C. S. Peirce’s 
>>> ‘existential graphs’ had the full expressivity of first-order logic 
>>> in 1885 (implemented as ‘conceptual graphs’ by John Sowa about 90 
>>> years later http://www.jfsowa.com/cg/cgonto.htm). It has been 
>>> downhill from there.
>>>
>>> Pat Hayes
>>>
>>>> On Jun 14, 2019, at 1:31 PM, Chris Harding <chris@lacibus.net 
>>>> <mailto:chris@lacibus.net>> wrote:
>>>>
>>>> Hi, Paola -
>>>>
>>>> Interesting question! I think that graphs relate particularly to 
>>>> triples because node-edge-node can be represented as a triple, so a 
>>>> collection of triples describes a graph.
>>>>
>>>> So "a collection of triples to which someone attaches meaning" 
>>>> doesn't quite capture it. Maybe "a collection of triples to which 
>>>> someone attaches meaning and which is thought of as a graph, with 
>>>> the nodes representing concepts and the edges representing 
>>>> meaningful connections between them" would come closer?
>>>>
>>>> Higher-dimension tuples can come in as embedded vectors - tuples of 
>>>> real numbers that can be associated with nodes or edges of the 
>>>> knowledge graph to convey attribute values. There appear to be 
>>>> various techniques for producing these, including AI. I think it is 
>>>> these techniques that take us beyond "the good old semantic nets of 
>>>> the 70ies" - although scale is important too.
>>>>
>>>> Paola Di Maio wrote:
>>>>> Chris
>>>>> A KG can also be any n-tuple, can't it?
>>>>>
>>>>> On Thu, Jun 13, 2019 at 6:21 PM Chris Harding <chris@lacibus.net 
>>>>> <mailto:chris@lacibus.net>> wrote:
>>>>>
>>>>>     I should have said that it is a collection of triples to which
>>>>>     someone attaches meaning. The triples might or might not be in
>>>>>     a triple store.
>>>>>
>>>>>     Chris Harding wrote:
>>>>>>     What is a knowledge graph?
>>>>>>
>>>>>>     I looked it up in Wikipedia, and the definition seemed to be
>>>>>>     "What Google does". Reading a bit more widely, I came to the
>>>>>>     conclusion that it is a triple store to which someone
>>>>>>     attaches meaning. (Of course, this is most, if not all,
>>>>>>     triple stores.) What is interesting is the impressive amount
>>>>>>     of theory and practice, associated with the "knowledge graph"
>>>>>>     label, for using AI and other techniques to obtain
>>>>>>     transformations or measurements of the triple stores that add
>>>>>>     to the meaning that people attach to them.
>>>>>>
>>>>>>     I found these articles helpful:
>>>>>>     http://ceur-ws.org/Vol-2322/dsi4-6.pdf
>>>>>>     https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
>>>>>>     https://content.iospress.com/articles/data-science/ds007
>>>>>>
>>>>>>     xyzscy wrote:
>>>>>>>     Thank you for your response. I think the term KG was spread
>>>>>>>     by Google, though I don't know how Google implements it. I
>>>>>>>     used to think the semantic network was the key technology
>>>>>>>     behind KG, but Google has never stated that.
>>>>>>>>     On 13 Jun 2019, at 2:46 PM, Paola Di Maio
>>>>>>>>     <paola.dimaio@gmail.com <mailto:paola.dimaio@gmail.com>> wrote:
>>>>>>>>
>>>>>>>>     Thank you for asking this,
>>>>>>>>
>>>>>>>>     I'll leave it to the experts to reply on scalability and
>>>>>>>>     the other questions.
>>>>>>>>
>>>>>>>>     In general, much depends on the language one uses, which in
>>>>>>>>     turn depends on the domain (which planet you come from).
>>>>>>>>
>>>>>>>>     When I first studied knowledge engineering, the expression
>>>>>>>>     "knowledge graph" was not in use at all. I was doing an MSc
>>>>>>>>     and studied the body of knowledge from the ESPRIT project
>>>>>>>>     (some folks on this list worked on it):
>>>>>>>>     https://pdfs.semanticscholar.org/193e/b66909b0c87d5dbcdbd6b20d78ed93fc95a7.pdf
>>>>>>>>
>>>>>>>>
>>>>>>>>     I'd be curious to learn when the term "knowledge graph"
>>>>>>>>     came into use and who coined it.
>>>>>>>>
>>>>>>>>     I then heard it in relation to the SW and this list, and
>>>>>>>>     always tried to figure out what exactly a KG is (in relation
>>>>>>>>     to the wider Knowledge Representation domain I was studying).
>>>>>>>>
>>>>>>>>     Knowledge graphs are a type of knowledge representation,
>>>>>>>>     and they can be visualized graphically or represented using
>>>>>>>>     algebra (again, it depends on what planet you are on).
>>>>>>>>     Engineers tend to use diagrams; others tend to use algebra.
>>>>>>>>
>>>>>>>>     More importantly, they enable machine readability, querying
>>>>>>>>     and computational manipulation of complex (combined) data
>>>>>>>>     sets, assuming knowledge is some kind of data in context,
>>>>>>>>     as some say. I don't use the term "knowledge graph" much
>>>>>>>>     either. Let's see if the KG folks can offer more info.
>>>>>>>>
>>>>>>>>     PDM
>>>>>>>>     Knowledge Graph Representation
>>>>>>>>     *Knowledge graphs* provide a unified format for
>>>>>>>>     representing *knowledge* about relationships between
>>>>>>>>     entities. A *knowledge graph* is a collection of triples,
>>>>>>>>     with each triple (h,t,r) denoting the fact that relation r
>>>>>>>>     exists between head entity h and tail entity t.
>>>>>>>>     http://ceur-ws.org/Vol-2322/dsi4-6.pdf
>>>>>>>>
>>>>>>>>     On Thu, Jun 13, 2019 at 1:40 PM 我 <1047571207@qq.com
>>>>>>>>     <mailto:1047571207@qq.com>> wrote:
>>>>>>>>
>>>>>>>>         Dear all:
>>>>>>>>
>>>>>>>>         When I first encountered knowledge graphs, I was very
>>>>>>>>         confused. Unlike other AI theory, a knowledge graph is
>>>>>>>>         not a pattern recognition algorithm that gives some
>>>>>>>>         "output" for some "input" (such as a classification
>>>>>>>>         algorithm), but rather a programming language (such as
>>>>>>>>         OWL or RDF) and a database (such as Neo4j). So in my
>>>>>>>>         opinion, a knowledge graph is more an engineering
>>>>>>>>         problem than a mathematical theory.
>>>>>>>>
>>>>>>>>         Then I realized that, unlike a pattern recognition
>>>>>>>>         algorithm, the knowledge graph was created to let
>>>>>>>>         computers all over the world communicate with each
>>>>>>>>         other in a common language, and I have a question: is
>>>>>>>>         scalability the key property of a knowledge graph?
>>>>>>>>
>>>>>>>>         There are many knowledge vaults written in different
>>>>>>>>         languages (such as OWL and RDF), but it is always hard
>>>>>>>>         to merge them, and there is no standard knowledge vault
>>>>>>>>         on which we can do advanced development. So is it
>>>>>>>>         necessary to open a scalable, standard knowledge vault
>>>>>>>>         that everyone can keep extending and improving, just
>>>>>>>>         like the Linux kernel or Wikipedia? What kind of
>>>>>>>>         knowledge should the standard knowledge vault contain
>>>>>>>>         so that it can be universal? I imagine the standard
>>>>>>>>         knowledge vault as an original that every other
>>>>>>>>         application copies, so that all applications can
>>>>>>>>         communicate on the basis of the same common sense. For
>>>>>>>>         example, when one application declares "night", all the
>>>>>>>>         other applications will know that it is dark.
>>>>>>>>
>>>>>>>>         As far as I know, a knowledge graph is implemented as a
>>>>>>>>         query service, but is it possible to implement it as a
>>>>>>>>         programming language, like C++ or Java? In this way, a
>>>>>>>>         computer could directly understand natural language,
>>>>>>>>         humans could communicate with computers in natural
>>>>>>>>         language, and one computer could communicate with
>>>>>>>>         another in natural language.
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>     -- 
>>>>>>     Regards
>>>>>>
>>>>>>     Chris
>>>>>>     ++++
>>>>>>
>>>>>>     Chief Executive, Lacibus <https://lacibus.net/> Ltd
>>>>>>     chris@lacibus.net <mailto:chris@lacibus.net>
>>>>>>
>>>>>
>>>>
>>>
>

-- 
Regards

Chris
++++

Chief Executive, Lacibus <https://lacibus.net> Ltd
chris@lacibus.net
Received on Sunday, 16 June 2019 16:08:54 UTC