Re: Knowledge Graphs: the book and the hypermedia from Amirouche on 2021-11-19 (public-aikr@w3.org from November 2021)

From: Amirouche <amirouche@hyper.dev>
Date: Fri, 19 Nov 2021 10:57:48 +0000
To: "public-aikr@w3.org" <public-aikr@w3.org>
Message-ID: <cwW_mDhciHNvb0En6YsA7epQQL_FSJ1zfPwqjhQIeQ_JnJ9f9djaHJpWRiPJpg44kdmfTPEkyVwub9f>

# Knowledge Graphs

html: https://kgbook.org/#access
github: https://github.com/Knowledge-Graphs-Book/HTML-Book/

## Abstract

This book provides a comprehensive and accessible introduction to knowledge graphs,
which have recently garnered notable attention from both industry and academia.

Knowledge graphs are founded on the principle of applying a graph-based abstraction
to data, and are now broadly deployed in scenarios that require integrating and extracting
value from multiple, diverse sources of data at large scale. The book defines knowledge graphs
and provides a high-level overview of how they are used. It presents and contrasts popular
graph models that are commonly used to represent data as graphs, and the languages by which
they can be queried before describing how the resulting data graph can be enhanced with notions
of schema, identity, and context. The book discusses how ontologies and rules can be used
to encode knowledge as well as how inductive techniques — based on statistics, graph analytics,
machine learning, etc. — can be used to encode and extract knowledge. It covers techniques
for the creation, enrichment, assessment, and refinement of knowledge graphs and surveys
recent open and enterprise knowledge graphs and the industries or applications within which
they have been most widely adopted. The book closes by discussing the current limitations
and future directions along which knowledge graphs are likely to evolve.

This book is aimed at students, researchers, and practitioners who wish to learn more about
knowledge graphs and how they facilitate extracting value from diverse data at large scale.
To make the book accessible for newcomers, running examples and graphical notation are used
throughout. Formal definitions and extensive references are also provided for those who opt
to delve more deeply into specific topics.

## Table of Contents

- Preface
- Acknowledgements

1. Introduction
2. Data Graphs
2.1. Models
2.2. Querying
3. Schema, Identity, Context
3.1. Schema
3.2. Identity
3.3. Context
4. Deductive Knowledge
4.1. Ontologies
4.2. Reasoning
5. Inductive Knowledge
5.1. Graph Analytics
5.2. Knowledge Graph Embeddings
5.3. Graph Neural Networks
5.4. Symbolic Learning
6. Creation and Enrichment
6.1. Human Collaboration
6.2. Text Sources
6.3. Markup Sources
6.4. Structured Sources
6.5. Schema/Ontology Creation
7. Quality Assessment
7.1. Accuracy
7.2. Coverage
7.3. Coherency
7.4. Succinctness
7.5. Other Quality Dimensions
8. Refinement
8.1. Completion
8.2. Correction
8.3. Other Refinement Tasks
9. Publication
9.1. Best Practices
9.2. Access Protocols
9.3. Usage Control
10. Knowledge Graphs in Practice
10.1. Open Knowledge Graphs
10.2. Enterprise Knowledge Graphs
11. Summary and Conclusion
- Bibliography
A. Background
A.1. Historical Perspective
A.2. “Knowledge Graphs”: Pre 2012
A.3. “Knowledge Graphs”: 2012 Onwards

## Summary and Conclusion

We have provided a comprehensive introduction to knowledge graphs, which have been receiving
more and more attention in recent years. Under the definition of a knowledge graph as a graph
of data intended to accumulate and convey knowledge of the real world, whose nodes represent
entities of interest and whose edges represent relations between these entities, we have discussed
models by which data can be structured as graphs; representations of schema, identity and context;
techniques for leveraging deductive and inductive knowledge; methods for the creation, enrichment,
quality assessment and refinement of knowledge graphs; principles and standards for publishing
knowledge graphs; and finally, the adoption of both open and enterprise knowledge
graphs in the real world.

In this final chapter, we provide some concluding remarks, and further offer some insights on potential
future directions for research on knowledge graphs.

Concluding remarks. Knowledge graphs have garnered significant attention not only from diverse organisations
and industries, but also diverse research communities. This attention is due, in no small part, to the ubiquitous
nature of the problem that knowledge graphs address: integrating and extracting value from diverse sources
of data at large scale, be it in the context of a particular organisation, community, or more general collections
of human knowledge. The key insight of knowledge graphs is that graphs provide a simple, flexible, intuitive and
yet powerful abstraction for representing and integrating diverse data at large scale. This insight is far
from new (see Appendix A); rather, it has finally come of age with the advent of knowledge graphs. Graphs have
long been used to represent data and knowledge in areas such as Graph Algorithms and Theory, Graph Databases,
Information Extraction, Knowledge Representation, Machine Learning, the Semantic Web, and more besides.
The advances in these areas can now be unified and applied for knowledge graphs.

Thus, the decision to model data as a graph opens up a “tool-box” of languages, techniques and systems
– stemming from diverse areas – that can be deployed in order to integrate and extract value from data
at large scale, as follows:

- A variety of graph query languages are now available that (unlike other NoSQL alternatives) are fully-featured,
supporting not only the relational algebra, but also novel features such as navigational queries that can match
paths of arbitrary length. A broad selection of graph databases and user interfaces supporting these query
languages are now also available.

- Though graphs do not depend on a detailed (relational-like) schema to represent data, various notions of graph
schemata have been proposed in order to validate, summarise and define the semantics of graphs.

- Contextual frameworks for graphs can be used to represent and reason about the scope of truth of knowledge in
the graph – relating to the time, space, provenance, confidence level, etc., for which something is held true
– including various alternatives for reification, annotated graph frameworks, etc.

- Deductive forms of reasoning can be enabled over graphs using ontologies and/or rules, which can not only encode
a machine-readable consensus about the meaning of the graph, but also provide automated access to implicit knowledge
entailed by a graph through materialisation or query rewriting.

- Graph algorithms, such as centrality measures, community detection, clustering, etc., can be applied on the data to
gain insights about influential entities or edges, close-knit sub-graphs of entities, and more besides, with graph
parallel frameworks capable of applying such algorithms at large scale.

- Recent and continual advances in knowledge graph embeddings and graph neural networks have now opened up new
possibilities for applying machine learning natively over graphs in the context of diverse tasks, including
classification, question answering, recommendations, and more besides.

- Rule and axiom mining techniques allow for extracting formal, declarative hypotheses from a knowledge graph
that encode high-level patterns and can be applied to derive new knowledge in a deductive, explainable manner.

- Graph-based information extraction can be applied to extract and/or enrich a knowledge graph from legacy sources
of text and semi-structured data, while graph-based mapping languages facilitate integrating diverse sources of
legacy structured data into the knowledge graph.

- Tools, techniques and methodologies for ontology engineering and ontology learning can further guide
the – potentially collaborative – creation of an ontology for the knowledge graph, encoding a consensus about
its semantics, and enabling access to implicit knowledge through deductive reasoning.

- Quality dimensions and metrics for knowledge graphs allow for systematically assessing the readiness of
the knowledge graph for its envisaged applications, in both a qualitative and quantitative manner,
where a variety of tools and frameworks are available to help perform such assessments.

- Knowledge graphs that have been integrated from diverse sources are likely to be incomplete, or to encode
incorrect data; techniques and tools for knowledge graph refinement facilitate the automated completion
and correction of such knowledge graphs, thus improving their overall quality and usefulness.

- For the purposes of publishing open knowledge graphs, principles, best practices and access protocols,
as well as techniques for linking, licensing, access and usage control, encryption and anonymisation,
can be leveraged to maximise their potential impact on society in an ethical way.
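
Two items from the tool-box above, navigational queries over paths of arbitrary length and materialisation of implicit knowledge, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy triples, the predicate names `type` and `subclass_of`, and the helper functions `reachable` and `materialise` are hypothetical and are not taken from the book or from any particular graph database.

```python
from collections import deque

# Hypothetical toy data graph, stored as (subject, predicate, object) triples.
GRAPH = {
    ("Santiago", "flight", "Arica"),
    ("Arica", "bus", "Iquique"),
    ("Iquique", "bus", "Calama"),
    ("EID15", "type", "FoodFestival"),
    ("FoodFestival", "subclass_of", "Festival"),
    ("Festival", "subclass_of", "Event"),
}


def reachable(triples, start, labels):
    """Navigational query: nodes reachable from `start` via a path of one
    or more edges whose label is in `labels` (breadth-first search)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for s, p, o in triples:
            if s == node and p in labels and o not in seen:
                seen.add(o)
                queue.append(o)
    return seen


def materialise(triples):
    """Apply two rules until a fixpoint is reached:
    (x, type, c), (c, subclass_of, d)        -> (x, type, d)
    (c, subclass_of, d), (d, subclass_of, e) -> (c, subclass_of, e)
    """
    closed = set(triples)
    while True:
        new = set()
        for s, p, o in closed:
            if p not in ("type", "subclass_of"):
                continue
            for s2, p2, o2 in closed:
                if s2 == o and p2 == "subclass_of":
                    new.add((s, p, o2))
        if new <= closed:  # nothing new was derived: fixpoint reached
            return closed
        closed |= new


if __name__ == "__main__":
    print(sorted(reachable(GRAPH, "Santiago", {"flight", "bus"})))
    print(("EID15", "type", "Event") in materialise(GRAPH))
```

The nested loops keep the sketch short; a real system would index the triples and, as the bullet on deductive reasoning notes, might use query rewriting instead of materialisation.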

As we have discussed in Chapter 10, the various components of this “knowledge graph tool-box” can already
be found deployed in practice, having been applied – to varying degrees – in the context of numerous open
and enterprise knowledge graphs. As adoption of knowledge graphs continues, work will also continue
on improving and combining these tools, as well as on developing novel tools that help to better integrate
and extract value from diverse sources of data at large scale.
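
As one further illustration of the tool-box, the sketch below computes a centrality measure over a small directed graph. It uses PageRank by power iteration as the example measure; that choice, the damping value, the iteration count and the toy edge list are all assumptions made here, and the text above does not prescribe a specific algorithm.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Approximate PageRank scores for a directed graph given as a list of
    (source, target) pairs, using plain power iteration."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [t for s, t in edges if s == n] for n in nodes}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        new_rank = {}
        for n in nodes:
            incoming = sum(
                rank[s] / len(out_links[s]) for s, t in edges if t == n
            )
            new_rank[n] = (1 - damping) / len(nodes) + damping * (
                incoming + dangling / len(nodes)
            )
        rank = new_rank
    return rank


# Hypothetical toy graph: three nodes point at "C", and "C" points at "A".
EDGES = [("A", "C"), ("B", "C"), ("D", "C"), ("C", "A")]

if __name__ == "__main__":
    scores = pagerank(EDGES)
    for node in sorted(scores, key=scores.get, reverse=True):
        print(node, round(scores[node], 3))
```

On this toy input the node with the most incoming edges, `C`, receives the highest score. Graph-parallel frameworks apply the same kind of iterative computation to graphs far too large for a single in-memory loop like this one.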

Future directions. Research on knowledge graphs involves a confluence of techniques from different research
areas with the common objective of maximising the knowledge – and thus value – that can be distilled
from diverse sources at large scale using a graph-based data abstraction [Hogan, 2020a].

In the intersection of data graphs and deductive knowledge, we emphasise emerging topics such as formal
semantics for property graphs, with languages that can take into account the meaning of labels and
property–value pairs on nodes and edges [Krötzsch et al., 2018]; and reasoning and querying over contextual
data, in order to derive conclusions and results valid in a particular setting
[Serafini and Homola, 2012, Zimmermann et al., 2012, Schuetz et al., 2021]. In the intersection of data graphs
and inductive knowledge, we highlight topics such as similarity-based query relaxation, which finds
approximate answers to exact queries based on numerical representations (e.g., embeddings)
[Wang et al., 2018]; shape induction, in order to learn and formalise inherent patterns in the knowledge graph
as constraints [Mihindukulasooriya et al., 2018]; and contextual knowledge graph embeddings that provide
numeric representations of nodes and edges that vary with time, place, etc. [Kazemi et al., 2019].
In the intersection of deductive and inductive knowledge, we mention the topics of entailment-aware knowledge graph
embeddings [Guo et al., 2016, Demeester et al., 2016], that incorporate rules and/or ontologies when computing
plausibility; expressive graph neural networks proven capable of complex classification analogous to expressive
ontology languages [Barceló et al., 2020]; as well as further advances in rule and axiom mining, which allow
symbolic, deductive representations to be extracted from knowledge graphs [Galárraga et al., 2015, Bühmann et al., 2016].
Further challenges arise when considering the creation, enrichment, refinement and publication of knowledge graphs,
which call for further work on topics such as automated quality assessment (and repair), distantly-supervised extraction
frameworks, efficient access protocols, and anonymisation, to name but a few.
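
To make the idea of similarity-based query relaxation more concrete, here is a minimal Python sketch. It is not the method of any cited paper. The function `relaxed_query`, the predicate `held_in`, the entity names and the two-dimensional vectors are all hypothetical, and in practice the vectors would come from a trained knowledge graph embedding rather than being written by hand.

```python
import math

# Hypothetical toy graph and hand-written "embeddings", for illustration only.
TRIPLES = {
    ("ConcertA", "held_in", "Valparaiso"),
    ("FestivalB", "held_in", "Arica"),
}
EMBEDDINGS = {
    "Valparaiso": [0.9, 0.1],
    "Vina del Mar": [0.8, 0.2],
    "Arica": [0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def relaxed_query(triples, embeddings, predicate, obj, k=1):
    """Return (subjects, object_used) for the pattern (?s, predicate, obj).
    If there is no exact match, retry with the k entities whose embeddings
    are most similar to that of `obj`."""
    exact = sorted(s for s, p, o in triples if p == predicate and o == obj)
    if exact:
        return exact, obj
    ranked = sorted(
        (e for e in embeddings if e != obj),
        key=lambda e: cosine(embeddings[obj], embeddings[e]),
        reverse=True,
    )
    for candidate in ranked[:k]:
        answers = sorted(
            s for s, p, o in triples if p == predicate and o == candidate
        )
        if answers:
            return answers, candidate
    return [], None


if __name__ == "__main__":
    print(relaxed_query(TRIPLES, EMBEDDINGS, "held_in", "Vina del Mar"))
```

The exact query for events held in "Vina del Mar" has no answer in the toy graph, so the sketch falls back to the most similar entity and reports which substitution it made. Reporting the substitution keeps the approximation visible to the caller.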

Aside from specific topics, more general challenges for knowledge graphs include scalability, particularly for deductive
and inductive reasoning; quality, not only in terms of data, but also the models induced from knowledge graphs; diversity,
such as managing contextual or multi-modal data; dynamicity, considering temporal or streaming data; and finally usability,
which is key to increasing adoption. Though techniques are continuously being proposed to address these challenges, they are
unlikely to ever be completely “solved”; rather they serve as dimensions along which knowledge graphs, and their techniques,
tools, etc., will continue to mature.

Given the availability of open knowledge graphs whose quality continues to improve, as well as the growing adoption of
enterprise knowledge graphs in various industries, future research on knowledge graphs has the potential to foster
key advancements in broad aspects of society. Here we have highlighted just some examples of future research directions
of importance to this pursuit.

Received on Friday, 19 November 2021 10:58:08 UTC