Re: Trip Reports on Dagstuhl Seminar on Knowledge Graphs from Joshua Shinavier on 2019-08-28 (semantic-web@w3.org from August 2019)

From: Joshua Shinavier <joshsh@uber.com>
Date: Wed, 28 Aug 2019 07:14:23 -0700
To: Paola Di Maio <paoladimaio10@gmail.com>
Cc: Juan Sequeda <juanfederico@gmail.com>, Semantic Web <semantic-web@w3.org>
Message-ID: <CAPc0Ouu7P8whjcYKRhw+AvYz3jjWP2df_5SiX-pV-sVK_ZLWEQ@mail.gmail.com>
Hi Paola,

OK; I look forward to a more detailed argument in your article. So far, I
have only skimmed the paper you linked, but I can see that -- apart from
the fact that it is a little dated and does not mention currently popular
graph embedding techniques such as GraphSAGE (usual disclaimer: I am no
expert in embeddings) -- the criticism applies at best to one relatively
inessential and separable aspect of enterprise knowledge graphs. W.r.t.
information extraction, I can tell you from experience that dealing with
unreliable or incomplete data, while an inevitable fact of life, is not
necessarily a problem one should attempt to solve at the KG level. At least
9 times out of 10, the problem is better addressed at the level of
individual data sources, where the solutions are very domain-specific.

"Knowledge graph" may be a marketing term, but IMO it represents a shift
away from pure research and toward technologies that scale well and which
serve real-world needs, as Steffen mentioned. This is a good thing; it
means that KR is succeeding, even if it is doing so in unanticipated ways.
It is important to acknowledge the rise of lightweight KR (if I may use
that term) in the developer community via data models such as property
graphs which dispense with formal semantics altogether, and I think it is
also telling that many of the large-scale corporate knowledge graphs, at
their core, are not based on either RDF or property graphs, but on
special-purpose data models which have been designed in-house. I will tell
you about ours (Uber's) in a paper currently in internal review. Last week,
I had a chance to ask Xiao Ling (Apple) and Scott Meyer (LinkedIn) about
theirs. For Siri's knowledge base, Apple is using an RDF-like data model
(supporting "triples" with "qualifiers" that enable reification), but not
RDF proper. For the Economic Graph, LinkedIn is using a Datalog-based data
model which again is based on triples, but not on RDF or PG. This tells me
that the standards built for knowledge representation on the Web are being
used not so much for their associated formal properties, but as a means of
data interchange -- a point that was made, and which really stood out to me
in Paul Groth's trip report.
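
To make "triples with qualifiers" a bit more concrete, here is a minimal sketch in Python. It is purely illustrative and says nothing about Apple's or LinkedIn's actual schemas; the names QualifiedTriple and reify, and the sample data, are hypothetical. It only shows how a qualified statement can be expanded into plain triples about the statement itself, which is the basic idea behind reification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class QualifiedTriple:
    """A (subject, predicate, object) statement plus key/value qualifiers.

    Hypothetical structure for illustration only.
    """
    subject: str
    predicate: str
    obj: str
    qualifiers: Tuple[Tuple[str, str], ...] = ()


def reify(triple: QualifiedTriple, statement_id: str) -> List[Tuple[str, str, str]]:
    """Expand a qualified triple into plain triples about the statement.

    The "subject"/"predicate"/"object" property names loosely echo the RDF
    reification vocabulary; they are plain strings here, not real IRIs.
    """
    plain = [
        (statement_id, "subject", triple.subject),
        (statement_id, "predicate", triple.predicate),
        (statement_id, "object", triple.obj),
    ]
    # Each qualifier becomes one more triple hanging off the statement node.
    plain.extend((statement_id, key, value) for key, value in triple.qualifiers)
    return plain


# Made-up example data.
statement = QualifiedTriple("alice", "worksAt", "acme", (("since", "2019"),))
for row in reify(statement, "stmt1"):
    print(row)
```

The point of the sketch is only that qualifiers attach to the statement rather than to the subject or object, which is what plain RDF triples cannot express without reification.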

tl;dr plenty of things appear to have been said at the seminar which are
more actionable than much of the established theory around KR and SW. At
the same time, I believe there is a tendency now to look back at SW and
earlier work and attempt to learn from it, adding more formality around
ontologies, inference, and rules where it makes sense to do so.

Josh



On Wed, Aug 28, 2019 at 12:18 AM Paola Di Maio <paoladimaio10@gmail.com>
wrote:

> Joshua
>
> thanks for the opportunity to clarify and apologies for the brashness
> of my remarks
>
> I did not mean that KGs are not a type of KR, which arguably they are
>
> but they do not satisfy KR adequacy criteria in many ways (I'll address
> that more extensively
> in an article) and come with limitations, an example linked below
>
> The lack of acknowledgment of such limitations is *startling* for me,
> and shows superficiality, given that the workshop participants are leading
> researchers and colleagues, and include the best of the SW researchers crop
> otherwise in many ways
>
>
> PDM
>
> this article explains some of the issues with KG, and especially using
> KGs as sole KR methods
>
> https://www.aclweb.org/anthology/D17-1184
>
>   Unfortunately, information extraction approaches for KG construction
> must overcome complex, unreliable, and incomplete data. Many machine
> learning methods have been proposed to address the challenge of cleaning
> and completing KGs. One popular class of methods learn embeddings that
> translate entities and relationships into a latent subspace, then use this
> latent representation to derive additional, unobserved facts and score
> existing facts (Bordes et al., 2013; Wang et al., 2014; Lin et al., 2015)
>
>
>
> On Wed, Aug 28, 2019 at 2:26 PM Joshua Shinavier <joshsh@uber.com> wrote:
>
>> Maybe I need to read some of the past threads for context, but this
>> dismissive statement took me by surprise. In what way are KGs not KR? If
>> that were true, it would deeply affect my own outlook and messaging. I
>> ought to at least try to understand your point of view. Are you referring
>> to some very limited and traditional definition of KR? Insofar as an RDF
>> statement is a claim about the world
>> <https://www.w3.org/TR/rdf11-concepts/>,
>> the humblest RDF graph is a representation of knowledge. So...
>>
>> My $0.02 is that KG is a particular, typically simple and pragmatic form
>> of KR by a new name -- a pretty uncontroversial point of view, I would have
>> thought. Not looking for a debate, just clarification.
>>
>> FWIW, I was not involved in the Dagstuhl event, but really appreciated
>> the trip reports.
>>
>> Josh
>>
>>
>>
>> On Tue, Aug 27, 2019 at 11:07 PM Paola Di Maio <paola.dimaio@gmail.com>
>> wrote:
>>
>>> Juan and all
>>>
>>> I finally got hold of the report, courtesy of Alex P
>>> http://aic.ai.wu.ac.at/~polleres/publications/bona-etal-DagstuhlReport18371.pdf
>>>
>>> As a scholar in KR, I am concerned at the suggestion that KG are being
>>> proposed
>>> as KR, and at the superficiality of the content of this report, and I
>>> am aggravated to note the complete lack of acknowledgement of the
>>> limitations of this approach.
>>>
>>> Sounds like a good example of ineptitude, inadequacy and corruption
>>> heavily influencing academic research and the field of AI KR
>>>
>>> *two cents still allowed?
>>>
>>> PDM
>>>
>>>
>>>
>>> On Thu, Sep 20, 2018 at 6:41 AM Juan Sequeda <juanfederico@gmail.com>
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Last week there was a Dagstuhl seminar on: Knowledge Graphs: New
>>>> Directions for Knowledge Representation on the Semantic Web
>>>> https://www.dagstuhl.de/en/program/calendar/semhp/?semnr=18371
>>>>
>>>> A formal report will be coming out soon. In the meantime, some folks
>>>> have written their own reports. I'm sure folks in this community would be
>>>> interested:
>>>>
>>>> Eva Blomqvist:
>>>> http://blog.liu.se/semanticweb/2018/09/15/dagstuhl-seminar-on-knowledge-graphs/
>>>> Paul Groth:
>>>> https://thinklinks.wordpress.com/2018/09/18/trip-report-dagstuhl-seminar-on-knowledge-graphs/
>>>> Juan Sequeda:
>>>> http://www.juansequeda.com/blog/2018/09/18/trip-report-on-knowledge-graph-dagstuhl-seminar/
>>>>
>>>> Cheers
>>>>
>>>> Juan
>>>>
>>>> --
>>>> Juan Sequeda, Ph.D
>>>> www.juansequeda.com
>>>>
>>>
Received on Wednesday, 28 August 2019 14:14:59 UTC