Re: New KBpedia and KBAI

Thank you, Mike, for the reply.

Yes, the big questions, because of an innate desire to tackle big problems
...

Regarding fact checking, well, it's necessary in reasoning.
Many of the facts upon which reasoning is based (including human
reasoning, not only AI) may be false or arguable.

We need to include fact checking in truth preservation, logic, etc.
P

On Thu, Jun 18, 2020 at 8:58 AM Mike Bergman <mike@mkbergman.com> wrote:

> Hi Paola,
>
> You always ask the big questions. ;) So, I will try to limit my response
> to big answers.
>
> As for Wikipedia, I know/suspect there is bias and falsity in some of the
> information. I see little of it directly, more in terms of errors of
> omission or viewpoint, rather than direct falsities. My suspicion is the
> actual percentage of unreliable information is quite low, though the
> information may still be incomplete. One point worth making has to do with
> so-called 'gold standards' that are essential to all science-based
> assessments, particularly with regard to human language or knowledge.
> Studies often see interannotator agreements in the 75-80% range, and only
> very widely used standards (like WordNet or various language corpora) get
> to agreements in the 90-95% range. This is an actual error term, so when
> one sees F1 stats or similar, perhaps of 80% or whatever, you need to
> discount that score by the interannotator agreement percentage (that is,
> multiply the two). Many tests claiming 85-90% scores for NLP are actually
> closer to 64% to 85% once we adjust for interannotator agreement. Is 35%
> to 15% of information on Wikipedia bad??
>
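The figures quoted here are consistent with a simple multiplicative adjustment. A minimal sketch, assuming the adjustment is just the reported score times the interannotator agreement rate (that reading is inferred from the numbers in the message, not a formula Mike states; the function name is hypothetical):

```python
def adjusted_score(reported: float, agreement: float) -> float:
    """Scale a reported score (e.g. F1) by the interannotator
    agreement of the gold standard it was measured against.

    Assumption: the adjustment is a plain product of the two rates.
    """
    return reported * agreement


# Pessimistic case: 85% reported score, 75% annotator agreement.
low = adjusted_score(0.85, 0.75)
# Optimistic case: 90% reported score, 95% annotator agreement.
high = adjusted_score(0.90, 0.95)

print(f"adjusted range: {low:.4f} to {high:.4f}")
```

Under that assumption the two ends come out at roughly 64% and 85%, which matches the range given in the message.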
> As a general matter, I am extremely leery of fact-checking services
> because what are the standards? who are the annotators? what is their
> interannotator agreement? These are science-based concerns, and I have
> ethical ones as well.
>
> As for the information in KBpedia, we tend to check most if not all of our
> links each release (which have averaged every 4-6 months or so). That is
> not perhaps frequent enough, but we also tend to tie into the more central
> or structural concepts in these external sources, rather than the leaves,
> which are more dynamic. The way KBpedia works is to tie into a key linkage
> point in an external source, and use that linkage point to retrieve current
> instances from that source. That is one reason why there are only 58 K
> concepts in KBpedia, but they tie into tens of millions of instances as
> maintained by the external sources.
>
> The reasoning we do is of the traditional deductive kinds (consistency,
> satisfiability and subsumption) using reasoners like Pellet or HermiT, plus
> inductive reasoning that is based on various supervised machine learning
> approaches. We are not using abductive reasoning, but a reason for trying
> to follow the insights of Charles Peirce is that we have a means to get
> into that hypothesis-generating and -screening logic, which Peirce did more
> than anyone to explicate. It is an area I personally want to pursue further.
>
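For readers unfamiliar with the deductive checks named here, a toy illustration of subsumption and satisfiability over a class hierarchy may help. This is only a sketch in plain Python: it does not use Pellet or HermiT, it covers a tiny fragment of what an OWL 2 reasoner does, and every class name and helper below is hypothetical rather than taken from KBpedia.

```python
# Hypothetical toy hierarchy: class -> set of direct superclasses.
SUBCLASS_OF = {
    "Animal": {"LivingThing"},
    "Plant": {"LivingThing"},
    "Mammal": {"Animal"},
    "Dog": {"Mammal"},
    "Chimera": {"Dog", "Plant"},  # deliberately inconsistent class
}

# Pairs of classes declared disjoint (no individual may belong to both).
DISJOINT = {frozenset({"Animal", "Plant"})}


def ancestors(cls: str) -> set[str]:
    """Return every superclass reachable from cls (transitive closure)."""
    seen: set[str] = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subsumed_by(sub: str, sup: str) -> bool:
    """Subsumption: every member of sub is necessarily a member of sup."""
    return sub == sup or sup in ancestors(sub)


def is_satisfiable(cls: str) -> bool:
    """A class is unsatisfiable if it inherits from two disjoint classes."""
    types = ancestors(cls) | {cls}
    return not any(pair <= types for pair in DISJOINT)


print(is_subsumed_by("Dog", "LivingThing"))
print(is_satisfiable("Dog"))
print(is_satisfiable("Chimera"))
```

Here "Chimera" inherits from both "Animal" (via "Dog") and "Plant", which are declared disjoint, so the satisfiability check rejects it. A real reasoner handles far richer axioms than this.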
> The management of information follows a triple/quad store that handles the
> overall reasoning knowledge graph, with direct retrievals of instance data
> from the source knowledge bases (the seven specifically mentioned, plus
> another score of minor ones). Thus, KBpedia is not a massive, centralized
> system, but a rather lightweight one with distributed access and retrieval
> from its contributor sources. Of course, this kind of Web-oriented
> architecture with all resources identified by IRIs is one of the reasons
> semantic technologies make such great sense.
>
> Lastly, in terms of big useful lessons, I would point to the power of
> having "correct" KR distinctions between instances (individuals), types
> (generals or concepts), events, and attributes (monadic characteristics
> like color or shape) on the noun side. And, on the verb side, relations
> that split between attributes, direct relations and representations
> (indexes and denotations). Look at any top-level ontology or knowledge
> graph and ask yourself whether and how they handle these distinctions. Most
> do not or only hand wave. The distinctions that we use on these matters
> again come from the insights of Charles Sanders Peirce.
>
> Best, Mike
>
> On 6/16/2020 6:49 PM, Paola Di Maio wrote:
>
> Thank you Mike, looks like a big interesting project, congrats on the
> release
>
> Now, the problem I have with Wikipedia is that, although it sometimes
> contains good articles, it is not fact checked, and there is a lot of
> rubbish/false information (true, there is quite a lot of rubbish outside of
> Wikipedia too).
>
> A few questions: How often is the data pulled/updated from these
> databases? Is the data stored in SQL, or how? How does the system manage
> the integration of different data sets/data structures? Can you share the
> design of the inference model/reasoning architecture? What are the
> implications/useful lessons for KR we can learn from this project?
>
> On Tue, Jun 16, 2020 at 10:27 PM Mike Bergman <mike@mkbergman.com> wrote:
>
>> To All,
>>
>> I am pleased to announce that we have released KBpedia
>> <http://kbpedia.org/> v 2.50 with e-commerce and logistics capabilities,
>> as well as significant other refinements. This upgrade comes from adding
>> the entire top structure and the most common products and services of the
>> United Nations Standard Products and Services Code. UNSPSC
>> <https://en.wikipedia.org/wiki/UNSPSC> is a comprehensive, multi-lingual
>> taxonomy for products and services, organized into four levels, with
>> third-party crosswalks to economic and demographic data sources. It is a
>> leading standard for many industrial and economic applications. UNSPSC is
>> KBpedia's seventh core knowledge base, joining the public knowledge bases
>> of Wikipedia <https://en.wikipedia.org/wiki/Wikipedia>, Wikidata
>> <https://en.wikipedia.org/wiki/Wikidata>, GeoNames
>> <https://en.wikipedia.org/wiki/GeoNames>, DBpedia
>> <https://en.wikipedia.org/wiki/DBpedia>, schema.org
>> <https://en.wikipedia.org/wiki/Schema.org>, and OpenCyc
>> <https://en.wikipedia.org/wiki/Cyc> already integrated into the system.
>>
>> KBpedia is a knowledge graph that provides a coherent scaffolding to
>> achieve its twin goals of data interoperability and knowledge-based
>> artificial intelligence (KBAI <http://www.mkbergman.com/category/kbai/>).
>> KBpedia now contains more than 58,000 reference concepts and nearly 200,000
>> unique mappings to its knowledge bases, enabling links to more than 40
>> million entities. It is written in the standard OWL 2
>> <https://en.wikipedia.org/wiki/Web_Ontology_Language> semantic language
>> from the W3C <https://en.wikipedia.org/wiki/World_Wide_Web_Consortium>.
>>
>> KBpedia consists of 73 mostly disjoint typologies organized under an
>> upper KBpedia Knowledge Ontology (KKO), which is designed according to the
>> universal categories and knowledge representation insights of the great
>> American 19th century scientist, logician, and polymath, Charles Sanders
>> Peirce <https://en.wikipedia.org/wiki/Charles_Sanders_Peirce>. KBpedia,
>> KKO, and all of its mappings and files are open source under the Creative
>> Commons Attribution 4.0 International (CC BY 4.0)
>> <https://creativecommons.org/licenses/by/4.0/> license.
>>
>> For more details, see the release announcement
>> <http://kbpedia.org/resources/news/kbpedia-adds-ecommerce/> or go to
>> Github <https://github.com/Cognonto/kbpedia/blob/master/versions/2.50/>
>> to download <http://kbpedia.org/resources/downloads/> the distro.
>>
>> Thanks, Mike
>>
>

Received on Saturday, 27 June 2020 03:35:57 UTC