Re: Announcing syntreenet: A Python library to build scalable production rule systems from Enrique Pérez Arnaud on 2019-07-07 (semantic-web@w3.org from July 2019)

From: Enrique Pérez Arnaud <enrique@cazalla.net>
Date: Sun, 7 Jul 2019 21:07:47 +0200
To: Amirouche Boubekki <amirouche.boubekki@gmail.com>
Cc: Enrique Pérez Arnaud <eperez@emergya.com>, semantic-web <semantic-web@w3.org>
Message-ID: <20190707190444.4ybpippvk5tksrvg@eperez-ws.emergya.com>

Hi Amirouche,

On Sun, Jul 07, 2019 at 01:02:47PM +0200, Amirouche Boubekki wrote:
> 
> Hello Enrique,
> 
> Thanks for sharing!

Thanks for taking interest :)

> 
> Le sam. 6 juil. 2019 à 20:53, Enrique Pérez Arnaud <enrique@cazalla.net> a
> écrit :
> 
>     Hi all,
> 
>     Perhaps some of you might be interested in this library I've released, that
>     might be used to build production rule systems, with the peculiarity that
>     the cost of matching a fact to the knowledge base would be logarithmic in
>     the size of said knowledge base, measured as the number of rules plus the
>     number of facts in working memory. As far as I know, the current state of
>     the art is polynomial.
> 
>     https://gitlab.com/enriquepablo/syntreenet
> 
> 
> 
> I am interested in this work. Mind the fact that I am a newbie and I might ask
> trivial questions.
> 
> First, I can not tell the importance of this work compared to existing
> solutions
> even if what is described in the README seems promising.
> 
> Questions:
> 
> A) What about the name syntreenet?

"Network of syntagm trees". Just wanted a name that is unique in google.

> 
> B) What are Syntagm?

See below

> 
> C) I read the following:
> 
> 
>     "syntreenet knows nothing about syntax. Facts are a black box with whatever
>     internal structure that is appropriate for their universe of discourse.
>     syntreenet never composes a fact."
> 
> 
> What does syntax means in this context?

Facts are truths; facts are what we are interested in. And facts might
have any internal structure imaginable. They might be triples, for
example, as in RDF, or they might be sets of key value pairs, or nested
lists, or whatever. I call syntax to that internal structure; syntax is
grammar from a different perspective.

The point is that there is a universal transformation from facts built
with any imaginable grammar (so universal wrt the grammar), to
simple trees, and therefore to sets of tuples (the set of paths from
each leaf to the root). And those paths carry enough information about
the syntactic elements and their position within the facts so as to be used
to test whether some other fact matches them.

So, what are syntagms? Facts can have any internal structure, but there is
a level of detail which must leak to syntreenet, which is the level at
which the objects can be logical variables or not. Syntreenet must be
able to know whether a given syntactic element in a fact (in one of the
paths corresponding to it) is a variable or not.

In essence, the question is that in a deduction system we need to manage
2 things, facts as carriers of truth and variables as carriers of
generalization; and that the way in which the variables are a part of
the facts, which depends on the grammar used to buid the facts, is not
needed in the deduction procedure, even though within that procedure we
need to see the variables within the facts. Using paths is like a
universal (grammar independent) way of peeking within the facts to find
the variables; a way to isolate the algorithm to make deductions from
the internal structure of the deductions.

> 
> It seems to me for the purpose of the demonstration, it would be easier to
> avoid to be generic and stick to strings for facts and rules and have a
> convention for variables. What do you think?
> 
> Based on this log:
> 
> 
> adding rule "X1 isa X2; X2 is X3 -> X1 isa X3"
> adding rule "X1 is X2; X2 is X3 -> X1 is X3"
> adding fact "animal is thing"
> adding rule "X1 isa animal -> X1 isa thing"
> adding rule "thing is X3 -> animal is X3"
> adding rule "X1 is animal -> X1 is thing"
> 
> 
> It seems like the algorithm rely on pre-computing rules based on existing rules
> and facts, isn't it?

If you have:

  X1 isa X2; X2 is X3 -> X1 isa X3

And you have:

  animal is thing

then you necessarily have:

  X1 isa animal -> X1 isa thing

So I would not call this pre-computing, perhaps we might call it
book-keeping - thinking in the RETE algorithm, which in this situation
would not produce a new rule, but keep the match in a beta node. I would
just call it logic.

> It seems to me the inference process starts before the kb is actually asked
> something.

This is what production systems are about, in contrast to backward
chaining systems.

Best!

-- 
Enrique Pérez Arnaud

Received on Sunday, 7 July 2019 19:08:19 UTC