Introduction

Hello all!


I am new to this mailing list, so I will take a moment to introduce myself.

I have been a professional web developer, mostly with Python and
JavaScript, for 10 years, in various domains unrelated to artificial
intelligence. I have been dreaming about intelligent systems for much
longer than that.

Ten years ago, I started looking seriously into artificial
intelligence and figured that the most popular databases did not have
all the features I needed. In particular, I was looking for a database
with strong guarantees and transaction support, which ruled out almost
all NoSQL products. I tried property graph databases (Neo4j, TinkerPop,
and OrientDB), but they still lack the ability to store and query
diverse kinds of data. Long story short, nowadays I mostly work with
FoundationDB, which provides transactions and horizontal scalability.
I slowly but surely came to love triple stores. Over the last year, I
have been working almost full time on optimizing the storage
requirements of what I call the "Generic Tuple Store", which
generalizes the concept of triple store and quad store to any number
of items. That is, tuples can store a fixed number of items n, where n
can be bigger than 3. Last week, I demonstrated that it is possible to
store 10 GB of triples in... 10 GB of disk space (instead of 1,200 GB),
with the added value that the tuples' history is kept around and
pattern matching is supported.
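To illustrate the idea, here is a minimal Python sketch of an n-tuple
store with pattern matching. This is a hypothetical toy, not the actual
Generic Tuple Store (which sits on FoundationDB); all names (Variable,
TupleStore, var) are illustrative only:

```python
class Variable:
    """A pattern variable that binds to any value during matching."""
    def __init__(self, name):
        self.name = name

def var(name):
    return Variable(name)

class TupleStore:
    """A toy in-memory store of fixed-arity tuples."""
    def __init__(self, n):
        self.n = n          # fixed number of items per tuple
        self.tuples = set()

    def add(self, *items):
        assert len(items) == self.n
        self.tuples.add(items)

    def query(self, *pattern):
        """Yield one binding dict per stored tuple matching the pattern."""
        assert len(pattern) == self.n
        for t in self.tuples:
            binding = {}
            for p, v in zip(pattern, t):
                if isinstance(p, Variable):
                    # Bind the variable, or fail if already bound differently.
                    if binding.setdefault(p.name, v) != v:
                        break
                elif p != v:
                    break
            else:
                yield binding

# Usage: a 4-tuple store (quad store) holding (graph, subject, predicate, object).
store = TupleStore(4)
store.add("wikidata", "Q42", "name", "Douglas Adams")
store.add("wikidata", "Q42", "occupation", "writer")
matches = store.query("wikidata", "Q42", var("p"), var("o"))
print(sorted((b["p"], b["o"]) for b in matches))
# [('name', 'Douglas Adams'), ('occupation', 'writer')]
```

The same query interface works for any arity; a triple store is just
the n = 3 case.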

My primary interest is building a teaching assistant or research
assistant. Toward that goal, I also have looked at recent developments
in NLP, NLU, and text-mining. Other topics that interest me are legal
tech, e-governance, and peer-to-peer.

The latter might seem unrelated, but it is very important. Recent
computing developments stress the cloud and big infrastructures. One
of my goals is to give back some power to lone-wolf hackers and
independent researchers who necessarily work with few resources. The
10 GB result above can be applied to the full Wikidata dump, which
means that, given a 4 TB SSD and some patience, it is possible to
store and query Wikidata on a single laptop.

For similar reasons, I am working mostly with the Scheme programming
language [0].

I might set aside scaling Wikidata down and up in order to focus on
building a teaching assistant. Toward that particular goal, my next
steps are the following:

- Find a good solution to the problem of fuzzy string matching:
https://stackoverflow.com/q/58065020/140837

- Once that is done, implement the following algorithm, which links
entities against a knowledge base:
https://stackoverflow.com/a/58166648/140837

- Eventually, work on a Link Grammar clone that uses the MiniSat library
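As a starting point for the first item, one common approach to fuzzy
string matching is character-trigram similarity (Jaccard similarity
over trigram sets). This is only a sketch of one candidate technique,
not necessarily the solution the linked question settles on:

```python
def trigrams(s):
    """Set of character trigrams, padded so short strings still overlap."""
    padded = "  " + s.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def similarity(a, b):
    """Jaccard similarity between trigram sets, in [0, 1]."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

# Usage: rank candidate knowledge-base labels against a misspelled query.
labels = ["Douglas Adams", "Douglas Hofstadter", "Ada Lovelace"]
print(max(labels, key=lambda l: similarity("duglas adams", l)))
# Douglas Adams
```

Because trigrams are just set members, they index well in an ordered
key-value store, which is what makes this approach attractive on top of
something like FoundationDB.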

With that done, I would be able to make the computer understand things
about texts in a way that is compatible with my constraints and goals.

I love the work on multinet.

I find the project opencog inspiring.

I have lots of reading to catch up on.

I am very enthusiastic about this group and the years that will come :)


[0] https://github.com/amirouche/arew-scheme#arew-scheme

---
Amirouche ~ zig ~ https://github.com/amirouche/

Received on Friday, 26 June 2020 18:28:13 UTC