- From: Amirouche Boubekki <amirouche.boubekki@gmail.com>
- Date: Fri, 26 Jun 2020 20:27:49 +0200
- To: public-cogai@w3.org
Hello all! I am new to this mailing list, so let me take a moment to introduce myself.

I have been a professional web developer, mostly with Python and JavaScript, for 10 years in various domains unrelated to artificial intelligence, but I have been dreaming about intelligent systems for much longer than that. Ten years ago, I started seriously looking into artificial intelligence and found that the most popular databases did not have all the features I needed. In particular, I was looking for a database with strong guarantees and transaction support, which ruled out almost all NoSQL products. I tried property graph databases (Neo4j, TinkerPop, and OrientDB), but they still lack the ability to store and query diverse kinds of data. Long story short, nowadays I mostly work with FoundationDB, which provides transactions and horizontal scalability, and I slowly but surely came to love triple stores.

For the last year, I have been working almost full time on optimizing the storage requirements of what I call the "Generic Tuple Store", which generalizes the concepts of triplestore and quadstore to any number of items: tuples store a fixed number of items n, where n can be greater than 3. Last week, I demonstrated that it is possible to store 10 GB of triples with... 10 GB of disk space (instead of 1 200 GB), with the added value that the store keeps the tuples' history around and supports pattern matching.

My primary interest is building a teaching assistant or research assistant. Toward that goal, I have also looked at recent developments in NLP, NLU, and text mining. Other topics that interest me are legal tech, e-governance, and peer-to-peer. The latter might seem unrelated, but it is very important: recent computing developments put stress on the cloud and big infrastructures, and one of my goals is to give back some power to lone-wolf hackers and independent researchers, who necessarily work with few resources.
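To make the idea concrete, here is a minimal in-memory sketch of an n-ary tuple store with pattern matching. This is an illustration only, not the actual implementation: the real store sits on FoundationDB with ordered key encodings, and every name here (GenericTupleStore, Variable, var, where) is invented for this example.

```python
class Variable:
    """A named hole in a query pattern."""
    def __init__(self, name):
        self.name = name

def var(name):
    return Variable(name)

class GenericTupleStore:
    """Toy store for tuples of a fixed arity n (n may be greater than 3)."""

    def __init__(self, n):
        self.n = n          # fixed number of items per tuple
        self.tuples = set()

    def add(self, *items):
        assert len(items) == self.n
        self.tuples.add(items)

    def where(self, *pattern):
        """Yield one {variable name: value} binding per tuple matching a
        pattern that mixes constants and variables."""
        assert len(pattern) == self.n
        for t in self.tuples:
            binding = {}
            for p, v in zip(pattern, t):
                if isinstance(p, Variable):
                    # A repeated variable must bind to the same value.
                    if p.name in binding and binding[p.name] != v:
                        break
                    binding[p.name] = v
                elif p != v:
                    break
            else:
                yield binding

# Usage: a 4-tuple store, i.e. a quadstore whose fourth item could
# carry history/versioning information.
db = GenericTupleStore(4)
db.add("Q7251", "name", "Alan Turing", 1)
db.add("Q7251", "occupation", "computer scientist", 1)
for b in db.where("Q7251", var("key"), var("value"), 1):
    print(b["key"], "=", b["value"])
```

A production version would avoid the full scan in `where` by keeping the tuples under several orderings (permutations of the items), so that any pattern of constants and variables maps to a range query over one ordering.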
The above result over 10 GB can be applied to the full wikidata dump, which means that, given a 4 TB SSD and some patience, it is possible to store and query wikidata on a single laptop. For similar reasons, I work mostly with the Scheme programming language [0]. I might drop the ball on scaling wikidata down and up to focus on building a teaching assistant. Toward that particular goal, my next steps are the following:

- Find a good solution to the problem of fuzzy string matching: https://stackoverflow.com/q/58065020/140837
- Once the previous item is done, implement the following algorithm that links entities against a knowledge base: https://stackoverflow.com/a/58166648/140837
- Eventually, work on a Link Grammar clone that uses the MiniSat library

With that done, I would be able to make the computer understand things about texts in a way that is compatible with my constraints and goals.

I love the work on MultiNet. I find the OpenCog project inspiring. I have lots of reading to catch up on. I am very enthusiastic about this group and the years to come :)

[0] https://github.com/amirouche/arew-scheme#arew-scheme

---
Amirouche ~ zig ~ https://github.com/amirouche/
Received on Friday, 26 June 2020 18:28:13 UTC