
Re: Knowledge graph toolkit

From: Diego Torres <diego.torres@lifia.info.unlp.edu.ar>
Date: Sat, 2 May 2020 11:13:03 -0300
Message-Id: <8D96CFCD-953B-4149-B048-FE48D1903441@lifia.info.unlp.edu.ar>
Cc: Semantic Web <semantic-web@w3.org>
To: Amirouche Boubekki <amirouche.boubekki@gmail.com>
Thanks, all,

The feedback was interesting. Apparently this needs a lot of learning and research, taking into account several different points of view.

I will continue learning and am open to receiving more suggestions.

Diego

P.S. Sorry for the links in my signature. I am fixing them.

Sent from my mobile. Please excuse the brevity.

> On 2 May 2020, at 05:30, Amirouche Boubekki <amirouche.boubekki@gmail.com> wrote:
> 
> Hello Diego!
> 
>> On Fri, 1 May 2020 at 18:38, Diego Torres
>> <diego.torres@lifia.info.unlp.edu.ar> wrote:
>> 
>> Dear all,
>> 
>> I've noticed the recent appearance of "knowledge graph" as a keyword in the Semantic Web context. As I am from the old school, some of the programming technologies I work with are a little out of date.
>> 
>> I would like to ask if some of you could recommend a basic toolkit for a knowledge graph PhD student. This covers everything from programming languages to frameworks, tools, and any other important resources. For example, what could be used to store or manage a knowledge graph? (Neo4j was the last one I saw; I imagine there are new alternatives, most of them open source.)
>> 
>> Thanks in advance,
>> 
>> Diego
> 
> Given such a general question, I take this as an opportunity to try to
> explain that Neo4j or so-called RDF datastores are very difficult to
> scale, if not a dead end, in a realistic scenario. Let me explain:
> 
> Things like Neo4j are really good for data that is relational and
> against which you need to run deeply recursive queries. To start
> with, Neo4j is difficult to scale in terms of data size. Second, not
> everything is relational. When you need things like full-text
> search, typo correction, or keyword suggestion, you have to fall back
> to another database, which puts a strain on both production and the
> developer setup. That in turn makes it difficult to reproduce the
> setup both in production and locally. And even when you do not need
> to scale beyond a single box, it is still very difficult. A little
> tip: if it takes a greenhorn more than one day to set up the whole
> system on a developer machine, or worse, you need to buy more cloud
> credits to set up the dev environment => that is NOT a good system,
> that is not future proof, that is not good. On top of that, the
> time-to-learn is gigantic for a newbie: not only is there the whole
> setup, which works like a house of cards, but there is also the time
> required to learn the surface, a.k.a. the domain-specific languages
> that you need to know just to be able to use them (the Elasticsearch
> JSON mess, Redis Lua, SQL, Cypher, or GQL).
> 
> The problem is the same with RDF stores: they do not scale in terms of
> features and use cases, for the same reasons. This leads to an
> unbounded microservices mess.
> 
> My recommendation is to learn about FoundationDB, which can scale down
> to a single box and works at large scale just as easily. You will need
> to LEARN something new, BUT like LISP it is a programmable programming
> system: you can do a lot more with an ordered key-value store (OKVS)
> than with any other database system.
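
A rough sketch of the OKVS idea above, for readers new to it. This is not FoundationDB code and does not use its API. It assumes a toy in-memory ordered key-value store (the hypothetical `OrderedKV` class) and shows one common way to layer a triple store on top of it: each triple is written under three key orderings, so that each lookup pattern becomes a prefix range scan. The names `OrderedKV`, `add_triple`, and the sample data are made up for illustration.

```python
import bisect


class OrderedKV:
    """Toy in-memory ordered key-value store (a stand-in, not FoundationDB)."""

    def __init__(self):
        self._keys = []    # tuple keys, kept sorted
        self._values = {}  # key -> value

    def set(self, key, value=b""):
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def range_prefix(self, prefix):
        """Yield keys, in sorted order, whose leading components equal `prefix`."""
        start = bisect.bisect_left(self._keys, prefix)
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            yield key


def add_triple(store, s, p, o):
    # Write the triple under three orderings so that each lookup
    # pattern can be answered by a single prefix range scan.
    store.set(("spo", s, p, o))
    store.set(("pos", p, o, s))
    store.set(("osp", o, s, p))


store = OrderedKV()
add_triple(store, "alice", "worksAt", "lab")
add_triple(store, "alice", "studies", "knowledge-graphs")
add_triple(store, "bob", "studies", "knowledge-graphs")

# Who studies knowledge graphs? Scan the (predicate, object, subject) index.
students = [k[3] for k in store.range_prefix(("pos", "studies", "knowledge-graphs"))]

# What is known about alice? Scan the (subject, predicate, object) index.
facts = [(k[2], k[3]) for k in store.range_prefix(("spo", "alice"))]

print(students)
print(facts)
```

With a real ordered key-value store the same key layout would presumably carry over, with the store's range reads replacing the in-memory scan.
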
> 
> And please try something other than Python, because the Global
> Interpreter Lock is here to stay, and if you want to make your work
> reproducible and accessible, multiprocessing (or worse:
> microservices!) does not cut it. I know it is a lot of work, but I am
> confident that Python (with the GIL) is holding back progress in
> science.
> 
> NB: I did not say "RDF is useless".
> 
>> 
>> Dr. Diego Torres
>> Centro de Investigación LIFIA
>> Facultad de Informática - UNLP
>> diego.torres[at]lifia.info.unlp.edu.ar
>> http://www.lifia.info.unlp.edu.ar/lifia/en/files/diego-torres/
> 
> This is a redirection.
> 
>> Director of http://cientopolis.org
> 
> This link does not work (404).
> 
> 
> -- 
> Amirouche ~ https://hyper.dev
> 
Received on Saturday, 2 May 2020 14:32:13 UTC
