- From: Adrian Gschwend <ml-ktk@netlabs.org>
- Date: Sun, 3 May 2020 11:51:16 +0200
- Cc: semantic-web <semantic-web@w3.org>
On 03.05.20 08:43, Amirouche Boubekki wrote:

[...]

> offer. What is required is indeed a relational database like RDF
> describes. But more than that, a modern AI system has to tackle
> heterogeneous data types that do not blend nicely into the RDF
> framework. I forgot to mention geometric data. I forgot to mention
> strong ACID guarantees.

I would say there is no other data model out there which can unify
heterogeneous data types better than RDF. What, in your opinion, does
"not blend nicely into the RDF framework"?

> It has to do with RDF with the fact that people spread the idea that
> RDF framework is a go to solution to do semantic work. Except, it does
> not provide a solution for:
>
> - full text search

Nonsense: there is no standard API, but pretty much every triplestore I
know provides that, see

https://github.com/w3c/sparql-12/issues/40

Just because it's not in the current SPARQL spec does not mean it's not
there at all. Also, we are working on SPARQL 1.2; that's the beauty of
open standards. (I sketch a Lucene-backed full-text query further down.)

> - geometric search

https://www.ogc.org/standards/geosparql/

It's not a particularly well written spec, but it has been around since
2011 and various stores implement it, for example Jena:

https://jena.apache.org/documentation/geosparql/

(A GeoSPARQL sketch follows further down as well.)

> - keyword suggestion (approximate string matching)

See all the Lucene-based full-text search implementations mentioned
above.

> - historisation

There is a whole bunch of papers about versioning RDF from a research
POV, and I know that at least Stardog implements it in their product.
My colleague just recently wrote a versioned RDF store for distributed
IoT devices, so that's surely a solvable problem.

While I always thought I absolutely needed versioning, I noticed that in
reality this is far less the case, because I often model the data as
versioned in RDF directly, so there is no need for it at store level.
(I sketch that pattern further down too.)

> - ACID guarantees

Again, solvable. Stardog does this, for example
(https://stardog.docs.apiary.io/#reference/managing-transactions). OSS
stacks have implementations as well, and there are discussions around
transactions in the SPARQL 1.2 CWG:

https://github.com/w3c/sparql-12/issues/83

> And probably others that I forget.

You seem to have decided that RDF is not for you, and that is totally
fine. But YMMV: I think RDF is *the* stack to build KGs on, and I have
not been disappointed so far. If we miss something, we try to add it to
the stack.

> Two things:
>
> 1) For the record: money is not Science. Profitable does not
> necessarily mean a Good Thing.

No disagreement here, but how is that related to the scaling remark?

> 2) There is not publicly available project using publicly available
> software that scale beyond 1TB.

What you want to say is "I am not aware of a publicly available project
using publicly available software that scales beyond 1TB". Also, sorry
to disappoint you:

https://de.slideshare.net/jervenbolleman/sparqluniprotorg-in-production-poster

That was 2017; Uniprot has grown since then, and the latest number I
have in mind is well above 50 billion triples.

For larger-scale Open Source RDF implementations you might want to
consider:

https://cm-well.github.io/CM-Well/index.html

See for example the high-level architecture here:

https://cm-well.github.io/CM-Well/Introduction/Intro.CM-WellHigh-LevelArchitecture.html

If you think this is too complicated, please remember that Uniprot runs
on a single machine using Virtuoso. There are a few other large-scale
stores like Apache Rya, but I have not tried those yet.
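To make the full-text point concrete, here is a minimal sketch of a
Lucene-backed query using Jena's text index (the text:query property
function); the dataset and the indexed rdfs:label are assumptions for
the example, and other stores expose the same capability through their
own extensions:

  PREFIX text: <http://jena.apache.org/text#>
  PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

  # Resources whose indexed rdfs:label matches the Lucene query "graph*"
  SELECT ?s ?label
  WHERE {
    ?s text:query (rdfs:label "graph*") ;
       rdfs:label ?label .
  }
  LIMIT 10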
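The GeoSPARQL sketch mentioned above, assuming features in the store
carry geo:hasGeometry/geo:asWKT literals (the polygon is made up):

  PREFIX geo:  <http://www.opengis.net/ont/geosparql#>
  PREFIX geof: <http://www.opengis.net/def/function/geosparql/>

  # Features whose geometry lies within a given WKT polygon
  SELECT ?feature
  WHERE {
    ?feature geo:hasGeometry/geo:asWKT ?wkt .
    FILTER (geof:sfWithin(?wkt,
      "POLYGON((7.3 46.9, 7.5 46.9, 7.5 47.0, 7.3 47.0, 7.3 46.9))"^^geo:wktLiteral))
  }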
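And the versioning pattern I mean, sketched with PROV-O terms (the ex:
resources are made up; the point is that revisions are ordinary
resources, so no store-level versioning is required):

  @prefix prov: <http://www.w3.org/ns/prov#> .
  @prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .
  @prefix ex:   <http://example.org/> .

  # Each revision is its own resource, linked to its predecessor
  ex:dataset-v1 a prov:Entity ;
      prov:generatedAtTime "2020-04-01T12:00:00Z"^^xsd:dateTime .

  ex:dataset-v2 a prov:Entity ;
      prov:wasRevisionOf   ex:dataset-v1 ;
      prov:generatedAtTime "2020-05-03T09:00:00Z"^^xsd:dateTime .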
> Indeed, when one asks me my advice about a _basic_ toolkit to do KG, I
> recommend FDB, because it can handle all the cases previously
> mentioned. And also I do not to forget to mention that it is a long
> journey, especially if you want to be valid in the regard of RDF
> standard.

That is a tooling question, and the tooling has gotten a lot better over
the past years. There is still work to do for sure, and we are working
on that.

> As far as I am concerned RDF offers good guiding principles, but it
> requires decades long of study (much like compiler work) to grasp
> which is a bummer. I ought to be simpler, much simpler and that is
> what I am doing in my projects: taking the best of RDF and leaving
> aside what is not necessary.

I disagree here, and I speak from experience: I do a lot of RDF
teaching, and once people understand the basics, they can be extremely
productive with RDF.

> exists. But I will not forsake advancement and innovation for the
> purpose of backward compatibility with something that is so gigantic,
> especially when something easier is possible.

Again, that is fine if your use cases are limited. We leverage the power
of the RDF stack, so "something easier" means "something less powerful".

regards

Adrian