Re: Pragmatic Problems in the RDF Ecosystem (Was: Re: Toward easier RDF: a proposal) from Henry Story on 2018-11-27 (semantic-web@w3.org from November 2018)

From: Henry Story <henry.story@bblfish.net>
Date: Tue, 27 Nov 2018 17:42:14 +0100
To: Steven Harms <sgharms@stevengharms.com>
Cc: W3C Semantic Web IG <semantic-web@w3.org>
Message-Id: <46C49F69-4C54-4915-BDDB-4F970CD0E98F@bblfish.net>

Hi Steven, you make a lot of good points below.
I just want to add a few remarks as another developer to what you said
to give some perspective.

> On 27 Nov 2018, at 14:27, Steven Harms <sgharms@stevengharms.com> wrote:
> 
> All,
> 
> I've posted this online at my site for long-form reading (https://stevengharms.com/research/semweb-topic/2018-11-26-toward-easier-rdf/#post) but will include the full, long text below for those preferring the mail reader interface.
> 
> Esteemed SW Community,
> 
> I've been silent on this list because I am not a practising ontologist. I'm
> (just a) "middle 33% developer" who thought that making a graph of knowledge
> about books would be interesting[0]. I've tried to document[1] my experiences,
> up to the point a few weeks ago that I ground to a halt. When I saw David's
> post[2], I was excited because I thought it might occasion discussion around
> the simple, pragmatic problems that stymied me. 
> 
> I'd like to list a few signals that RDF* sends in the first hour of exploration
> to the pragmatic 33%-er (me)  that suggest that the explorer's further time
> won't pay off. I've also spent 2 hours with a near-identical (hand-wave)
> competitor, [AirTable][9], where I was able to get my prototype up and running in
> under 2 hours[10]. Based on these criticisms and comparison with the
> marketplace, a developer curious about RDF* receives ample signal to "close
> tabs, move on," and drop out of the funnel.
> 
> A. Lack of a Clear Entry Point
> ==============================
> 
> Compare "How do I write React" Google results with "How do I write RDF" Google
> results.
> 
> * React's first hit[3] is served by its authority (reactjs.org). It links
>   to a description that is compelling, welcoming, and relatively easily
>   scanned.  It's visually attractive and modern as well. It looks maintained.

(Note: React is maintained by a very rich organization)

> 
> Versus:
> 
> * RDF's first hit is hosted by w3schools.com[4] and feels scanty (NB: Not even
>   a W3C link!)
> * RDF's second hit is hosted by a site whose look and feel is akin to a
>   textbook[5] and is equally exciting
> * RDF's third hit[6] is the same
> * RDF's fourth hit [7] is the first link that starts educating on the Jena API
> 
> These sites look state of the art for the pre-Clinton era. Should one actually
> find the W3C spec, the look-and-feel there (to say nothing of the writing style
> and tone) suggests "Keep moving, peasant."
> 
> As a pragmatic 33%-er, my intuition is screaming "Close tabs; abort."

I believe you :-)

> B. Lack of Technology Framing
> =============================
> 
> Compare the React home[3] to any of those previous links [4][5][6][7]. The
> navigational tree hits topics that provide "big picture," "tools required,"
> "help if you get stuck," "what is this technology," and "when is it an optimal
> choice?" By comparison, I don't have any idea what RDF* thinks its use or chief
> benefits are.
> 
> To the pragmatic 33%-er, React's site says: "You're welcome here, prepare to be
> awesome."

I agree here. The main problem is the framing of what the point of RDF is.
And actually it is very simple if one thinks really big. I have tried to bring
together an argument from the logic of knowledge, with some fun science
fiction illustrations, to explain this.

https://medium.com/@bblfish/epistemology-in-the-cloud-472fad4c8282

To start one needs to understand why one needs decentralization. If that is
not clear the rest does not make sense. It is all about sovereignty.

> 
> C. A Highly Fractured Ecosystem
> ===============================
> 
> Said Booth:
> 
> > a painful reality has emerged: RDF is too hard for *average* developers.  By
> > "average developers" I mean those in the middle 33 percent of ability. And by
> > "RDF", I mean the whole RDF ecosystem -- including SPARQL, OWL, tools,
> > standards, etc. -- everything that a developer touches when using RDF.
> 
> While RDF is wonderfully graspable in its simplicity: triples that can be
> serialized into multiple formats; its ecosystem  of clever acronyms and
> backronyms is tedious, over-precious, and opaque.  RDF* requires the learner to
> hold too many cognitive circuits open before anything starts to resolve. React
> avoids this by doing complete layers (e.g. no classes, classes without JSX,
> classes with JSX) where complete, albeit small, artifacts are created
> repeatedly.
> 
> Most of these technologies' defining document is a W3C standard written in the
> opaque style of W3C standards (see Sporny, at length). While these standards
> cover cases exhaustively, they're difficult to understand applying to a toy
> example.  React makes tic-tac-toe from which I can extrapolate Twitter
> integrations or JavaScript widgets. RDF* has no such entry point.

As it happens I developed demo SoLiD RDF apps with Scala-JS-React a 
few years ago. https://github.com/read-write-web/react-foaf

> 
> Supposing one finds a canonical entry point, RDF* feels like it solves someone
> else's problem and not mine (close tab; bye!). 

Yes: what needs to be imparted is that RDF is about the ability to write
sovereignty-respecting hyper-apps, i.e. to turn every application into
a data browser that follows the same principles as a Web browser.

That is what the SoLiD team is constructing. And now they even
have a startup so that they can create some buzz.

    https://www.inrupt.com/

> 
> D. Lack of Automated Feedback
> =============================
> 
> One of the greatest things that happened in learning HTML (1994, in my case)
> was the existence of validators to provide feedback of whether I was doing it
> right. The RDF* suite provides me no feedback as to whether I'm doing it right.
> When I get a serialization to parse, I can see a really pretty graph. Is that
> _right_? Is that _recommended_? No idea. It's like learning German, going to
> Germany, speaking German, and finding out that no one there will (patiently)
> correct you when you use the wrong article.
> 
> In all seriousness, I used Juan Sequeda et al's GRAFO[8] in order to have
> something generate an artifact that I could use to confirm my use of hand-coded
> RDF* and OWL.  Booth's comparison to Assembly is apt; many times developers let
> `gcc` spit out Assembly code to get validation of their tedious-to-write,
> difficult-to-edit hand codings. I say more about tooling in H, and I, below.
> 
> Where tooling is unavailable (or engineering effort costly in time / money), a
> suitable shim is possible with a (or multiple) canonical example(s).


The key validator always was the Web Browser itself. That is the first validator
most people go to when writing web pages. What has been missing from the
Semantic Web are applications that read (and write) the data as well as spread 
it.

Note that writing browsers is not at all an easy task. React is a great tool, but as with
most tools there are thousands of ways of going wrong. That is why talented developers
are needed. Talented people who have examples to work from learn very complex things
very quickly if taught correctly and given an insight into why they should want
to build in a certain way. I know I taught some good developers RDF in less than a
month and we got them to develop an Address Book app in less than 2 months.

What I found missing at the time was a well-typed JS RDF library that would make
programming RDF a lot easier to explain, as the compiler and IDE could help the young
developer understand which methods could be used. Just at that time the alpha of
Scala-JS came out to help with that.

There is a confused line of reasoning that the success of the web is due to HTML being
easy to write, and so that developing hyper-apps should be too. But that does not take
into account the huge engineering task that writing browsers is. So the IndieWeb
folks, starting from the premise that everyone should write RDF and seeing that
it is too complicated to explain to people who may be just about able to write
HTML snippets, come to the conclusion that RDF is doomed. HTML is easy to write
and, for a data standard, so is Turtle.
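
To illustrate how little there is to Turtle as a format, here is a minimal
sketch in Python (standard library only) that writes a handful of triples out
as Turtle. The example.org IRIs, the sample data and the to_turtle helper are
assumptions made up for this illustration; it handles only prefixed names and
quoted literals and is not a complete Turtle serializer.

```python
# Minimal sketch: emit a small graph as Turtle text.
# PREFIXES, TRIPLES and to_turtle are illustrative assumptions,
# not part of any existing RDF library.

PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://example.org/people#",
}

# Each triple is (subject, predicate, object), already written as
# prefixed names or double-quoted literals.
TRIPLES = [
    ("ex:alice", "a", "foaf:Person"),
    ("ex:alice", "foaf:name", '"Alice"'),
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:bob", "a", "foaf:Person"),
    ("ex:bob", "foaf:name", '"Bob"'),
]


def to_turtle(prefixes, triples):
    """Group triples by subject and join their predicate/object pairs with ';'."""
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in prefixes.items()]
    lines.append("")
    by_subject = {}
    for s, p, o in triples:
        by_subject.setdefault(s, []).append((p, o))
    for s, pairs in by_subject.items():
        body = " ;\n    ".join(f"{p} {o}" for p, o in pairs)
        lines.append(f"{s} {body} .")
    return "\n".join(lines)


turtle_text = to_turtle(PREFIXES, TRIPLES)
print(turtle_text)
```

The printed text is a few prefix declarations followed by one block per
subject, which is roughly all a beginner has to learn to start writing data by
hand.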

The problem is developing the apps: the hyper-apps.

> 
> E. Lack of a Canonical Example
> ==============================
> 
> In the dawn of the JavaScript frameworks (2014-ish) _everyone_ did a TODO app.
> One could compare Angular to Ember to Knockout to BatmanJS ('memba that?)  and
> see what trade-offs the various implementers made. It was a problem with a
> trivial domain but from whose implementation one could project the technology
> learning ladder.
> 
> RDF* lacks a consistent example. Where it is consistent, it is trivially small.
> The most consistent example (in my experience) is using a `foaf:` ontology to
> make some boring and fairly shallow statement e.g. "Alice knows Bob." Great. So
> what? How do I start building classes, and predicates (schemas) and start
> creating graphs based on my ideas?
> 
> "Read more specs, pleb."
> 
> Sigh.
> 
> While it's readily obvious that we could use (the fractured ecosystem of)
> ontology providers to assert more about Alice and Bob, to create a schema is an
> entirely opaque process that isn't "ramped to" based on grokkable atoms. Where
> do I go to get more properties? Should I mix multiple ontologies? Is there an
> example? No.

There has been a long learning process in understanding what is right and wrong.
But that is a bit like writing OO classes. Everyone can write a few classes for
a "Hello World" program, but it's a completely different level of engineer who
invents a framework like React. And when I look at their code, or that of other
reactive frameworks, now that I know category theory, I am quite certain that
they are applying category-theoretic theorems when thinking about their code.

The author of the AltaVista search engine, Mike Burrows, who also wrote papers
on distributed access control, showed me once in 1996 or so how he calculated the
optimal number of CPU cycles mathematically for the search engine, as any waste
would cost a huge amount of money in deployment. He wrote the core in DEC assembly
and was unhappy that he only got to within half a CPU cycle of the optimum.

> 
> F. Lack of Intermediate Canonical Example
> =========================================
> 
> This is really an extension of E, but there's a huge gulf between some foaf-y
> triviality and "Model a Medical Product Ontology." Uhm, how about something
> obvious and fun (modeling board games, or card games, books, plays..anything?)

I think that is what the SoLiD group is all about. 

> 
> G. Curiously Strong Rejection of SQL and OO as Metaphors
> ========================================================
> 
> RDF* is neither SQL nor Object-Oriented programming, but dear Mithras, SQL and
> OO are powerful, pervasive metaphors that most RDF* learners' mental models
> appeal to when they're learning. Why aren't we translating trivial OO code or
> trivial DB modeling in those metaphors to RDF*?
> 
> Considering the blood, sweat, tears, and bile I lost learning to write SQL
> construction commands I'm galled to type the following: It's easier to learn to
> write SQL tables by hand (schema as well as content) than it is to design an
> RDF* schema and load it up.
> 
> (To say nothing of the gigabytes of tutorial material, StackOverflow posts, etc.
> to help correct and steer you out of the gutter.)
> 
> I re-read this now and am staggered. RDF*'s a data format that's conceptually
> _simpler_ than SQL but which is _orders of magnitude_ harder to learn (see A-F,
> above).

Funny that. At Sun Microsystems I wrote an RDF to Java object mapper that I called
So(m)er: Semantic Object Metadata Mapper. It used annotations and Java
byte code rewriting tools to make the objects look up their data in
an RDF database.
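
The general idea of mapping annotated object fields to triples can be sketched
in a few lines. This is not the So(m)er code (which was Java and relied on
byte code rewriting); it is a minimal Python illustration in which dataclass
field metadata stands in for annotations. The rdf helper, the Person class and
the example.org IRI are assumptions made up for this sketch.

```python
from dataclasses import dataclass, field, fields

FOAF = "http://xmlns.com/foaf/0.1/"


def rdf(predicate):
    """Attach a predicate IRI to a dataclass field (stand-in for an annotation)."""
    return field(metadata={"rdf": predicate})


@dataclass
class Person:
    iri: str                        # subject IRI identifying this object
    name: str = rdf(FOAF + "name")
    mbox: str = rdf(FOAF + "mbox")


def to_triples(obj):
    """Emit one (subject, predicate, object) triple per annotated field.

    Objects are kept as plain strings here; a real mapper would
    distinguish IRIs from literals.
    """
    return [
        (obj.iri, f.metadata["rdf"], getattr(obj, f.name))
        for f in fields(obj)
        if "rdf" in f.metadata
    ]


def from_triples(cls, iri, triples):
    """Rebuild an object by looking up each annotated field's predicate."""
    values = {
        f.name: o
        for f in fields(cls)
        if "rdf" in f.metadata
        for s, p, o in triples
        if s == iri and p == f.metadata["rdf"]
    }
    return cls(iri=iri, **values)


alice = Person(
    iri="http://example.org/people#alice",
    name="Alice",
    mbox="mailto:alice@example.org",
)
triples = to_triples(alice)
restored = from_triples(Person, alice.iri, triples)
```

The object-to-triples direction is trivial; the interesting design questions
(identity, multi-valued properties, missing values) only show up once data
comes from graphs the application did not write itself.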

There is a Category Theoretical mapping between OO programming and Logic that I discovered
last year, which may well explain the conceptual difficulty that people have when
switching between the OO programming and Logic worlds, or even between OO and Functional
programming (in the latter case OO is co-algebraic, and Functional programming tends
to be algebraic).

> 
> H. Lack of Tools
> ================
> 
> Beginners drown in the options. Booth's suggestion of a default stack (even
> better if we could get it in http://repl.it) is very much needed. Give me a
> canonical (even dumbed down) version of tools that let me work through the
> canonical examples and then I'll write Python or Ruby or use GUI abstractions
> to get out of the, per Booth, assembly language verbosity of the RDF* stack.
> 
> Many e.g. UNIX tutorials use nano (these days, I used pico back in the 90's).
> This is sensible. Trust that the learner will soon tire of the tool (or not)
> and decide to upgrade their tooling (unto `vim`, say). But by all means, make
> them effective!
> 
> Why not use turtle or N3 or (better yet!) JSON (because people know
> it) consistently? Whichever is simpler and more neatly fits in code samples.
> Because of the hesitancy to voice a strong opinion or a good starting point,
> beginners don't know where to start and drown in the undifferentiated murk.
> 
> Close tabs; move on.
> 
> I. Obvious Moribundity of Tools
> ===============================
> 
> I first started learning about RDF* technology in Austin, TX at Cyc under the
> organizational passion of one Juan Sequeda in 2008-9. Can you imagine how
> staggered I was to find that the tooling ecosystem has made no appreciable
> progress in a literal decade? Name any other software that can see so little
> growth and still be called "vibrant." The majority of tools I downloaded
> required JVM and / or failed to start when installed locally. Web options were
> poor as well.
> 
> I rather enjoyed my trial of Grafo[8] as it's the first twitch of life I've
> seen in this space since before the Obama administration.

Guilty. I have been helping develop the banana-rdf libraries, 
  https://github.com/banana-rdf/banana-rdf
and it has been dormant for a few years now. I had to try to raise
funds as building a hyper-app is a project that needs at least a small
group of people to work together on the same language, libraries, UI
documentation etc... We tried an EU project, and somehow I am now
doing a PhD. Well, it's given me time to get good at the theory, and learn
to read mathematical articles quickly. 

But funding is an issue. As you can imagine, building apps to sell personal
information to satisfy the marketing industry is a well known business model,
and sadly business people tend to think that all innovation needs to come
from engineers alone. But that is the crowd effect.


> 
> J. Faster, More-Than "Just Barely Good Enough" Competitors
> ==========================================================
> 
> By way of comparison, I _just now_ used Airtable[9] to build my book cataloging
> proposal[1] in 2 intuitive, friendly hours and I can readily see how to extend
> it to serve my problem domain.
> 
> I grant that I'm losing the advanced query structure of SPARQL (which confuses
> me to no end and promises hours of delightful spec reading; no loss) and the
> hopes for inference, but at roughly the same time it takes to grok one of the
> 1-5 standards one has to read to use RDF*, I have something that I can provide
> as a read-only share to anyone reading this post:
> 
> https://airtable.com/shrJILw0CTILV0My2
> 
> (*and* AirTable features like collaboration, note history sans RCS, read-only
> sharing, etc.)
> 
> Airtable has existed substantially less time than RDF* and has solved a
> majority of the tool-chain, reference implementation, bootstrapping hurdles.
> React has done the same. Why has RDF*'s ecosystem so fundamentally failed to
> meet the quality, ease, and friendliness of these latecomer technologies?

Those tools have a lot of cash, often because they build platforms that tie
developers in to them. Convincing the developer is a way to get to the manager
who will end up paying for the locked-in service his devs produced. So beware
of marketing.

> 
> Conclusion
> ==========
> 
> I'm sure I certainly stepped on some toes here. I'm sorry if I hurt YOUR
> feelings. No one likes to have tech they wrote or tech that they labored to get
> up and over the learning curve on whipped like this.
> 
> I also know that I'm dismissible with:
> 
> * "Just RTFM better"
> * "If it was meant to be easy we wouldn't be getting PhDs in it"
> * "It's a specification, precision and authority outrank ease of use."
> * "Your dumb book logging idea is too simple a domain for technology this
>   powerful, use an Excel sheet, peasant."

:-D

> But I hope this can be a clarion call: commercial entities are doing similar
> work with beautiful interfaces that are intuitive and running laps around the
> RDF* universe. If the bar for RDF* remains as high as it is, the future of the
> web will be _theirs_ to decide; Facebook squashed foaf, Facebook / Google squashed
> OpenID, something like if not AirTable will squash RDF* at this rate.
> 
> Kathy Sierra said one of the most profound things I ever heard at SXSW in the
> early aughts (about the time I was dabbling with SW): "When tools are great,
> users say 'This tool is awesome'; when tools or docs are awful, users say 'I
> suck.'" After 10 years of feeling like "I suck" in RDF* land, I'm starting to
> wonder why I'm still trying.

You never talked of hyper-apps or the read-write web in your whole post.
That is the piece that you are missing. Also it won't be easy to write good,
beautiful RDF apps, not because RDF is difficult, but because writing super
distributed apps like a browser is difficult. Luckily the browser is a platform
that one can build on top of with JS.

> Footnotes
> =========
> 
> *: Booth has overloaded "RDF" to mean an ecosystem. I'll be using "RDF"
> similarly.
> 
> References
> ==========
> 
>     [0]: https://stevengharms.com/research/semweb-topic/problem_statement/
>     [1]: https://stevengharms.com/research/semweb/
>     [2]: https://lists.w3.org/Archives/Public/semantic-web/2018Nov/0036.html
>     [3]: https://reactjs.org/tutorial/tutorial.html
>     [4]: https://www.w3schools.com/xml/xml_rdf.asp
>     [5]: http://www.linkeddatatools.com/introducing-rdf-part-2
>     [6]: http://www.linkeddatatools.com/introducing-rdf
>     [7]: https://jena.apache.org/tutorials/rdf_api.html
>     [8]: https://gra.fo/
>     [9]: https://airtable.com
>     [10]: https://airtable.com/shrJILw0CTILV0My2
> 
> -- 
> Steven G. Harms
> PGP: E6052DAF
Received on Tuesday, 27 November 2018 16:42:41 UTC