Re: Blank nodes must DIE! [ was Re: Blank nodes semantics - existential variables?] from Chris Mungall on 2020-07-03 (semantic-web@w3.org from July 2020)

From: Chris Mungall <cjmungall@lbl.gov>
Date: Fri, 3 Jul 2020 12:35:41 -0700
To: David Booth <david@dbooth.org>
Cc: Patrick J Hayes <phayes@ihmc.us>, semantic-web <semantic-web@w3.org>
Message-ID: <CAN9AifsFeHQMq+ycJbRmBFcUPUOjC_oi+zkhT4c92Tp5Qhe_wQ@mail.gmail.com>
On Tue, Jun 30, 2020 at 3:33 PM David Booth <david@dbooth.org> wrote:

> On 6/30/20 3:12 PM, Patrick J Hayes wrote:
>  >> On Jun 30, 2020, at 9:40 AM, David Booth wrote:
>  >> I REALLY wish that some PhD students would take on this
>  >> challenge: to design a higher-level successor to RDF,
>  >> with a top-line goal of making it easy enough for AVERAGE
>  >> developers (middle 33% of skill), who are new to it, to be
>  >> consistently successful.
>  >
>  > Might that be (a subset of?) OWL2 using the Manchester syntax?
>
> I doubt it, even though the Manchester syntax does make OWL much more
> understandable than OWL-in-Turtle.  Two reasons:
>
>   - I think OWL itself is too hard for average developers (mid 33%).
> Although the various OWL constructs in isolation -- expressed in
> Manchester syntax, at least -- are understandable enough, average
> developers (the ones I've seen) don't exhibit the precise careful
> reasoning of a logician.  And they don't approach applications as a
> logician would, by starting with a few iron-clad axioms and rules that
> they've thought long and hard about, adding data, and then turning a big
> reasoner crank to get the desired results.  They approach applications
> more operationally, through a series of small steps that they can
> successively implement and test, to eventually produce the desired
> result at the end.
>
>   - The majority of RDF (or graph database) applications that I see are
> much more like big data integration problems than semantic inference
> problems, and they typically do not need OWL.
>
> There certainly are some projects that make important beneficial use of
> OWL -- based on the OBO Foundry ontologies, for example -- but from what
> I've seen, they're not generally done by *average* developers.  There's
> usually a PhD or two involved.
>

I can speak to the OBO (Open Bio Ontologies) experience. Yes, we definitely
make heavy use of OWL, but I'm not sure how much it speaks for or against
your point.

OWL is crucial for *construction* of the kinds of large ontologies
necessary in the biosciences, underpinning dozens of multi-million dollar
research projects and analytic activities of massive numbers of
researchers. Yes, a lot of this is done by PhDs... in biosciences, not CS.
Most of the ontology construction is done by domain scientists, with a very
tiny pool of people developing the tooling that supports them (Protege,
ROBOT, OWL Reasoners).

However, most downstream developers, data scientists, and domain scientists
interact with simpler graph-oriented representations (typically not RDF:
the layering of OWL onto RDF is dreadful). Which is fine, as this is an
appropriate level of abstraction for the task at hand.

This thread seems to be about making things easier for developers, which is
great. But it's challenging to come up with ways to make RDF universally
easier. What makes things simpler for some use cases will increase
complexity for other use cases. Plain RDF with no blank nodes is nice and
simple for simple problems but when you need to do something more complex,
you'll push that complexity elsewhere.

RDF can be hard but I don't think it's a matter of PhDs or not. If you give
people the right level of abstraction for what they need to do then this
can make up for rough edges in documentation etc. If you force a level of
abstraction on people that doesn't match their use case, it doesn't matter
how many PhDs they have or how much documentation exists.

From my own narrow perspective, the single thing that would make RDF more
successful would be universal adoption of labeled property graphs, RDFStar,
SPARQLStar, a standardized CSV/TSV format for semantic LPGs, and an
alternative OWL layering (see
https://douroucouli.wordpress.com/2019/07/11/proposed-strategy-for-semantics-in-rdf-and-property-graphs/
 and https://github.com/cmungall/owlstar). This level of abstraction would
hide/eliminate most of the blank nodes I see, give people the level of
abstraction they really want for modeling, and match up with the tools
and databases people use outside our semantic web bubble.
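
As a rough illustration of the blank-node point, here is a minimal Python
sketch (not taken from either link above) contrasting two encodings of one
axiom, "finger SubClassOf part_of some hand". The first function follows the
standard OWL-to-RDF mapping, which needs a blank node for the restriction.
The second shows the kind of single annotated edge a property-graph or
RDFStar-style layering could use instead. The `ex:` identifiers, the helper
names, and the `interpretation` / `all-some` property are placeholders
invented for this example, not the actual owlstar vocabulary.

```python
def restriction_as_rdf(sub, prop, filler, bnode="_:b0"):
    """Standard OWL-to-RDF triples for `sub SubClassOf (prop some filler)`."""
    return [
        (sub, "rdfs:subClassOf", bnode),
        (bnode, "rdf:type", "owl:Restriction"),
        (bnode, "owl:onProperty", prop),
        (bnode, "owl:someValuesFrom", filler),
    ]


def restriction_as_edge(sub, prop, filler):
    """The same axiom as one labeled edge carrying an edge property.

    The property name and value are hypothetical placeholders.
    """
    return {
        "subject": sub,
        "predicate": prop,
        "object": filler,
        "properties": {"interpretation": "all-some"},
    }


def blank_nodes(triples):
    """Collect every term that is written as a blank node label."""
    return {term for triple in triples for term in triple if term.startswith("_:")}


triples = restriction_as_rdf("ex:finger", "ex:part_of", "ex:hand")
edge = restriction_as_edge("ex:finger", "ex:part_of", "ex:hand")

print(len(triples), "triples, blank nodes:", blank_nodes(triples))
print("edge:", edge)
```

A downstream consumer of the edge form can follow `ex:finger` to `ex:hand`
directly, whereas the triple form requires joining through the blank node.
The cost is that the logical meaning now lives in an edge property that
tools must agree on.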

But I appreciate this would cause complexity elsewhere, e.g. implications
for intuitive json-ld. No magic bullet.


>
> Anyway, that's what I've seen.  Others might have different views.
>
> David Booth
>
>
Received on Friday, 3 July 2020 19:36:09 UTC