Re: Semantic Web Interest Group now closed from Dan Brickley on 2018-10-18 (semantic-web@w3.org from October 2018)

From: Dan Brickley <danbri@danbri.org>
Date: Thu, 18 Oct 2018 14:09:48 -0700
To: Dave Raggett <dsr@w3.org>
Cc: martin@weborganics.co.uk, Story Henry <henry.story@bblfish.net>, Nicolas Chauvat <nicolas.chauvat@logilab.fr>, frans.knibbe@geodan.nl, Semantic Web <semantic-web@w3.org>
Message-ID: <CAFfrAFr1rUtWVGCoNtRRcbMYD_zUu3Kx27tT11BoixUMp-LkQw@mail.gmail.com>
So we've had sentient web, hyper data, data web, and a bunch of other
suggestions, on top of our historical attempts at calling "it" the Semantic
Web, Linked Data, LOD, PICS, PICS-NG etc., plus recently "knowledge graph"
gaining rapid traction.

May I gently suggest that the name isn't the core problem here? Except
perhaps that we keep trying to respin things via renaming.

There are serious frustrations that come with trying to use RDF (and
RDFS/OWL/SPARQL, JSON-LD, RDFa, Turtle, N-Triples et al.), and lack of
evocative names is rarely top of the list. Part of our cultural problem
here has often been a kind of defensiveness that comes from our approaches
often being eclipsed by more mainstream technologies. And with that
defensiveness sometimes a sense of "if only we could get the messaging /
pitch / tutorial right, the unbelievers would come to see the beauty and
simplicity of our approach".

For a long time, RDF's annoyingness was somewhat conflated with its
syntax. The initial RDF/XML syntax was put together in discussions which
focussed more on the underlying graph data model.

We called it a "striped" syntax because XML elements alternately stood for
nodes vs edges of the underlying graph (https://www.w3.org/2001/10/stripes/).
TimBL's forgotten Notation 2 was an attempt at an unstriped, edge-centric
syntax. We've had near countless efforts. GRDDL was a well motivated
attempt to make a system for mapping arbitrary XML into our graph; it seems
to have completely failed. The much more successful JSON-LD is in some ways
a similarly motivated attempt to do something rather similar with JSON, via
its expressive @context mechanism. Recently I've come to suspect that there
is something in this direction which mixes in schema/validation
considerations, so that we can map more gracefully to (e.g. binary) JSON,
relational and other data models such as Protocol Buffers. So ShEx and
SHACL (or vice-versa) are increasingly important, as they bridge the
wishy-washy "anyone can say anything about anything" representational model
of RDF with the perfectly human desire to have things specified a bit more
tightly at the application level.
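
To make the @context idea a little more concrete, here is a deliberately
tiny sketch in Python of how a context can map plain JSON keys onto
property IRIs and so yield triples. It is not a JSON-LD processor: the
function name to_triples and the example.org identifiers are made up for
illustration, every value is treated as a plain literal, and real JSON-LD
handles type coercion, nesting and much more.

```python
import json

# Toy illustration only; this is NOT the JSON-LD expansion algorithm.
doc = json.loads("""
{
  "@context": {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage"
  },
  "@id": "http://example.org/people/alice",
  "name": "Alice",
  "homepage": "http://example.org/alice/"
}
""")


def to_triples(node):
    """Map one flat JSON object to (subject, predicate, object) tuples."""
    context = node.get("@context", {})
    subject = node["@id"]
    triples = []
    for key, value in node.items():
        if key.startswith("@"):
            continue  # keywords such as @id and @context are not properties
        if key not in context:
            continue  # keys without a mapping are dropped in this sketch
        triples.append((subject, context[key], value))
    return triples


triples = to_triples(doc)
for triple in triples:
    print(triple)
```

The JSON stays shaped the way an ordinary JSON developer would expect; the
graph reading is layered on top by the context rather than imposed on the
syntax.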

I love the way the RDF Validation book puts it, in terms of "defensive
programming". From http://book.validatingrdf.com -

"Veteran users of RDF and SPARQL have confronted the problem of composing
or consuming data with some expectations about the structure of that data.
They may have described that structure in a schema or ontology, or in some
human-readable documentation, or maybe expected users to learn the
structure by example. Ultimately, users of that application need to
understand the graph structure that the application expects."

"While it can be trivial to synchronize data production and consumption
within a single application, consuming foreign data frequently involves a
lot of defensive programming, usually in the form of SPARQL queries that
search out data in different structures. Given lots of potential
representations of that data, it is difficult to be confident that we have
addressed all of the intended ways our application may encounter its
information."

This characterization I think is much closer to the truth than our
historical tendency to blame ugly or unintuitive syntaxes. RDF graphs are
annoying to build things with because you never know what's in them,
generally speaking. Edd Wilder-James (aka Dumbill) once likened coding with
RDF to coding without any data structures beyond a hashtable. There's
truth in that too.
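
A rough sketch of what that defensive programming tends to look like,
using plain Python tuples as triples rather than any real RDF library. The
helper find_name, the fallback order and the example.org data are all
assumptions for illustration; the point is only that the consumer has to
guess which of several plausible properties the publisher used.

```python
FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
SCHEMA_NAME = "http://schema.org/name"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Properties we guess a publisher might have used, in order of preference.
NAME_PROPERTIES = [FOAF_NAME, SCHEMA_NAME, RDFS_LABEL]


def find_name(triples, subject):
    """Return the first name-like value found for subject, or None."""
    for prop in NAME_PROPERTIES:
        for s, p, o in triples:
            if s == subject and p == prop:
                return o
    return None  # the data may still hold a name in a shape we did not guess


graph = [
    ("http://example.org/a", FOAF_NAME, "Alice"),
    ("http://example.org/b", RDFS_LABEL, "Bob"),
    ("http://example.org/c", "http://example.org/vocab#fullName", "Carol"),
]

for person in ("http://example.org/a", "http://example.org/b", "http://example.org/c"):
    print(person, find_name(graph, person))
```

The third lookup comes back empty even though the graph plainly contains a
name for that resource, which is exactly the "you never know what's in
them" problem.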

If there is to be value in having continued SW/RDF groups around here, it's
much more likely to be around practical collaboration to make RDF less
annoying to work with, rather than high level spinning of it in terms of
different metaphors and slogans and exhortations for how people should be
doing it to be doing it "right". We have collectively slipped too easily
into the latter, and maybe we're doing it again this week. There is enough
around RDF to be tempting, evocative, to draw people in, to get them
interested. But people repeatedly hit a wall, and often wander away,
frustrated.

Another reason to nudge our focus towards the likes of SHACL and ShEx is
that they are technologies that potentially can be used to characterize
specific application information needs where applications are using
some-but-not-all RDF data. As a community (especially the scientific /
scholarly side), Semantic Web has tended towards prizing generality above
all else. But there is merit too in knowing about applications whose scope
is much more pedestrian. It is more than fine for an application to consume
just a few patterns, from the infinite gallery of possible, conceivable,
RDF graph patterns. And yet as a community we have tended implicitly to
look down upon these as missing out on the ultra-general-purpose nature of
our technology.

If we are not careful, RDF is something of a spork; a highly versatile tool
potentially useful for many tasks, and yet neglected in favour of the less
general (the spoons and forks whose capabilities it gracefully generalizes
and unifies...). All the renamings and rebrandings in the world won't save
us from the tragi-niche fate of the spork, but some collaboration around
the user and developer experience, and explorations of how syntactic issues
(JSON, Protobufs, XML) relate to RDF validation mechanisms could imho make
a big difference to the appeal of our technologies...
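
As a hedged illustration of an application stating the few patterns it
actually needs, here is a minimal Python sketch. The shape format (a dict
of property IRI to minimum and maximum counts) and the names PERSON_SHAPE
and check are invented for this example; it is neither ShEx nor SHACL
syntax, only the general flavour of declaring expectations up front and
ignoring everything else in the graph.

```python
from collections import Counter

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"
FOAF_MBOX = "http://xmlns.com/foaf/0.1/mbox"

# Hypothetical shape: property IRI -> (min count, max count or None for unbounded).
PERSON_SHAPE = {
    FOAF_NAME: (1, 1),
    FOAF_MBOX: (0, None),
}


def check(triples, subject, shape):
    """Return a list of problems; an empty list means the node fits the shape."""
    counts = Counter(p for s, p, _ in triples if s == subject)
    problems = []
    for prop, (lowest, highest) in shape.items():
        found = counts.get(prop, 0)
        if found < lowest:
            problems.append(f"{prop}: expected at least {lowest}, found {found}")
        if highest is not None and found > highest:
            problems.append(f"{prop}: expected at most {highest}, found {found}")
    # Properties not mentioned in the shape are ignored on purpose: the
    # application only cares about some-but-not-all of the data.
    return problems


graph = [
    ("http://example.org/a", FOAF_NAME, "Alice"),
    ("http://example.org/a", "http://example.org/vocab#shoeSize", "38"),
    ("http://example.org/b", FOAF_MBOX, "mailto:bob@example.org"),
]

print(check(graph, "http://example.org/a", PERSON_SHAPE))
print(check(graph, "http://example.org/b", PERSON_SHAPE))
```

The first node passes despite carrying a property the application has never
heard of; the second is rejected before any application logic has to cope
with a missing name.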

Dan
Received on Thursday, 18 October 2018 21:10:54 UTC