Re: What Happened to the Semantic Web? from Sean B. Palmer on 2017-10-12 (public-lod@w3.org from October 2017)

From: Sean B. Palmer <sean@miscoranda.com>
Date: Thu, 12 Oct 2017 20:27:23 +0100
To: Martynas Jusevičius <martynas@atomgraph.com>
Cc: SW-forum Web <semantic-web@w3.org>, public-lod@w3.org, www-archive <www-archive@w3.org>
Message-ID: <CAH3-oEfqmRg0W+2MxcKSw8hojVHbgPS3+_DhmYxxivCsPKLmPg@mail.gmail.com>

Hello Martynas!

Your concise reply ('SPARQL is "semantic junk"? OK then...') did not
reach the mailing list archives for some reason, so I decided to
excerpt what you said into a new thread to increase the chance that
any discussion on the merits of SPARQL is published.

In my email I referred to discussions held on the web over the past
five years assessing the current status of the Semantic Web. One of
the things that I found is that many of the core Semantic Web formats
were regarded negatively, and in some cases the demise of the Semantic
Web itself (in the eyes of those writing) was linked to the quality of
those formats. SPARQL was amongst these formats.

Though I classed SPARQL along with RDF/XML as "Semantic junk", I did
this from the point of view of summarising what I had seen people
discussing. This was a very brief, informal survey of the recent
literature, most of it not even peer reviewed. As I went on to say in
my email, I actually disagree with the point of view that the core
formats including SPARQL and RDF/XML are to blame for any perceived
demise of the Semantic Web.

Despite this, I was never all that happy with SPARQL. There are
several reasons for this, but I'll note two major ones here. The first
is that when I was implementing SPARQL as part of a whole-stack
implementation of the Semantic Web as it was a decade ago, I found
SPARQL to be the most intractable part of the stack to work with. The
next most intractable part had been the OWL species checker, but even
that was some way behind. I didn't take specific notes on what annoyed
me about SPARQL in the process of implementing it, but recall that the
experience was not pleasant.

The second major reason is that SPARQL had no homoiconicity, and was
instead stratified into opaque parts. Many may now have forgotten that
at the time SPARQL was being created, there were competing proposals
that could have become the final "SPARQL" (and the name would
therefore likely have been different too). One of those competing
proposals was called N3QL, but it was overlooked because it came from
a rather unknown developer who perhaps didn't seem to understand the
Semantic Web:

https://www.w3.org/DesignIssues/N3QL.html

The author's name is Tim Something-or-other, but that's not important.
As you can see from the draft, the idea of N3QL is that its syntax was
not only consistent throughout, but it was also based on an existing
RDF system that was already widely in use at the time. In SPARQL, the
query syntax is not based on RDF: instead it bears more resemblance to
SQL, presumably to ease transition to graph queries from people who
are used to relational databases. But this means that you can
manipulate neither SPARQL queries nor results with RDF tools, at least
not without an extra translation stage, for example mapping SPARQL
result bindings into RDF.
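
To make that extra translation stage concrete, here is a minimal
sketch, not a definitive implementation. It assumes the results arrive
in the SPARQL 1.1 JSON results format and emits N-Triples lines. The
result-set vocabulary under http://example.org/result-set# and the
function names are hypothetical, invented purely for illustration.

```python
import json

# Hypothetical vocabulary, invented for this illustration only.
RS = "http://example.org/result-set#"


def term_to_nt(term):
    """Serialise one SPARQL JSON result term as an N-Triples term."""
    kind, value = term["type"], term["value"]
    if kind == "uri":
        return "<%s>" % value
    if kind == "bnode":
        return "_:r%s" % value
    # Literal: escape backslashes, quotes and newlines, then add any
    # language tag or datatype.
    escaped = (value.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    if "xml:lang" in term:
        return '"%s"@%s' % (escaped, term["xml:lang"])
    if "datatype" in term:
        return '"%s"^^<%s>' % (escaped, term["datatype"])
    return '"%s"' % escaped


def bindings_to_ntriples(results_json):
    """Turn a SPARQL JSON result document into a list of N-Triples lines."""
    doc = json.loads(results_json)
    lines = []
    for i, solution in enumerate(doc["results"]["bindings"]):
        sol = "_:solution%d" % i
        lines.append("_:resultset <%ssolution> %s ." % (RS, sol))
        # Sort by variable name so the output order is deterministic.
        for j, (var, term) in enumerate(sorted(solution.items())):
            node = "_:binding%d_%d" % (i, j)
            lines.append("%s <%sbinding> %s ." % (sol, RS, node))
            lines.append('%s <%svariable> "%s" .' % (node, RS, var))
            lines.append("%s <%svalue> %s ." % (node, RS, term_to_nt(term)))
    return lines
```

Only once the bindings have gone through a step like this can ordinary
RDF tools load, merge, or query them.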

Even in SPARQL 1.1, which came out in 2013, the variable bindings that
a SELECT query returns can only be expressed in XML, JSON, CSV, and
TSV. One language is suspiciously missing from this list: RDF!

https://www.w3.org/TR/sparql11-overview/#sparql11-results

Correct me if I'm wrong, and have overlooked something obvious (which
is quite possible since I have not been working on the Semantic Web
stack now for a decade), but this seems at the very least severely
ironic.

As another example, consider FILTER clauses in SPARQL queries. There
are quite a few functions in the SPARQL 1.1 function library for
filtering:

https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#SparqlOps

None of these are first class functions in the Semantic Web as they
have neither a URI to identify them, nor any place in the actual graph
structure of the query, because most of the query does not have a
graph structure, unlike in N3QL. And this is so even though FILTER is
embedded in a WHERE clause, which is otherwise the only thing in a
SPARQL query that is actually expressed as a graph! In N3QL, on the
other hand, filters are predicates as an implied metasyntactic
construct.
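
As a rough illustration of filters as predicates, the following is a
toy sketch and not N3QL itself. It assumes a query is nothing more
than a list of triples, with variables written as strings starting
with "?". The math: namespace is the one used for N3 built-ins; the
matcher, the example.org data and the Python stand-ins for the
built-ins are all hypothetical.

```python
MATH = "http://www.w3.org/2000/10/swap/math#"
EX = "http://example.org/"  # hypothetical data namespace

# Built-in predicates keyed by URI. The Python functions are stand-ins.
BUILTINS = {
    MATH + "greaterThan": lambda a, b: a > b,
    MATH + "lessThan": lambda a, b: a < b,
}


def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def resolve(term, env):
    """Replace a bound variable by its value; leave other terms alone."""
    return env.get(term, term) if is_var(term) else term


def match(pattern, data, env=None):
    """Yield variable bindings that satisfy every triple in the pattern."""
    env = env or {}
    if not pattern:
        yield env
        return
    (s, p, o), rest = pattern[0], pattern[1:]
    s, o = resolve(s, env), resolve(o, env)
    if p in BUILTINS:
        # The filter is just another triple: its predicate URI picks the test.
        if not is_var(s) and not is_var(o) and BUILTINS[p](s, o):
            yield from match(rest, data, env)
        return
    for ds, dp, do in data:
        if dp != p:
            continue
        new, ok = dict(env), True
        for q, d in ((s, ds), (o, do)):
            q = resolve(q, new)
            if is_var(q):
                new[q] = d
            elif q != d:
                ok = False
        if ok:
            yield from match(rest, data, new)


data = [(EX + "alice", EX + "age", 34), (EX + "bob", EX + "age", 17)]
pattern = [
    ("?who", EX + "age", "?age"),
    ("?age", MATH + "greaterThan", 18),  # the "filter", as a plain triple
]
adults = list(match(pattern, data))
```

Here adults holds a single binding, for alice. Because the filter is
an ordinary triple with a URI for its predicate, the whole pattern is
plain data that other code, or other RDF, could generate or rewrite.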

You may say: so what? Does it matter if the query isn't actually
written in RDF? After all, in the world of XML schema we found that
RELAX NG Compact was the best way of writing schemata for most common
applications, and that syntax was not XML based. Though it's ironic
that a non-XML syntax turned out to be the best way to express the
structure of XML, that's just the way things went. So might this not
be the same case for SPARQL?

In the case of RELAX NG Compact, the fact that you had to use a custom
parser for that format compared to an XML format was more than made up
for by the readability of the result. RELAX NG Compact is much easier
to write and scan for humans than the non-Compact analogue, and far
easier than the W3C's XML Schema. For SPARQL, on the other hand, the
fact that the queries and the results are not expressed in RDF only
takes away value, and does not add. In the case of predicate filters,
for example, you can imagine programmatically constructing such
filters from other RDF input.

The idea overall was to make RDF pervasive so that everything we do
has a consistent underlying substrate on which to work. The BNF for
N3, on which N3QL was based, was itself expressed in RDF for example.
I wrote what is still, as far as I know, the most reliable parser for
N3 using that RDF BNF. By being homoiconic, N3QL was part of the
Semantic Web. In the later language of five star Open Data, SPARQL is
only three star (at the same level as CSV). N3QL is five star.

That's quite a lot on why I don't like SPARQL very much. I could go on
further, and if I were to implement SPARQL again with contemporary
tools I'm sure I would have many more things to point out! But I want
to ask about the positive side of SPARQL. Obviously you objected to
SPARQL being characterised as "Semantic junk", no matter who makes
that characterisation, and so I figured that you were likely working
with SPARQL in some capacity.

(Aside: I was actually surprised that I received any response to the
thread, and I am grateful for the other responses and may reply to
some of those too. I was documenting a recent paucity of interest in
the Semantic Web, so I was not exactly expecting there to be much
interest in my findings either--especially given that they were very
hastily assembled as a matter of curiosity.)

I looked up your work on SPARQL, and in the process found that you had
retweeted something that Adrian Gschwend said back in September of
this year which struck me as a powerful statement:

"I feel sorry for everyone that does not know/get/understand the power
of SPARQL, RDF & the Linked Data stack. They will waste lots of time"

https://twitter.com/linkedktk/status/912293004056121350

As I hope is clear from both my previous email and the present one, I
am somewhat familiar with the power of RDF and Linked Data. On SPARQL
specifically, though, I do feel as though it never won me over.
Perhaps I understand its power inasmuch as I understand it as a
stilted form of what N3QL could have become had it gone through the
same standardisation process by a talented set of people, but SPARQL
as an exciting technology in and of itself is still an elusive concept
to me.

Your defence of SPARQL so far has been rather concise, but I am
genuinely interested in how SPARQL has fared these past ten years
since I stopped being so familiar with the Semantic Web stack, and I
am also interested in the points of view of anybody who works with and
feels positively about SPARQL. Why is Adrian so sorry for those who do
not understand its power? What are such people missing, and how can
this power be revealed? In short, where is SPARQL's sparkle?

-- 
Sean B. Palmer, http://inamidst.com/sbp/
Received on Thursday, 12 October 2017 19:27:49 UTC