- From: Sean B. Palmer <sean@miscoranda.com>
- Date: Thu, 12 Oct 2017 20:27:23 +0100
- To: Martynas Jusevičius <martynas@atomgraph.com>
- Cc: SW-forum Web <semantic-web@w3.org>, public-lod@w3.org, www-archive <www-archive@w3.org>
Hello Martynas!

Your concise reply ('SPARQL is "semantic junk"? OK then...') did not reach the mailing list archives for some reason, so I decided to excerpt what you said into a new thread to increase the chance that any discussion on the merits of SPARQL is published.

In my email I referred to discussions held on the web over the past five years assessing the current status of the Semantic Web. One of the things that I found is that many of the core Semantic Web formats were regarded negatively, and in some cases the demise of the Semantic Web itself (in the eyes of those writing) was linked to the quality of those formats. SPARQL was amongst these formats.

Though I classed SPARQL along with RDF/XML as "Semantic junk", I did this from the point of view of summarising what I had seen people discussing. This was a very brief, informal survey of the recent literature, most of it not even peer reviewed. As I went on to say in my email, I actually disagree with the point of view that the core formats including SPARQL and RDF/XML are to blame for any perceived demise of the Semantic Web.

Despite this, I was never all that happy with SPARQL. There are several reasons for this, but I'll note two major ones here.

The first is that when I was implementing SPARQL as part of a whole-stack implementation of the Semantic Web as it was a decade ago, I found SPARQL to be the most intractable part of the stack to work with. The next most intractable part had been the OWL species checker, but even that was some way behind. I didn't take specific notes on what annoyed me about SPARQL in the process of implementing it, but recall that the experience was not pleasant.

The second major reason is that SPARQL had no homoiconicity, and was instead stratified into opaque parts. Many may now have forgotten that at the time SPARQL was being created, there were competing proposals that could have become the final "SPARQL" (and the name would therefore likely have been different too).
One of those competing proposals was called N3QL, but it was overlooked because it came from a rather unknown developer who perhaps didn't seem to understand the Semantic Web:

https://www.w3.org/DesignIssues/N3QL.html

The author's name is Tim Something-or-other, but that's not important.

As you can see from the draft, the idea of N3QL was that its syntax was not only consistent throughout, but also based on an existing RDF system that was already widely in use at the time. In SPARQL, the query syntax is not based on RDF: instead it bears more resemblance to SQL, presumably to ease the transition to graph queries for people who are used to relational databases. But this means that you can manipulate neither SPARQL queries nor results with RDF tools, at least not without an extra translation stage, for example mapping SPARQL result bindings into RDF.

Even in SPARQL 1.1, which came out in 2013, results can only be expressed in XML, JSON, CSV, and TSV. One language is suspiciously missing from this list: RDF!

https://www.w3.org/TR/sparql11-overview/#sparql11-results

Correct me if I'm wrong, and have overlooked something obvious (which is quite possible since I have not been working on the Semantic Web stack now for a decade), but this seems at the least very severely ironic.

As another example, consider FILTER clauses in SPARQL queries. There are quite a few functions in the SPARQL 1.1 function library for filtering:

https://www.w3.org/TR/2013/REC-sparql11-query-20130321/#SparqlOps

None of these are first class functions in the Semantic Web as they have neither a URI to identify them, nor any place in the actual graph structure of the query, because most of the query does not have a graph structure, unlike in N3QL. And this is so even though FILTER is embedded in a WHERE clause, which is otherwise the only thing in a SPARQL query that is actually expressed as a graph! In N3QL, on the other hand, filters are predicates as an implied metasyntactic construct.
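As a rough illustration of the "extra translation stage" mentioned above, the following Python sketch maps a SPARQL 1.1 JSON result set into N-Triples. It assumes the standard JSON results layout (`results.bindings`, with each term carrying `type` and `value`). The `http://example.org/result#` vocabulary, the blank node naming scheme, and the function names are all invented for this example and are not part of any specification.

```python
# Hypothetical vocabulary for describing result sets; an assumption for
# illustration only, not a standard namespace.
EX = "http://example.org/result#"


def term_to_nt(term):
    """Serialise one SPARQL JSON result term as an N-Triples term."""
    kind, value = term["type"], term["value"]
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":
        # Prefix result blank nodes so they cannot clash with our own labels.
        return f"_:r{value}"
    # Anything else is treated as a literal; escape the characters that
    # N-Triples requires to be escaped inside a quoted string.
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    if "xml:lang" in term:
        return f'"{escaped}"@{term["xml:lang"]}'
    if "datatype" in term:
        return f'"{escaped}"^^<{term["datatype"]}>'
    return f'"{escaped}"'


def bindings_to_ntriples(results):
    """Turn a parsed SPARQL JSON result document into N-Triples lines."""
    lines = []
    for i, solution in enumerate(results["results"]["bindings"]):
        sol = f"_:sol{i}"
        lines.append(f"_:results <{EX}solution> {sol} .")
        # Sort by variable name so the output order is deterministic.
        for j, (var, term) in enumerate(sorted(solution.items())):
            binding = f"_:sol{i}b{j}"
            lines.append(f"{sol} <{EX}binding> {binding} .")
            lines.append(f'{binding} <{EX}variable> "{var}" .')
            lines.append(f"{binding} <{EX}value> {term_to_nt(term)} .")
    return lines


if __name__ == "__main__":
    sample = {
        "head": {"vars": ["s", "name"]},
        "results": {
            "bindings": [
                {
                    "s": {"type": "uri", "value": "http://example.org/a"},
                    "name": {"type": "literal", "value": "Alice", "xml:lang": "en"},
                }
            ]
        },
    }
    print("\n".join(bindings_to_ntriples(sample)))
```

Once the bindings are triples, they can be loaded into any RDF store and queried or transformed like other graph data, which is the step SPARQL's own result formats leave to the user.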
You may say: so what? Does it matter if the query isn't actually written in RDF? After all, in the world of XML schema we found that RELAX NG Compact was the best way of writing schemata for most common applications, and that syntax was not XML based. Though it's ironic that a non-XML syntax turned out to be the best way to express the structure of XML, that's just the way things went. So might this not be the same case for SPARQL?

In the case of RELAX NG Compact, the fact that you had to use a custom parser for that format compared to an XML format was more than made up for by the readability of the result. RELAX NG Compact is much easier for humans to write and scan than the non-Compact analogue, and far easier than the W3C's XML Schema.

For SPARQL, on the other hand, the fact that the queries and the results are not expressed in RDF only takes away value, and does not add any. In the case of predicate filters, for example, you can imagine programmatically constructing such filters from other RDF input.

The idea overall was to make RDF pervasive so that everything we do has a consistent underlying substrate on which to work. The BNF for N3, on which N3QL was based, was itself expressed in RDF, for example. I wrote what is still, as far as I know, the most reliable parser for N3 using that RDF BNF. By being homoiconic, N3QL was part of the Semantic Web. In the later language of five star Open Data, SPARQL is only three star (at the same level as CSV). N3QL is five star.

That's quite a lot on why I don't like SPARQL very much. I could go on further, and if I were to implement SPARQL again with contemporary tools I'm sure I would have many more things to point out! But I want to ask about the positive side of SPARQL. Obviously you objected to SPARQL being characterised as "Semantic junk", no matter who makes that characterisation, and so I figured that you were likely working with SPARQL in some capacity.
(Aside: I was actually surprised that I received any response to the thread, and I am grateful for the other responses and may reply to some of those too. I was documenting a recent paucity of interest in the Semantic Web, so I was not exactly expecting there to be much interest in my findings either--especially given that they were very hastily assembled as a matter of curiosity.)

I looked up your work on SPARQL, and in the process found that you had retweeted something that Adrian Gschwend said back in September of this year which struck me as a powerful statement:

"I feel sorry for everyone that does not know/get/understand the power of SPARQL, RDF & the Linked Data stack. They will waste lots of time"

https://twitter.com/linkedktk/status/912293004056121350

As I hope is clear from both my previous email and the present one, I am somewhat familiar with the power of RDF and Linked Data. On SPARQL specifically, though, I do feel as though it never won me over. Perhaps I understand its power inasmuch as I understand it as a stilted form of what N3QL could have become had it gone through the same standardisation process by a talented set of people, but SPARQL as an exciting technology in and of itself is still an elusive concept to me.

Your defence of SPARQL so far has been rather concise, but I am genuinely interested in how SPARQL has fared these past ten years since I stopped being so familiar with the Semantic Web stack, and I am also interested in the points of view of anybody who works with and feels positively about SPARQL.

Why is Adrian so sorry for those who do not understand its power? What are such people missing, and how can this power be revealed? In short, where is SPARQL's sparkle?

--
Sean B. Palmer, http://inamidst.com/sbp/
Received on Thursday, 12 October 2017 19:27:48 UTC