- From: Orri Erling <erling@xs4all.nl>
- Date: Sun, 15 Mar 2009 23:04:00 +0100
- To: "'Lee Feigenbaum'" <lee@thefigtrees.net>, "'Ivan Mikhailov'" <imikhailov@openlinksw.com>
- Cc: "'Chimezie Ogbuji'" <ogbujic@ccf.org>, "'Seaborne, Andy'" <andy.seaborne@hp.com>, "'SPARQL Working Group'" <public-rdf-dawg@w3.org>
Hi,

To condense the matter, I would repeat our position that expressivity parity with SQL is a minimum starting point for a credible goal statement of the present work. We have implemented SQL and the matching SPARQL extensions, thus this is known to be possible. Of course, we have also done things that are not in SQL proper, such as transitivity and subclasses/subproperties. But the latter are nice-to-have, whereas SQL parity is a minimum.

In short, SQL parity means: subselects with modifiers like limit, expressions, and aggregates/group by, and that's about it. For ease of communication, we have used SQL-like syntaxes, but we are not very dogmatic about syntax. The working group may come up with something prettier, in which case we will adopt it.

SQL has certainly been implemented a great many times by plenty of people, and while this is a fair bit of work, it is eminently feasible. Implementability is certainly a concern, but if we are serious, that which is commonplace in the database world cannot be a blocker for us.

We hear from diverse quarters that semantics are needed for the database. Well, the addition of semantics cannot mean that the query language will be less than what people in the database mainstream have taken for granted since SQL-92. SPARQL does not live in a vacuum. Or at the very least, it ought not to live in a vacuum if it aspires to bring semantics to the database world.

On let vs. subselect, we note, as Andy said, that neither is recursive. Subselects are a means of giving names to expressions, as is let. But subselects allow expressing things like grouping and existence, which let does not. In this sense subselects are more primary, and let can be macroexpanded into a subselect. An exists subquery, as well as a scalar subquery, can be expressed as a subselect (a derived table or inlined view in SQL terms). We do not so much care about the syntax but consider the basic capability to be essential.
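To make the macroexpansion point concrete, here is a minimal sketch in Python that builds query text. The helper names (`let_to_subselect`, `grouped_count_subselect`), the example prefixes and the concrete `SELECT ... (expr AS ?var)` syntax are all assumptions for illustration; the message deliberately leaves syntax open.

```python
from typing import Sequence


def let_to_subselect(pattern: str, var: str, expr: str,
                     in_scope: Sequence[str]) -> str:
    """Hypothetical helper: rewrite `pattern LET (var := expr)` as a subselect.

    The subselect projects the variables already in scope plus the
    named expression, so the outer query sees the same bindings.
    """
    projection = " ".join(list(in_scope) + [f"({expr} AS {var})"])
    return f"{{ SELECT {projection} WHERE {{ {pattern} }} }}"


def grouped_count_subselect(pattern: str, group_var: str,
                            count_var: str) -> str:
    """Hypothetical helper: a grouping subselect, which a plain let
    cannot express because let only names a per-row expression."""
    return (f"{{ SELECT {group_var} (COUNT(*) AS {count_var}) "
            f"WHERE {{ {pattern} }} GROUP BY {group_var} }}")


let_query = let_to_subselect(
    "?item :price ?p . ?item :qty ?n", "?total", "?p * ?n", ["?item"])
group_query = grouped_count_subselect("?item :soldBy ?seller", "?seller", "?cnt")
```

The first helper shows the direction of the expansion (let into subselect); the second shows a subselect that has no let counterpart, which is the sense in which subselects are the more primary construct.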
Back when an incubator group about SPARQL benchmarking was being considered, in early 2008, there was some discussion about whether the message ought to be that SPARQL is a ready SQL replacement or that SPARQL is something that can take databasing to new levels of expressivity where SQL is not an option. What ended up happening was that the Berlin SPARQL benchmark was made by Chris Bizer and Andreas Schultz and turned out to be essentially a simplified SQL query workload, missing the nesting, grouping and aggregation that make SQL such a flexible language. Sure, this serves a purpose. But this is not what will convince the database world to adopt SPARQL.

If we are to go beyond SQL to a world of querying with semantics across the Internet and not over a single database, then credibility starts where at least all that can be said in SQL can be said with equal ease in SPARQL.

Of course, merely being even with SQL is not something that will in itself justify SPARQL. Thus the message is that RDF/OWL + SPARQL takes querying one level of abstraction up through the introduction of subclasses, subproperties, schema last, owl:sameAs, transitivity and such things as can be had from integrated inference. In integration scenarios, the benefit from reuse of ontologies etc. is demonstrable enough. But the demonstration avails little if the basics are lost.

As specifically concerns Lee's point of preferring the already implemented, all the things that we see as really key are implemented by ourselves and, to a great extent at least, by ARQ. We almost meet the exit criterion of two implementations even now, if we can just agree on a syntax for nesting queries; a lot of the rest is nearly interoperable already.

Orri
Received on Sunday, 15 March 2009 22:05:48 UTC