- From: Alexandre Bertails <alexandre@bertails.org>
- Date: Thu, 24 Jul 2014 14:30:42 -0400
- To: "public-ldp-wg@w3.org" <public-ldp-wg@w3.org>
All,

I have been thinking a lot about the SPARQL subset idea and I would like to share some thoughts. As you could have expected from the last call, I am not in favor of it, so I have taken the time to document my issues with the approach.

First, let me remind you of the scope of LD Patch. It is a PATCH format for partial updates of LDP-RS. So it's only about RDF graphs. It is not intended for updating quad stores, nor named graphs. Also, it is not meant to be a high-level language but rather an assembly one. For that reason, the editors challenged themselves not to add higher-level features.

Skolemization is not used. The assumption is that bnodes form tree structures. The idea is that most of those trees (and the bnodes in them) can be distinguished by filtering on sub-components of those trees. I recommend [1] for a recent and thorough analysis confirming those assumptions.

That is the very reason behind the LD Path (no 'c') algebra, which shares some similarities with XPath. Path steps are applied left-to-right, and recursively for path constraints. The semantics formally specifies the order in which those operations must be evaluated. So LDP application writers can rely on that semantics for runtime characteristics, for example by restricting the node sets as early as possible in the path, probably by starting from the leaves of the tree and then moving up in the tree until reaching the bnode.

So, SPARQL. Yes, you can consider a subset with similar expressive power. People seem to think that defining the concrete syntax would be enough, and that it would be as easy as, if not easier than, LD Patch. I disagree.

First, the two concrete syntaxes would share a lot of the production rules, basically all the ones borrowed from Turtle. The additional ones are no issue in either case.

Then, I have heard people saying that we wouldn't need to write down the operational semantics, because we could say it's the same as SPARQL Update, but for that subset of the syntax. I disagree.
Because as a developer and as a user, I would have to be sure I understand the SPARQL semantics well, either to implement LD Patch (if I don't want to depend on an existing SPARQL implementation) or to use it. So I'd argue that the semantics _has_ to be written. And I'd have to reject valid SPARQL Update queries which are not in the subset.

Another issue is that we will still need Basic Graph Patterns, the (S P O .)-s in the WHERE clause, which rely on intermediate ResultSet-s for their semantics. For example:

    Bind ?event <http://conferences.ted.com/TED2009/> /-schema:url[/schema:startDate="2009-02-04"]/schema:location[/schema:name="Long Beach, California"][/schema:geo[/schema:latitude][/schema:longitude]]

would be equivalent to something like this:

    WHERE {
      ?event schema:url <http://conferences.ted.com/TED2009/> .
      ?event schema:startDate "2009-02-04" .
      ?event schema:location ?loc .
      ?loc schema:name "Long Beach, California" .
      ?loc schema:geo ?geo .
      ?geo schema:latitude [] .
      ?geo schema:longitude [] .
    }

If we want the same performance characteristics (mainly, predictability), we would have to refine the SPARQL semantics so that the order of the clauses matters (i.e. no need to depend on a query optimiser). And we would need to do some static analysis on the query to make sure that ResultSet-s are not needed. In any case, it goes beyond the idea of using a subset of the syntax plus a pointer to the SPARQL Update semantics.

Another problem is the support for rdf:list. I have just finished writing down the semantics for UpdateList and, based on that experience, I know this is something I want to rely on as a user: it is so easy to get wrong that I want native support for it. And I don't think it is possible to do something equivalent in SPARQL Update. That is a huge drawback, as list manipulation (e.g. in JSON-LD or Turtle) is an everyday task.

So to summarize my issues with the approach:

1. semantics is not that easy to define
2. performance characteristics
3. no native support for rdf:list
4. need to explain to the user how it differs from existing SPARQL Update

SPARQL Update is good at doing what it was designed for, but there is little interest in being syntax compatible with it.

Regards,
Alexandre

[1] http://www.websemanticsjournal.org/index.php/ps/article/view/365
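
As an illustration of the left-to-right path evaluation discussed in the message, here is a minimal Python sketch. It is not the LD Patch semantics, only a toy model of it: the step encoding (`"fwd"`, `"rev"`, `"filter"`), the `evaluate` function, the blank node labels and the latitude/longitude literals are all assumptions made up for this example. Each step narrows or moves a node set, and constraints are evaluated recursively without building an intermediate result set of variable bindings. Note that the path as written in the message ends on the schema:location step, so the sketch returns the location node.

```python
# Toy model: a graph is a set of (subject, predicate, object) string triples.
# All node labels and coordinate values below are illustrative placeholders.
GRAPH = {
    ("_:event", "schema:url", "<http://conferences.ted.com/TED2009/>"),
    ("_:event", "schema:startDate", '"2009-02-04"'),
    ("_:event", "schema:location", "_:loc"),
    ("_:loc", "schema:name", '"Long Beach, California"'),
    ("_:loc", "schema:geo", "_:geo"),
    ("_:geo", "schema:latitude", '"33.7817"'),
    ("_:geo", "schema:longitude", '"-118.2054"'),
}


def evaluate(start_nodes, path, graph):
    """Apply the path steps strictly left to right to a set of nodes."""
    nodes = set(start_nodes)
    for step in path:
        kind = step[0]
        if kind == "fwd":
            # Follow the predicate from subject to object.
            pred = step[1]
            nodes = {o for (s, p, o) in graph if p == pred and s in nodes}
        elif kind == "rev":
            # Follow the predicate backwards, from object to subject.
            pred = step[1]
            nodes = {s for (s, p, o) in graph if p == pred and o in nodes}
        elif kind == "filter":
            # Keep a node only if the sub-path, started from that node,
            # reaches the given value (or reaches anything when value is None).
            _, subpath, value = step
            kept = set()
            for node in nodes:
                reached = evaluate({node}, subpath, graph)
                if (value in reached) if value is not None else bool(reached):
                    kept.add(node)
            nodes = kept
        else:
            raise ValueError(f"unknown step kind: {kind!r}")
    return nodes


# Hand-encoded version of the path from the message.
PATH = [
    ("rev", "schema:url"),
    ("filter", [("fwd", "schema:startDate")], '"2009-02-04"'),
    ("fwd", "schema:location"),
    ("filter", [("fwd", "schema:name")], '"Long Beach, California"'),
    ("filter", [("fwd", "schema:geo"),
                ("filter", [("fwd", "schema:latitude")], None),
                ("filter", [("fwd", "schema:longitude")], None)], None),
]

result = evaluate({"<http://conferences.ted.com/TED2009/>"}, PATH, GRAPH)
print(result)
```

Because the steps run in a fixed order, the cost of each step depends only on the size of the current node set, which is the predictability argument made above; a BGP-based formulation would leave that order to the query engine.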
Received on Thursday, 24 July 2014 18:31:09 UTC