- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Tue, 7 Sep 2010 07:34:42 +0100
- To: Ivan Mikhailov <imikhailov@openlinksw.com>
- Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Hi Ivan,

Thanks for these insightful comments and for sharing your experience.

You mention six features that would need to be added to this draft to make it work on par with RDF Views. I don't agree with all of them, but you are definitely right about some. So let's see ...

On 30 Aug 2010, at 09:59, Ivan Mikhailov wrote:
>> The strawman is here:
>>
>> http://www.w3.org/2001/sw/rdb2rdf/wiki/R2RML_in_a_custom_syntax
>
> Now let's extend the draft with options.
>
> Let us assign an (optional) name for every triple pattern of these
> templates, for diagnostic purposes.

I'm not convinced that this is really necessary. For diagnostic purposes, seeing a string representation of the triple pattern is good enough, such as this:

    ?emp biz:fullname ?name .

Is that much gained by assigning a name to the pattern so that diagnostic output can show this

    :pattern_emp_fullname

instead?

> Let's cut the SQL select in parts and combine them automatically,
> otherwise it is impossible to compose an adequate SQL join for a given
> basic graph pattern.

That's what D2RQ does -- basically the mapping author has to cut the SQL into bits and pieces, and state explicitly which of them are conditions, joins, expressions for property values and so on.

It took me a while, but since joining this WG I've come around to believing that this is not necessary.

If the SQL query is simple, then the RDB2RDF engine can parse the SQL query, and cut it into the required parts automatically. So you get your optimized joins.

If the SQL query is complex, then the RDB2RDF engine has two options.

First, it could treat the SQL query as a black box and use subselects for execution. This will be slow (or not, depending on the optimizer -- Souri and Juan have repeatedly asserted that this works just fine on Oracle and SQL Server). But at least it gives correct results, and the user can always try to write a simpler query.

Second, it could reject the SQL query as too complex, and just tell the user: "Sorry, I don't support aggregates in the SQL view definition." This would be an incomplete implementation of the standard, but today every RDB2RDF implementation has certain limitations in the expressivity of its mappings, so this wouldn't make matters worse.

Finally, if the engine can modify the DB, then it can just define a physical view in the DB.

So I don't think that the mapping language really needs to force the mapping author to decompose the SQL query into small parts. The engine can do it.

> Let us enrich IRITEMPLATEs by adding options absolutely needed for the
> optimizer (and by making them based on functions when needed). Let's
> specify the order of patterns and let some "exceptions" take priority on
> "common cases", to cut useless unions.

Ok, this one is really interesting, can you give some details here? Pointers to Virtuoso documentation are fine.

> Let's manipulate them (add/remove/reorder) and let's do that in parts,
> because many independent applications may share one RDF storage and they
> can be removed as well as installed.

Where do you see the obstacle in doing that with a text-based format? The draft also supports giving IDs to view maps. So an engine could have features to enable/disable individual view maps by ID.

> We can also start from other end --- given the current syntax of RDF
> Views, try to remove features to make it simple. Unfortunately, for
> every given feature there will be a sample data mapping and a query that
> will go slower at least order of magnitude without the excluded
> feature.

Two of the four features above have no performance impact but are only about management of complex mappings, and a third can be handled without performance impact, so I think this is an overstatement.

> P.S. I forgot to mention two more extensions, for free text

Yeah that's a hard one. Can you say something about the way free text is handled in RDF Views?
(No free text in D2RQ, so I have no experience here.)

> and for LITERALTEMPLATES :(

Well, literal transformations can be done in SQL code in the SELECT clause, so the expressivity is already there.

The reason for URI templates is really that they can be re-used many times throughout a mapping file. I think literal templates are not re-used as often as URI templates, and they don't benefit as much from the syntactic sugar that URI templates can offer (URIs can only contain certain characters, so we can add template stuff without too much pain from character escaping).

So I'm not convinced that it's worth having literal templates. But adding them would be easy enough without making the draft more complicated.

Best,
Richard
Received on Tuesday, 7 September 2010 06:35:18 UTC