Re: R2RML doubts: Inverse Expressions and No Join Conditions from Benjamin Cogrel on 2019-01-08 (semantic-web@w3.org from January 2019)

From: Benjamin Cogrel <benjamin.cogrel@bcgl.fr>
Date: Tue, 8 Jan 2019 21:31:18 +0100
To: Aidan Hogan <aidhog@gmail.com>, SW-forum <semantic-web@w3.org>
Message-ID: <b27007d4-5c9e-9682-8077-fe6f6f77e558@bcgl.fr>

Hi,

Regarding the inverse expressions, they are particularly interesting for 
the virtual RDF graph setting, where SPARQL queries are translated into 
SQL queries.

Consider the following example:

R2RML:
<TriplesMap1>
     a rr:TriplesMap;
     rr:logicalTable [ rr:sqlQuery "SELECT \"email\", LOWER(\"surname\") AS \"id\" FROM \"table1\""; ] ;
     rr:subjectMap [
        rr:template "http://example.org/person/{\"id\"}" ;
        rr:inverseExpression "{\"surname\"} = UPPER({\"id\"})";
     ];
     rr:predicateObjectMap [
       rr:predicate        myvoc:email ;
       rr:objectMap        [ rr:column "\"email\"" ]
     ] .

SPARQL:
SELECT ?e {
  <http://example.org/person/james> myvoc:email ?e .
}

Because the LOWER function is not injective, without the inverse 
expression, the SQL query would have to be:
SELECT email AS e
FROM table1 t
WHERE LOWER(t.surname) = 'james'

While with the inverse expression, one can optimize the SQL query as 
follows:
SELECT email AS e
FROM table1 t
WHERE t.surname = 'JAMES'

If the "surname" column is indexed, then the second query should be 
significantly faster.
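
To make the index argument concrete, here is a minimal sketch using SQLite through Python's sqlite3 module. The choice of SQLite, the index name idx_surname and the sample rows are all assumptions for illustration (no particular DBMS is implied above), and it relies on the surnames being stored in upper case, which is what the inverse expression asserts. It runs both queries and asks the planner how it would evaluate each.

```python
import sqlite3

# In-memory database with made-up rows; surnames are stored in upper case,
# which is the assumption encoded by the inverse expression.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (email TEXT, surname TEXT)")
conn.execute("CREATE INDEX idx_surname ON table1 (surname)")  # hypothetical index name
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("james@example.org", "JAMES"), ("smith@example.org", "SMITH")],
)


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]


without_inverse = "SELECT email AS e FROM table1 t WHERE LOWER(t.surname) = 'james'"
with_inverse = "SELECT email AS e FROM table1 t WHERE t.surname = 'JAMES'"

# Both queries should return the same rows under the upper-case assumption.
rows_without = conn.execute(without_inverse).fetchall()
rows_with = conn.execute(with_inverse).fetchall()

# Only the second query compares the bare column, so only it can use the index.
plan_without = plan(without_inverse)
plan_with = plan(with_inverse)

print(rows_without, plan_without)
print(rows_with, plan_with)
```

The exact plan text differs between SQLite versions, and other systems may behave differently (for example, if an index on LOWER("surname") exists), so this only illustrates the general point.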

These expressions are said to be "inverse" because they help to 
transform RDF constants into DB constants
(while the standard direction of R2RML is from DB constants to RDF ones).
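
As a rough sketch of that "inverse" direction, the following Python shows how a translator might go from the IRI constant in the SPARQL query to the constant used in the SQL filter. The function names match_template and to_db_constant are hypothetical, the template handling is deliberately simplified (no percent-decoding, no escaped braces), and Python's str.upper() merely stands in for the SQL UPPER of the inverse expression.

```python
import re
from typing import Dict, Optional

# The subject map template from the example, after Turtle unescaping.
TEMPLATE = 'http://example.org/person/{"id"}'


def match_template(template: str, iri: str) -> Optional[Dict[str, str]]:
    """Return the column values encoded in the IRI, or None if it does not match."""
    pattern = ""
    pos = 0
    for m in re.finditer(r"\{([^}]*)\}", template):
        pattern += re.escape(template[pos:m.start()])
        name = m.group(1).strip('"')          # drop the SQL identifier quotes
        pattern += "(?P<%s>[^/]+)" % name     # one capture group per column
        pos = m.end()
    pattern += re.escape(template[pos:])
    found = re.fullmatch(pattern, iri)
    return found.groupdict() if found else None


def to_db_constant(iri: str) -> Optional[str]:
    """Apply the inverse expression surname = UPPER(id) to an IRI constant."""
    values = match_template(TEMPLATE, iri)
    if values is None:
        return None
    return values["id"].upper()  # stand-in for SQL UPPER


surname = to_db_constant("http://example.org/person/james")
print(surname)
```

The resulting value is what ends up in the WHERE clause on "surname" in the optimized query; an IRI that does not match the template yields no constant at all, so the triples map can be skipped for that query.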

This feature looks interesting; however, I am not aware of any virtual 
RDF graph system that supports it.
But that's something we have in mind for Ontop.

Best,
Benjamin


On 07/01/2019 17:20, Aidan Hogan wrote:
> Hi all,
>
> I'm writing a section about R2RML for a book and I have two doubts:
>
>
> * Inverse expressions: I think I get the idea technically speaking, 
> but I don't understand in what use-case or scenario it would be useful 
> to do such a reverse search (or, more generally, what was the 
> motivation to define these expressions in the standard).
>
>     https://www.w3.org/TR/r2rml/#inverse
>
> (Searching around on the Web, a lot of material just seems to 
> paraphrase the standard, but unfortunately the standard itself is not 
> clear to me.)
>
>
> * For referencing object maps without a join condition, the 
> definitions suggest that this acts like a Cartesian product (and 
> intuitively this would make a lot of sense to me) but the standard 
> provides an example where the parent and child maps refer to the same 
> table that seems to contradict that idea.
>
>     https://www.w3.org/TR/r2rml/#ex-ref-object-map2
>
> The text above states:
>
>     "No join condition is needed as both triples maps use the same 
> logical table (the base table DEPT)."
>
> While it's not clear what it means that "no join condition is needed", 
> what I think it's trying to tell me is that when the parent and child 
> tables are the same, by default the same row will be used in the 
> parent and child map to generate the subject/object. But the 
> definitions do not seem to support that and unfortunately the concrete 
> example provided is ambiguous in the standard because it only has one 
> row (so the Cartesian product interpretation and the row-by-row 
> interpretation give the same result, as seen). I'm wondering what was 
> the intention here?
>
>
> (Perhaps there is a better place to ask these questions, but it seems 
> the list public-rdb2rdf-wg@w3.org has been deactivated. Also just to 
> add that in general I quite like the R2RML language design and find it 
> quite intuitive; just these two points have me really puzzled.)
>
> Cheers!
> Aidan
>

Received on Tuesday, 8 January 2019 20:31:49 UTC