Re: Proposal for “per-row blank node maps” in R2RML

David, I owe you a response to this, but I haven't had the time to write it up yet, and I will be going offline for a couple of days now. I will get to it once I am back.

Richard


On 14 May 2012, at 17:44, David McNeil wrote:

> On Thu, May 10, 2012 at 5:33 PM, ashok malhotra <ashok.malhotra@oracle.com> wrote:
> I have seen no mail about this so let me ask if we can all agree to
> Richard's proposal below
> 
> I don't think we should make this change to R2RML. 
> 
> The premise behind the effort to add this to R2RML is that the Direct Mapping cannot be implemented on R2RML without changing R2RML. I think that premise is incorrect. I think the Direct Mapping of tables without keys can be implemented on R2RML by using R2RML Views.
> 
> At the risk of over-simplifying I think this is the situation:
> 
> * for valid reasons the Direct Mapping wants to preserve the cardinality of duplicate rows in tables without keys
> * for the case of query translation (i.e. treating the RDF as virtual triples), there is no standard SQL way of solving this problem. But for most databases there are reasonable vendor-specific SQL queries to accomplish this.
> * so a Direct Mapping for a given table in a given vendor's database can produce an R2RML mapping that includes a vendor-specific R2RML view to accomplish the blank node generation that preserves the cardinality.
> * there may be some database somewhere for which this is impossible/impractical. That is ok.
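> 
> A minimal sketch of such a vendor-specific mapping, assuming Oracle (whose ROWID pseudocolumn and ROWIDTOCHAR function are not standard SQL) and a hypothetical keyless table PEOPLE with a single NAME column; the ex: names are illustrative only. Each physical row, duplicates included, gets its own blank node:
> 
>    @prefix rr: <http://www.w3.org/ns/r2rml#> .
>    @prefix ex: <http://example.com/> .
> 
>    # Hypothetical: PEOPLE has no primary key and may contain duplicate rows.
>    <#peopleMap>  a rr:TriplesMap;
>        rr:logicalTable [
>            # Vendor-specific R2RML view (Oracle assumed), so no rr:sqlVersion is claimed.
>            rr:sqlQuery """SELECT ROWIDTOCHAR(ROWID) AS ROW_ID, NAME FROM PEOPLE""";
>        ];
>        # The row identifier becomes the blank node identifier.
>        rr:subjectMap [
>            rr:template "{ROW_ID}";
>            rr:termType rr:BlankNode;
>            rr:class ex:PEOPLE;
>        ];
>        rr:predicateObjectMap [
>            rr:predicate ex:NAME;
>            rr:objectMap [ rr:column "NAME" ];
>        ];
>        .
> 
> The view is the only vendor-specific part; a mapping for another database would substitute that database's own row identifier there.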
> 
> So I propose that we leave the Direct Mapping and the R2RML spec as-is with respect to duplicate rows in tables without keys.
> 
> We at Revelytix discussed the current proposal and our specific comments/questions are inline below.
> 
> -David 
>  
> ====
> 
> • If the term type is rr:RowBlankNode, then you don't specify rr:column/rr:template/rr:constant, and you get a fresh blank node for each row. (That's the new part.)
> 
> I think the proposed spec changes need clarification defining within what context the same fresh blank node is valid. For example, what if the same TriplesMap is employed multiple times in a given SPARQL query?
> 
> Conforming R2RML processors MAY treat R2RML mappings that use per-row blank node maps over R2RML views as an error.
> 
> This makes me nervous because it seems to be strongly hinting at the underlying implementation details. However, without understanding the implementation details this restriction would not make sense to users. I think this indicates that the idea of a RowBlankNode is a leaky abstraction.
>  
> I would need to think about this more, but it seems that we can only support the RowBlankNode feature in cases where a single table is being queried? I am not sure I have my head around the way that other R2RML features can be combined that would create a context in which a table is being joined and thus a row id is not available.
> 
> It is possible to define multiple per-row blank node maps over a single logical table. In this case, each of the maps produces distinct blank nodes. In the following example, two unique blank nodes are generated for each logical table row, one as the subject and one as the object of the generated ex:p triples.
> 
>    <#map1>  a rr:TriplesMap;
>        rr:logicalTable <#someLogicalTable>;
>        rr:subjectMap [ rr:termType rr:RowBlankNode; ];
>        rr:predicateObjectMap [
>            rr:predicate ex:p;
>            rr:objectMap [ rr:termType rr:RowBlankNode; ];
>        ];
>        .
> 
> I struggle to think of a motivating use-case for generating two unique blank nodes for a given logical row. This seems to me like we are trying to add too much to R2RML.
>  
> But in the following example, each generated triple will have the same blank node as subject and object, because the same per-row blank node map is referenced as both the subject map and the object map.
> 
>    <#map1>  a rr:TriplesMap;
>        rr:logicalTable <#someLogicalTable>;
>        rr:subjectMap <#blankNodes>;
>        rr:predicateObjectMap [
>            rr:predicate ex:p;
>            rr:objectMap <#blankNodes>;
>        ];
>        .
>    <#blankNodes>  rr:termType rr:RowBlankNode.
> 
> This seems quite esoteric to me. I submit that the two mappings above would be considered equivalent by most users, but the proposal is to bury subtly different behavior in these. I think that is a bad idea.
> 
> Section 9.1 of the R2RML draft says:
> 
> "If the same blank node identifier occurs in multiple RDF triples that are in the same graph, then the triples will share the same blank node"
> 
> This was addressing how two separate blank node term maps could produce references to the same blank node. Does it make sense to talk about doing the same thing with this new RowBlankNode feature?
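> 
> For comparison, a sketch of what that sentence describes for ordinary, identifier-based blank node term maps. The tables EMP(EMPNO, ENAME) and BADGE(EMPNO, BADGE_NO) and the ex: names are hypothetical. The two subject maps are separate term maps, but they generate the same blank node identifier for the same EMPNO, so the resulting triples share one blank node:
> 
>    @prefix rr: <http://www.w3.org/ns/r2rml#> .
>    @prefix ex: <http://example.com/> .
> 
>    <#empMap>  a rr:TriplesMap;
>        rr:logicalTable [ rr:tableName "EMP" ];
>        rr:subjectMap [ rr:template "emp{EMPNO}"; rr:termType rr:BlankNode ];
>        rr:predicateObjectMap [
>            rr:predicate ex:name;
>            rr:objectMap [ rr:column "ENAME" ];
>        ];
>        .
>    <#badgeMap>  a rr:TriplesMap;
>        rr:logicalTable [ rr:tableName "BADGE" ];
>        # Same template as in <#empMap>: EMPNO 7369 yields the identifier "emp7369" in both maps.
>        rr:subjectMap [ rr:template "emp{EMPNO}"; rr:termType rr:BlankNode ];
>        rr:predicateObjectMap [
>            rr:predicate ex:badge;
>            rr:objectMap [ rr:column "BADGE_NO" ];
>        ];
>        .
> 
> A per-row blank node map, as proposed, specifies no column, template or constant, so there is no identifier of this kind for two separate maps to agree on.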

Received on Tuesday, 15 May 2012 20:49:18 UTC