Re: Direct Mapping

Juan,

On 6 Sep 2010, at 22:57, Juan Sequeda wrote:
>> Let's see if I understand the implied mechanics. Option 1 directly
>> specifies the RDF graph implied by a database (for any tuple in the
>> database, you can say exactly what triples are in the direct
>> graph). Option 2 specifies a mapping language, with certain mapping
>> semantics, and with a default configuration. The default graph is the
>> product of applying the mapping semantics for a default configuration
>> to a database.
>>
> Option 2 uses R2RML.
>
> I see the two options this way
>
> Option 1:
>
> 1) We (the WG) present the direct mapping rules in order to generate a
> direct RDF graph from an RDB
> 2) Database vendors (Oracle, DB2, etc.) implement these mapping rules, OR
> RDB2RDF systems on top of an RDB can read the database dictionary and
> run these mapping rules
> 3) You click the button "Generate Direct RDF"
> 4) Out comes your RDF
> 5) Use RDF-to-RDF tools (SPARQL CONSTRUCTs, etc.) to map to other
> vocabularies
>
> Option 2:
>
> 1) We (the WG) present the direct mapping rules in order to generate a
> direct RDF graph from an RDB
> 2) Database vendors (Oracle, DB2, etc.) implement these mapping rules, OR
> RDB2RDF systems on top of an RDB can read the database dictionary and
> run these mapping rules
> 3) You click the button "Generate Direct RDF"
> 4) Out comes your RDF
> 5) Out comes the R2RML mapping file that generated the Direct RDF Graph
> 6) A user can modify the R2RML mapping file in order to change
> vocabularies, etc.

I think it would be helpful to keep the difference between
*implementation* and *specification* more clearly in mind.

No matter whether our specification describes the direct mapping as a
mapping to a “direct RDF graph” or to a “direct R2RML file”, vendors
are free to, and able to, implement either of the two options above.
In fact, the D2RQ system implements both of the approaches you
describe above -- users can instruct it to dump the direct graph, or
they can instruct it to create a D2RQ mapping file that can then be
edited in order to dump a customized graph.
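
To make the "direct graph" concrete: take a hypothetical table
PEOPLE(ID, NAME, AGE) containing the single row (7, "Bob", 34). A
direct mapping along the lines we have been discussing might produce
something roughly like the Turtle below. The IRI scheme and the
predicate names are purely illustrative assumptions, not something the
spec would mandate:

    # Hypothetical direct graph for PEOPLE(ID, NAME, AGE), row (7, "Bob", 34).
    # Base IRI and predicate names are made up for illustration.
    <http://db.example.com/PEOPLE/ID.7>
        <http://db.example.com/PEOPLE#ID>   7 ;
        <http://db.example.com/PEOPLE#NAME> "Bob" ;
        <http://db.example.com/PEOPLE#AGE>  34 .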

> So.. if we agree on this.. we are then practically talking about the
> same thing. The only difference is that in Option 2 we are outputting
> the direct mapping also in R2RML. Otherwise.. why would we need
> R2RML??????

Remember that some people in the WG want what we sometimes call the  
“RDF-based approach”. They want to specify RDB2RDF mappings using,  
say, RIF rules against the direct graph.

So there are at least TWO DISTINCT AUDIENCES for the direct mapping  
spec:

1. RDB2RDF vendors who implement R2RML engines and want to equip their  
systems with functionality similar to D2R's "generate-mapping" script,  
which generates a simple canonical R2RML file for a given database,  
with the intent of allowing further customization of the R2RML file by  
the user.
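
To illustrate what such a "canonical" R2RML file might contain, here
is a rough sketch for the hypothetical PEOPLE table from my earlier
example. R2RML is of course still being defined, so the rr: terms and
the template syntax below are assumptions, not settled syntax:

    @prefix rr: <http://www.w3.org/ns/r2rml#> .

    <#PeopleMap>
        rr:logicalTable [ rr:tableName "PEOPLE" ] ;
        rr:subjectMap  [ rr:template "http://db.example.com/PEOPLE/ID.{ID}" ] ;
        rr:predicateObjectMap [
            rr:predicate <http://db.example.com/PEOPLE#NAME> ;
            rr:objectMap [ rr:column "NAME" ]
        ] .

A user who wants a different vocabulary would then edit the
rr:predicate (and the subject template) rather than touching the
engine.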

2. RDB2RDF vendors who implement RIF-based engines (or engines based  
on any other RDF-to-RDF transformation language). Users of these  
engines will write RIF rules that transform the direct graph into a  
custom graph. Users and vendors of these systems don't need the R2RML  
language.
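
Whatever rule language these users pick, the transformation itself is
simple. Since RIF syntax would take more space, here is the same idea
expressed as a SPARQL CONSTRUCT (one of the RDF-to-RDF options already
mentioned in this thread), reusing the illustrative predicate from my
sketch above to map the direct graph onto FOAF:

    PREFIX foaf: <http://xmlns.com/foaf/0.1/>

    CONSTRUCT { ?person foaf:name ?name . }
    WHERE     { ?person <http://db.example.com/PEOPLE#NAME> ?name . }

The point is that such rules refer only to the shape of the direct
graph; nothing in them depends on R2RML.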

Eric's argument (which I find compelling) is this: People in the
second category are much better served by a specification that
describes the shape of the direct graph. They are poorly served by a
specification that describes the shape of a direct R2RML file, which
then, under the semantics of the R2RML language, implies the shape of
the direct graph.

(Possibly there is a third audience: vendors of engines that *only*
provide access to the direct graph without offering the possibility of
customization. They don't need R2RML either; they just need a spec
that describes the shape of the direct graph given a DB.)

I hope this clarifies things.

Best,
Richard



>
>
>>
>>> So you think that a direct mapping shouldn't output the R2RML file? I
>>> think it should because this file is the basis for people to work on
>>> and start customizing it.
>>
>> The RDF rules folks will have everything they need with option 1. They
>> can write/share rules in RIF, SPIN, n3, ... which transform the
>> default graph to popular ontologies. Simple implementations will
>> materialize these graphs, and arguably cooler implementations will
>> work directly on the relational data, but that's really implementation
>> detail; all they need is the default graph.
>>
>>
>>>> Hence I'm with Eric here.
>>>>
>>>>
>>>>> The automatic mapping file that is generated in D2R is equivalent
>>>>> to the Direct Mapping (right Richard?).
>>>>>
>>>>
>>>> Well I'd say the *graph* produced by an auto-generated D2R mapping
>>>> file is equivalent to the direct mapping.
>>>>
>>>
>>> and I'd call the auto-generated D2R mapping file the Direct Mapping
>>> file. So D2R does option 2 then.
>>>
>>>>
>>>> Best,
>>>> Richard
>>
>> --
>> -ericP
>>

Received on Tuesday, 7 September 2010 03:32:22 UTC