Re: Defining a SQL fragment? from ashok malhotra on 2010-07-22 (public-rdb2rdf-wg@w3.org from July 2010)

From: ashok malhotra <ashok.malhotra@oracle.com>
Date: Thu, 22 Jul 2010 06:06:28 -0700
To: Harry Halpin <hhalpin@w3.org>
CC: Richard Cyganiak <richard@cyganiak.de>, Marcelo Arenas <marcelo.arenas1@gmail.com>, public-rdb2rdf-wg@w3.org
Message-ID: <4C484254.4000606@oracle.com>
You have to be able to customize the mapping.  Just mapping to a default 
graph is not enough.
The customization has to include, besides renaming, functions on data 
values, aggregates, etc.
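As a rough illustration of the difference (a sketch only: the namespaces,
sample rows and function names below are made up for this example and are
not part of any proposal), a default mapping derives every predicate from
the table and column names, while a customized mapping can rename a
predicate, apply a function to a value, and emit an aggregate:

```python
from collections import defaultdict

# Hypothetical namespaces and sample rows, invented for illustration.
MYDB = "http://example.com/mydb/"
FOAF = "http://xmlns.com/foaf/0.1/"

ROWS = [
    {"ID": 1, "NAME": "alice", "DEPT": "sales"},
    {"ID": 2, "NAME": "bob", "DEPT": "sales"},
]


def default_mapping(table, rows, key="ID"):
    """Emit one triple per column; predicates come from table and column names."""
    triples = []
    for row in rows:
        subject = f"{MYDB}{table}/{row[key]}"
        for column, value in row.items():
            triples.append((subject, f"{MYDB}{table}.{column}", value))
    return triples


def customized_mapping(table, rows, key="ID"):
    """Rename a predicate, apply a function to values, and add an aggregate."""
    triples = []
    head_count = defaultdict(int)
    for row in rows:
        subject = f"{MYDB}{table}/{row[key]}"
        # Renaming plus a function on the data value:
        # USER.NAME becomes foaf:name, with the value capitalized.
        triples.append((subject, f"{FOAF}name", row["NAME"].capitalize()))
        head_count[row["DEPT"]] += 1
    # Aggregate: one triple per department giving its number of users.
    for dept, count in sorted(head_count.items()):
        triples.append((f"{MYDB}DEPT/{dept}", f"{MYDB}userCount", count))
    return triples
```

Calling default_mapping("USER", ROWS) yields predicates such as
mydb:USER.NAME, whereas customized_mapping("USER", ROWS) yields foaf:name
plus a per-department count that no single row contains.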
All the best, Ashok


Harry Halpin wrote:
>> Harry,
>>
>> Thanks for the clarification.
>>
>> On 21 Jul 2010, at 17:42, Harry Halpin wrote:
>>     
>>> I think for ETL purposes the language
>>> could have 4 parts. Each except 3) is optional.
>>>
>>> 1) Full vendor specific SQL to create a view
>>>
>>> 2) A portable subset of SQL to create a view
>>>
>>> 3) Mapping of that view to a default graph
>>>
>>> 4) Possibly running RDF-to-RDF transforms here (RIF).
>>>       
>> Where in these 4 do I say that USER.NAME should be mapped to foaf:name
>> rather than mydb:USER.NAME?
>>     
>
> It seems there are some differences here in the group, but I think 3)
> would be the right place, i.e. the mapping from SQL to the graph, which
> seems often to come after creating some kind of view in SQL  - as done
> with full SQL power (1) or Datalog in (2). EricP seems to want to use RIF
> to modify that (4).
>
> That's why I'm tempted to say, let's work on just 3) and assume there'll be
> a place for 1) and then optionally leave 2) and 4) behind for now.
>
> What do you think?
>
>   
>>> I think Marcelo and Juan were wondering if steps 2-4 had a common core
>>> that could be thought of semantically as Datalog.
>>>
>>> But if people choose 1) then they just have to know that R2ML will not
>>> guarantee portability.
>>>       
>> Ok, but the differences between SQL dialects are mostly about syntax
>> and hardly about semantics; so I'm still unsure how Datalog helps with
>> SQL portability.
>>     
>
> That is an issue though - I mean, if the standard just has a bunch of
> vendor-specific SQL between curly brackets, then we may not be portable.
>
> I'll let Marcelo and Juan argue for Datalog, but the idea was that there
> might be a simple subset of SQL we can guarantee to be portable. Ashok has
> brought up another well-known vendor defined set of SQL.
>
>
>
> However, that does not mean we should restrict people to use that subset.
> For some people, portability may not be a concern. I'm OK with using
> anything to transform relational data to the graph as long as implementers
> actually will implement it and users will use it (this does bring up
> concerns about any non SQL-based approach), as long as we can guarantee
> the portability of at least a subset of it and, if something may not be
> portable, allow it to be clearly marked as such.
>
>   
>>> What this does not bring up is what eric and soeren were really
>>> wanting to
>>> do earlier as well, which was SPARQL->SQL mappings.
>>>       
>> Are you saying that we need separate languages for ETL access and for
>> SPARQL access to the mapped database? I don't think so; it's the same
>> language. R2ML should specify how to derive an RDF graph from a
>> relational DB. How to access that RDF graph (linked data, SPARQL, ETL,
>> brainwave transmission) is up to implementations.
>>     
>
> I would hope we do not need a separate language for that, but there needs
> to be a clear statement about that in the spec.
>
>   
>> Best,
>> Richard
>>
>>
>>
>>
>>     
>>> However, before descending into the black hole of semantics and
>>> options, I'm happy to agree to get a rough draft out on 1) and 3) if
>>> people can't agree on 2) and 4).
>>>
>>>       
>>>> I think there is a clear desire to allow full SQL in a compliant
>>>> implementation of the SQL-based approach. This is at least what I
>>>> gather from Souri's and Orri's comments. I can not remember anyone
>>>> making an argument that only a restricted SQL fragment should be
>>>> allowed in the SQL-based approach.
>>>>
>>>> Can you please explain, or point me to the discussion that motivates
>>>> the need for restrictions in the allowable SQL in the SQL-based
>>>> approach?
>>>>
>>>> Best,
>>>> Richard
>>>>
>>>>
>>>>         
>>     
>
>
>
Received on Thursday, 22 July 2010 13:08:31 UTC