- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sun, 13 May 2012 13:20:25 -0400
- To: David McNeil <dmcneil@revelytix.com>
- Cc: public-rdb2rdf-wg@w3.org
* David McNeil <dmcneil@revelytix.com> [2012-05-13 09:15-0500]
> On Sat, May 12, 2012 at 11:33 PM, Eric Prud'hommeaux <eric@w3.org> wrote:
>
> > I wrote up the issue and use cases for R2RML's handling of non-unique
> > tables:
> > <http://www.w3.org/2001/sw/rdb2rdf/wiki/Non-unique_Tables>
> >
>
> Eric - Thanks for writing this up. It is helpful.
>
> I have a question for clarification. As a matter of fact, this statement
> does not seem accurate to me:
>
> "This makes it impossible for one to generate an R2RML mapping to the DM of
> a table with potentially repeated rows."
>
> An R2RML feature not mentioned in the writeup is the ability to define a
> logical view in an R2RML mapping. Such a view can contain a SQL statement
> (and R2RML has been defined with the idea of using database specific SQL in
> such cases). I believe this capability means that we cannot say it is
> "impossible" for an R2RML mapping to capture the DM of a table with
> repeated rows. We can debate about whether it is practical to write or if
> the runtime costs are too high, etc. but I don't think we can say it is
> impossible.
Good point, this could solve the problem.
Can you think of a way to create a view on:
┌┤IOUs├─┬───────┬────────┐
│ fname │ lname │ amount │
│ Bob │ Smith │ 30 │
│ Sue │ Jones │ 20 │
│ Bob │ Smith │ 30 │
which produces distinct rows like:
┌───┬┤IOUsʹ├┬───────┬────────┐
│ X │ fname │ lname │ amount │
│ 1 │ Bob │ Smith │ 30 │
│ 1 │ Sue │ Jones │ 20 │
│ 2 │ Bob │ Smith │ 30 │
(or any values in X which are different for the two Bob Smith 30
rows)? We could create a table with an auto_increment and populate it
with the old table (which would give us X={1,2,3}, but that implies a
change to the database. We could also join against an on-the-fly table
which contains all of the numbers up the the maximum cardinality:
SELECT nums.n,IOUs.fname,IOUs.lname,IOUs.amount
FROM (SELECT fname, lname, amount, count(*) C
FROM IOUs
GROUP BY fname, lname, amount) AS IOUs
INNER JOIN
(SELECT 1 n UNION
SELECT 2 n UNION
SELECT 3 n UNION
SELECT 4 n) AS nums
ON nums.n <= IOUs.C;
┌───┬───────┬───────┬────────┐
│ n │ fname │ lname │ amount │
│---│-------│-------│--------│
│ 1 │ Bob │ Smith │ 30 │
│ 1 │ Sue │ Jones │ 20 │
│ 2 │ Bob │ Smith │ 30 │
but that requires computing a query knowing the max cardinality.
Any idea where to go from here?
--
-ericP
Received on Sunday, 13 May 2012 17:20:58 UTC