Proposal for ISSUE-65 from Juan Sequeda on 2011-08-23 (public-rdb2rdf-wg@w3.org from August 2011)

From: Juan Sequeda <juanfederico@gmail.com>
Date: Tue, 23 Aug 2011 17:53:27 -0500
To: public-rdb2rdf-wg@w3.org
Message-ID: <CAMVTWDyqOJAwPFKCOeQrsL8_G=QXrdCn03XYGzUCx45036X=rQ@mail.gmail.com>

Consider the following database, where People(addr) is a foreign key
(FK) to City(ID):

--|People|----------------------
|  ID  |  name   |  addr  |
--------------------------------
|  7   |  "Bob"  |  18    |
--------------------------------

--|City|------------------------
|  ID  |  city         |
--------------------------------
|  18  |  "Cambridge"  |
--------------------------------

Proposal 1: Keep the DM (Direct Mapping) as-is, which means we would
not address ISSUE-65.

We would only be generating one triple:

<People/ID=7> <People#addr> <City/ID=18> .

Proposal 2: For a foreign key column, generate both a literal triple
and a reference triple. The property IRI of each triple will carry a
special string which identifies it as a literal or a reference
property IRI.

We would be generating two triples:

<People/ID=7> <People#Laddr> 18 .
<People/ID=7> <People#Raddr> <City/ID=18> .

Note that the triples will be of the form:

<subject IRI> <table#Lcolumn> value .
<subject IRI> <table#Rcolumn> <object IRI> .

where L stands for Literal and R stands for Reference.
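
As a rough illustration of the pattern above, here is a minimal Python
sketch that builds both triples for one foreign key value. The function
name fk_triples and its parameters are hypothetical, made up for this
example and not part of the DM. It assumes a single-column primary key
and foreign key, and it uses the relative IRIs from this mail as-is,
with no escaping.

```python
def fk_triples(table, pk_col, pk_val, fk_col, fk_val, ref_table, ref_pk_col):
    """Build the literal (L) and reference (R) triples for one FK value.

    Hypothetical helper: assumes single-column keys and does no IRI escaping.
    """
    subject = f"<{table}/{pk_col}={pk_val}>"
    # Literal triple: the property IRI gets the "L" marker, the object is the raw value.
    literal = f"{subject} <{table}#L{fk_col}> {fk_val} ."
    # Reference triple: the property IRI gets the "R" marker, the object is the row IRI
    # of the referenced table.
    reference = f"{subject} <{table}#R{fk_col}> <{ref_table}/{ref_pk_col}={fk_val}> ."
    return [literal, reference]


# The People/City example from this mail.
triples = fk_triples("People", "ID", 7, "addr", 18, "City", "ID")
for t in triples:
    print(t)
```

For the example row, this prints the same two triples listed under
Proposal 2.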

This is similar to Richard's suggestion <ref/Table#column>. The
drawback of that proposal was that we would need two different
namespaces. With Proposal 2, we don't! This should make everybody
happy.

HOWEVER, honestly, this can in a way be seen as a hack. We would be
sticking the semantics inside the IRI, which is really weird.
Nevertheless, it works.

I would still like to hear more use-cases and motivations for why we
should generate a literal triple for foreign key columns. From Souri's
initial email, I have:

- Uniformity: For multi-column foreign keys we are already creating
literal triples, so why not keep it uniform and do the same for
single-column foreign keys?
- Performance: Without the literal triple, retrieving the value of the
foreign key column requires an unnecessary join with the parent table.

Personally, I think uniformity is nice to have. Performance-wise, even
if we didn't generate the literal triple, we could still tell that a
property IRI is for a foreign key because its object is an IRI, so it
would be trivial to retrieve the value of the foreign key column
through the database, though maybe not from the RDF. Well, you could
actually parse it out of the IRI itself. So this motivation doesn't
convince me that much. Hence, I would like to hear more use-cases.
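
To illustrate the "parse it out of the IRI" point, here is a minimal
Python sketch. It assumes the object IRI has exactly the relative form
<Table/column=value> used in this mail and that the key is a single
column. key_from_iri is a hypothetical helper name, not something
defined by the DM. Note that the value comes back as a string, so the
original datatype is not recovered.

```python
def key_from_iri(iri):
    """Split a row IRI of the form <Table/column=value> into its parts.

    Hypothetical helper: assumes a single-column key and no escaping.
    """
    body = iri.strip("<>")                  # drop the angle brackets
    table, _, key = body.partition("/")     # "City", "ID=18"
    column, _, value = key.partition("=")   # "ID", "18"
    return table, column, value


# Object of the reference triple from the example.
parsed = key_from_iri("<City/ID=18>")
print(parsed)
```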

In conclusion,

Proposal 2 addresses ISSUE-65 while still keeping one simple prefix
per table; however, we get the *weird* feature of sticking the type of
property inside the IRI.

Proposal 1 does not address ISSUE-65.

Looking forward to your comments.


Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com

Received on Tuesday, 23 August 2011 22:54:22 UTC