- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 16 Aug 2011 14:34:50 +0200
- To: Juan Sequeda <juanfederico@gmail.com>
- Cc: public-rdb2rdf-wg@w3.org
* Juan Sequeda <juanfederico@gmail.com> [2011-08-10 11:49-0500]
> Below is a conversation I started with Eric which involves ISSUE-64 and
> ISSUE-65.
>
> Basically there are 3 options
>
> 1) Ignore ISSUE-64 and ISSUE-65
> 2) Address ISSUE-65 and ignore ISSUE-64
> 3) Address both issues.
>
> Each of these options have advantages/disadvantages. Eric is added more
> comments on this.
>
> David, Souri,
>
> can you give me a real use-case where there is a need of multiple foreign
> keys from the same columns.
>
>
> At this moment, I'm leaning towards Option 2. Eric is leaning towards Option
> 1.
I believe Juan, Marcelo and myself now all endorse 1:
PROPOSE to close ISSUE-64 noting that the current DM definition generates triples for all foreign keys even if they are on the same columns.
PROPOSE to close ISSUE-65 noting that attempting to unify the treatment of literal triples over unary foreign keys marginally complicates the definition <http://localhost/2001/sw/rdb2rdf/directMapping/explicitFK#definition> and either breaks the clustering of table predicates in a single namespace or introduces ','s into localnames, which are difficult to represent in SPARQL and Turtle.
> Looking forward to this discussion to see if we can resolve this quickly.
>
> With this, I guess my ACTION-152 is closed.
>
> Juan Sequeda
> +1-575-SEQ-UEDA
> www.juansequeda.com
>
>
> ---------- Forwarded message ----------
> From: Eric Prud'hommeaux <eric@w3.org>
> Date: Wed, Aug 10, 2011 at 6:10 AM
> Subject: Re: Our different options
> To: Juan Sequeda <juanfederico@gmail.com>
>
>
> whoops, sorry, fell asleep before checking mail again.
>
> * Juan Sequeda <juanfederico@gmail.com> [2011-08-09 18:30-0500]
> > Eric,
> >
> > What do you think about this:
> >
> >
> > Consider the following database
> >
> > Person(pid, name, addr)
> > Address(aid, title)
> >
> > where addr of Person is a FK to aid of Address
> >
> > Person(1, John, 2)
> > Address(2, Cambridge)
>
> I like to see stuff as tables (helps me visualize):
> ┌┤Person├─────┬──────┐ ┌┤Address├──────────┐
> │ id │ name │ addr │ │ aid │ title │
> │ 1 │ "John" │ 2 │ │ 2 │ "Cambridge" │
> └────┴────────┴──────┘ └─────┴─────────────┘
>
> though I think we can use the example from the current spec which will
> help later in the conversation because we can speak of the concepts
> and specific spec changes to the spec in the same breath:
>
> People(7, Bob, 18)
> Addresses(18, Cambridge)
>
> ┌┤People├─────┬──────┐ ┌┤Addresses├───────┐
> │ ID │ fname │ addr │ │ ID │ city │
> │ 7 │ "Bob" │ 18 │ │ 18 │ "Cambridge" │
> └────┴────────┴──────┘ └────┴─────────────┘
>
>
> > Option 1:
> >
> > Do not address ISSUE-64 or ISSUE-65.
> >
> > Advantage:
> >
> > - Keeping the DM very simple
> > - The IRI for all predicates will be very simple:
> <tableName#AttributeName>
> > - IRIs are *nice*, except for foreign key IRIs which are:
>
> except for n-ary foreign key IRIs | n>1, which require
> ','s:
>
> > <tableName#AttributeName1,AttributeName2,...>
> >
> > Disadvantage:
> > - Not addressing ISSUE-64 and ISSUE-65
> >
> >
> > The triples are the following:
> >
> > <People/ID=7> <People#fname> "BoB" .
> > <People/ID=7> <People#addr> <Addresses/ID=18>
> > <Addresses/ID=18> <Addresses#city> Cambridge
> >
> >
> > Option 2:
> >
> > Address ISSUE-65 but not ISSUE-64
> >
> > Advantage
> > - Avoid doing a join in order to get a the value of the foreign key
> > attribute
> > - All IRIs *nice*
> > - If a foreign key is multi-column, then we would have a *nice*
> > IRI <People#Department> instead of an *ugly* IRI
> <People#deptName,deptCity>
> > (having all the columns in the foreign key in the IRI separated by commas)
> >
> > Disadvantage
> > - Need to create two different IRIs for predicates: literal and reference
>
> - Ambiguous if there's more than one foreign key to the same table, e.g.
>
> ┌┤People├─────┬──────────┬──────────┐ ┌┤Addresses├───────┐
> │ ID │ fname │ homeaddr │ workaddr │ │ ID │ city │
> │ 7 │ "Bob" │ 18 │ 18 │ │ 18 │ "Cambridge" │
> └────┴────────┴──────────┴──────────┘ │ 23 │ "Arlington" │
> └────┴─────────────┘
> where (homeaddr) → (Addresses, (ID))
> (workaddr) → (Addresses, (ID))
>
> (can also be exemplified in one table, but it's arguably more awkward:
> ┌┤People├─────┬──────┬───────────────┐
> │ ID │ fname │ boss │ officeManager │
> │ 1 │ "Amy" │ 8 │ 13 │
> │ 7 │ "Bob" │ 8 │ 13 │
> │ 8 │ "Sue" │ 1 │ 13 │
> │ 13 │ "Tom" │ 1 │ 13 │
> └────┴────────┴──────┴───────────────┘
> where (boss) → (People, (ID))
> (officeManager) → (People, (ID))
> )
>
> I believe that there are way more cases where a table has more than
> one foreign key to the same table than that a table has and needs more
> than one foreign key constrain on the same columns. In databases I've
> touched in the last week, protein-protein interaction tables come to mind.
>
>
> > predicate IRIs
> > - Not as simple anymore, but still pretty simple
> >
> > The two predicate IRIs are:
> >
> > literal predicate IRI: <tableName#attributename>
> > reference predicate IRI: <tableName#referenceTableName>
> >
> > The triples are the following:
> >
> > <People/ID=7> <People#fname> "BoB" .
> > <People/ID=7> <People#addr> 2
> > <People/ID=7> <People#Addresses> <Addresses/ID=18>
> > <Addresses/ID=18> <Addresses#city> Cambridge
> >
> > Option 3:
> >
> > Address ISSUE-64 and ISSUE-65
> >
> > Advantage
> > - Avoid doing a join in order to get a the value of the foreign key
> > attribute
> > - Address the following use case: same column sequence may be used for
> > multiple foreign key constraints
> >
> > Disadvantage
> > - Need to create two different IRIs for predicates: literal and reference
> > predicate IRIs
> > - reference predicate IRIs are complicated and ugly:
> >
> > <People,Department#deptName,name;deptCity,city>
> > or maybe
> > <People#Department;deptName,name;deptCity,city>
> >
> > -----
> >
> > The issues of having these ugly IRIs are in prefixes for sparql queries.
> > With option 2, I could have a prefix
> >
> > PREFIX ex: <http://www.example.com/vocab/People#>
> >
> > SELECT *
> > WHERE{
> > ?s ex:Addresses ?o
> > }
> >
> > With option 1 or 3, I would need to have the entire IRI in the query
> >
> > ?s <http://www.example.com/vocab/People#deptName,deptCity> ?o
> >
> > or
> >
> > ?s <
> > http://www.example.com/vocab/People#Department;deptName,name;deptCity,city
> >
> > ?o
> >
> >
> > Eric... what do you think about this? I'm leaning towards option 2
>
> Very nice summary.
>
> I'm still leaning heavily towards 1. I think that the current
> situation isn't bad when you have more than one foreign key on a
> column list. Given an access control scenario:
>
> CREATE TABLE Principles (ID INT PRIMARY KEY, created STRING);
> INSERT INTO Principles (ID, created) VALUES (2, "2011-09-10");
> INSERT INTO Principles (ID, created) VALUES (3, "2011-09-10");
> CREATE TABLE Users (ID INT PRIMARY KEY, name STRING, FOREIGN KEY (ID)
> REFERENCES Principles(ID));
> INSERT INTO Users (ID, name) VALUES (2, "Bob");
> CREATE TABLE IPAddrs (ID INT PRIMARY KEY, ip STRING, FOREIGN KEY (ID)
> REFERENCES Principles(ID));
> INSERT INTO IPAddrs (ID, ip) VALUES (3, "81.23.2.200");
> CREATE TABLE Roles (ID INT PRIMARY KEY, permissions STRING, FOREIGN KEY (ID)
> REFERENCES Users(ID), FOREIGN KEY (ID) REFERENCES Principles(ID));
> INSERT INTO Roles (ID, permissions) VALUES (2, "rwx");
>
> ┌┤Principles├─────┐
> │ ID │ created │
> │ 2 │ 2011-09-10 │
> │ 3 │ 2011-09-10 │
> └────┴────────────┘
> / \
> ┌┤Users├─────┐ ┌┤IPAddrs├─────────┐
> │ ID │ name │ │ ID │ ip │
> │ 2 │ "Bob" │ │ 3 │ 18.23.2.200 │
> └────┴───────┘ └────┴─────────────┘
>
> ┌┤Roles├─────────────┐
> │ user │ permissions │
> │ 2 │ "rwx" │
> └──────┴─────────────┘
>
> Roles could be argued to be a foreign key to both Users and Principles
> (though presumably, Users.ID already has a foreign key constraint on
> Principles.ID so (Roles.user) → (Principles (ID)) is redundant). At
> present, the DM gives you multiple arcs for the foreign key name (ID):
>
> <Roles/ID.2> a <Roles> ;
> <Roles#ID> <Users/ID.2> , <Principles/ID.2> ;
> <Roles#permissions> "rwx" .
>
> which is just about what you're telling the system with your two
> foreign keys. BTW, you can go to
> <http://this-db-really.does-not-exist.org/>
> and enter the above DDL and an identity CONSTRUCT:
>
> CONSTRUCT {
> ?s ?p ?o .
> } WHERE {
> ?s ?p ?o .
> }
>
> to see this in action.
>
> As to having to do a join to get the values, I don't think it's worth
> the added user burden to optimize scalar access to foreign key values.
>
> I've rolled the changes into a doc called
> http://www.w3.org/2001/sw/rdb2rdf/directMapping/explicitFK
> and reverted EGP modulo
>
> Feel free to forward this to Marcelo, rdb2rdf-wg, the IRS or the
> selective service.
>
>
> > Juan Sequeda
> > +1-575-SEQ-UEDA
> > www.juansequeda.com
>
> --
> -ericP
--
-ericP
Received on Tuesday, 16 August 2011 12:35:13 UTC