Re: Addressing ISSUE-64 and ISSUE-65 from Juan Sequeda on 2011-08-10 (public-rdb2rdf-wg@w3.org from August 2011)

From: Juan Sequeda <juanfederico@gmail.com>
Date: Wed, 10 Aug 2011 13:24:58 -0500
To: Richard Cyganiak <richard@cyganiak.de>
Cc: public-rdb2rdf-wg@w3.org
Message-ID: <CAMVTWDybGGahUGH98apFyeCWuaznhJVa2E6ThUfPk5wqnLzc2w@mail.gmail.com>
Richard

So true! I totally forgot about this option.

Souri mentioned it quickly on the call because that is how they are doing it
in Oracle.

Souri, all: what are the cons of having the constraint name in the IRI?

Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com


On Wed, Aug 10, 2011 at 1:22 PM, Richard Cyganiak <richard@cyganiak.de>wrote:

> Both ISSUE-64 and ISSUE-65 could be solved by creating the property URI for
> the foreign key triple in a different way.
>
> The current property URI is:
> http://base_uri/tablename#column1,column2
>
> Souri's proposal is:
> http://base_uri/tablename#column1,column2,referenced_table,column3,column4
>
> Here's another proposal:
> http://base_uri/constraint_name
>
> All constraints in SQL are named. These names are implementation-dependent
> and often assigned automatically without the user's knowledge, so they may
> not make very nice properties. But on the other hand, this approach ensures
> uniqueness in a simple way, and it *is* actually the most direct and literal
> mapping from database to RDF, so it seems to be in the spirit of the direct
> mapping.
>
> Best,
> Richard
>
>
> On 10 Aug 2011, at 17:49, Juan Sequeda wrote:
>
> > Below is a conversation I started with Eric which involves ISSUE-64 and
> ISSUE-65.
> >
> > Basically there are 3 options
> >
> > 1) Ignore ISSUE-64 and ISSUE-65
> > 2) Address ISSUE-65 and ignore ISSUE-64
> > 3) Address both issues.
> >
> > Each of these options have advantages/disadvantages. Eric is added more
> comments on this.
> >
> > David, Souri,
> >
> > can you give me a real use-case where there is a need of multiple foreign
> keys from the same columns.
> >
> >
> > At this moment, I'm leaning towards Option 2. Eric is leaning towards
> Option 1.
> >
> > Looking forward to this discussion to see if we can resolve this quickly.
> >
> > With this, I guess my ACTION-152 is closed.
> >
> > Juan Sequeda
> > +1-575-SEQ-UEDA
> > www.juansequeda.com
> >
> >
> > ---------- Forwarded message ----------
> > From: Eric Prud'hommeaux <eric@w3.org>
> > Date: Wed, Aug 10, 2011 at 6:10 AM
> > Subject: Re: Our different options
> > To: Juan Sequeda <juanfederico@gmail.com>
> >
> >
> > whoops, sorry, fell asleep before checking mail again.
> >
> > * Juan Sequeda <juanfederico@gmail.com> [2011-08-09 18:30-0500]
> > > Eric,
> > >
> > > What do you think about this:
> > >
> > >
> > > Consider the following database
> > >
> > > Person(pid, name, addr)
> > > Address(aid, title)
> > >
> > > where addr of Person is a FK to aid of Address
> > >
> > > Person(1, John, 2)
> > > Address(2, Cambridge)
> >
> > I like to see stuff as tables (helps me visualize):
> > ┌┤Person├─────┬──────┐  ┌┤Address├──────────┐
> > │ id │ name   │ addr │  │ aid │ title       │
> > │  1 │ "John" │    2 │  │   2 │ "Cambridge" │
> > └────┴────────┴──────┘  └─────┴─────────────┘
> >
> > though I think we can use the example from the current spec which will
> > help later in the conversation because we can speak of the concepts
> > and specific spec changes to the spec in the same breath:
> >
> > People(7, Bob, 18)
> > Addresses(18, Cambridge)
> >
> > ┌┤People├─────┬──────┐  ┌┤Addresses├───────┐
> > │ ID │ fname  │ addr │  │ ID │ city        │
> > │  7 │ "Bob"  │   18 │  │ 18 │ "Cambridge" │
> > └────┴────────┴──────┘  └────┴─────────────┘
> >
> >
> > > Option 1:
> > >
> > > Do not address ISSUE-64 or ISSUE-65.
> > >
> > > Advantage:
> > >
> > > - Keeping the DM very simple
> > > - The IRI for all predicates will be very simple:
> <tableName#AttributeName>
> > > - IRIs are *nice*, except for foreign key IRIs which are:
> >
> >                     except for n-ary foreign key IRIs | n>1, which
> require ','s:
> >
> > > <tableName#AttributeName1,AttributeName2,...>
> > >
> > > Disadvantage:
> > >  - Not addressing ISSUE-64 and ISSUE-65
> > >
> > >
> > > The triples are the following:
> > >
> > > <People/ID=7> <People#fname> "BoB" .
> > > <People/ID=7> <People#addr> <Addresses/ID=18>
> > > <Addresses/ID=18> <Addresses#city> Cambridge
> > >
> > >
> > > Option 2:
> > >
> > > Address ISSUE-65 but not ISSUE-64
> > >
> > > Advantage
> > > - Avoid doing a join in order to get a the value of the foreign key
> > > attribute
> > > - All IRIs *nice*
> > > - If a foreign key is multi-column, then we would have a *nice*
> > > IRI <People#Department> instead of an *ugly* IRI
> <People#deptName,deptCity>
> > > (having all the columns in the foreign key in the IRI separated by
> commas)
> > >
> > > Disadvantage
> > > - Need to create two different IRIs for predicates: literal and
> reference
> >
> >  - Ambiguous if there's more than one foreign key to the same table, e.g.
> >
> > ┌┤People├─────┬──────────┬──────────┐  ┌┤Addresses├───────┐
> > │ ID │ fname  │ homeaddr │ workaddr │  │ ID │ city        │
> > │  7 │ "Bob"  │       18 │       18 │  │ 18 │ "Cambridge" │
> > └────┴────────┴──────────┴──────────┘  │ 23 │ "Arlington" │
> >                                       └────┴─────────────┘
> >  where (homeaddr) → (Addresses, (ID))
> >       (workaddr) → (Addresses, (ID))
> >
> > (can also be exemplified in one table, but it's arguably more awkward:
> >  ┌┤People├─────┬──────┬───────────────┐
> >  │ ID │ fname  │ boss │ officeManager │
> >  │  1 │ "Amy"  │    8 │            13 │
> >  │  7 │ "Bob"  │    8 │            13 │
> >  │  8 │ "Sue"  │    1 │            13 │
> >  │ 13 │ "Tom"  │    1 │            13 │
> >  └────┴────────┴──────┴───────────────┘
> >  where (boss) → (People, (ID))
> >       (officeManager) → (People, (ID))
> > )
> >
> > I believe that there are way more cases where a table has more than
> > one foreign key to the same table than that a table has and needs more
> > than one foreign key constrain on the same columns. In databases I've
> > touched in the last week, protein-protein interaction tables come to
> mind.
> >
> >
> > > predicate IRIs
> > > - Not as simple anymore, but still pretty simple
> > >
> > > The two predicate IRIs are:
> > >
> > > literal predicate IRI: <tableName#attributename>
> > > reference predicate IRI: <tableName#referenceTableName>
> > >
> > > The triples are the following:
> > >
> > > <People/ID=7> <People#fname> "BoB" .
> > > <People/ID=7> <People#addr> 2
> > > <People/ID=7> <People#Addresses> <Addresses/ID=18>
> > > <Addresses/ID=18> <Addresses#city> Cambridge
> > >
> > > Option 3:
> > >
> > > Address ISSUE-64 and ISSUE-65
> > >
> > > Advantage
> > > - Avoid doing a join in order to get a the value of the foreign key
> > > attribute
> > > - Address the following use case: same column sequence may be used for
> > > multiple foreign key constraints
> > >
> > > Disadvantage
> > > - Need to create two different IRIs for predicates: literal and
> reference
> > > predicate IRIs
> > > - reference predicate IRIs are complicated and ugly:
> > >
> > > <People,Department#deptName,name;deptCity,city>
> > > or maybe
> > > <People#Department;deptName,name;deptCity,city>
> > >
> > > -----
> > >
> > > The issues of having these ugly IRIs are in prefixes for sparql
> queries.
> > > With option 2, I could have a prefix
> > >
> > > PREFIX ex: <http://www.example.com/vocab/People#>
> > >
> > > SELECT *
> > > WHERE{
> > > ?s ex:Addresses ?o
> > > }
> > >
> > > With option 1 or 3, I would need to have the entire IRI in the query
> > >
> > > ?s <http://www.example.com/vocab/People#deptName,deptCity> ?o
> > >
> > > or
> > >
> > > ?s <
> > >
> http://www.example.com/vocab/People#Department;deptName,name;deptCity,city
> >
> > > ?o
> > >
> > >
> > > Eric... what do you think about this? I'm leaning towards option 2
> >
> > Very nice summary.
> >
> > I'm still leaning heavily towards 1. I think that the current
> > situation isn't bad when you have more than one foreign key on a
> > column list. Given an access control scenario:
> >
> > CREATE TABLE Principles (ID INT PRIMARY KEY, created STRING);
> > INSERT INTO Principles (ID, created) VALUES (2, "2011-09-10");
> > INSERT INTO Principles (ID, created) VALUES (3, "2011-09-10");
> > CREATE TABLE Users (ID INT PRIMARY KEY, name STRING, FOREIGN KEY (ID)
> REFERENCES Principles(ID));
> > INSERT INTO Users (ID, name) VALUES (2, "Bob");
> > CREATE TABLE IPAddrs (ID INT PRIMARY KEY, ip STRING, FOREIGN KEY (ID)
> REFERENCES Principles(ID));
> > INSERT INTO IPAddrs (ID, ip) VALUES (3, "81.23.2.200");
> > CREATE TABLE Roles (ID INT PRIMARY KEY, permissions STRING, FOREIGN KEY
> (ID) REFERENCES Users(ID), FOREIGN KEY (ID) REFERENCES Principles(ID));
> > INSERT INTO Roles (ID, permissions) VALUES (2, "rwx");
> >
> >            ┌┤Principles├─────┐
> >            │ ID │ created    │
> >            │  2 │ 2011-09-10 │
> >            │  3 │ 2011-09-10 │
> >            └────┴────────────┘
> >            /    \
> > ┌┤Users├─────┐  ┌┤IPAddrs├─────────┐
> > │ ID │ name  │  │ ID │ ip          │
> > │  2 │ "Bob" │  │  3 │ 18.23.2.200 │
> > └────┴───────┘  └────┴─────────────┘
> >
> > ┌┤Roles├─────────────┐
> > │ user │ permissions │
> > │    2 │       "rwx" │
> > └──────┴─────────────┘
> >
> > Roles could be argued to be a foreign key to both Users and Principles
> > (though presumably, Users.ID already has a foreign key constraint on
> > Principles.ID so (Roles.user) → (Principles (ID)) is redundant). At
> > present, the DM gives you multiple arcs for the foreign key name (ID):
> >
> >  <Roles/ID.2> a <Roles> ;
> >        <Roles#ID> <Users/ID.2> , <Principles/ID.2> ;
> >        <Roles#permissions> "rwx" .
> >
> > which is just about what you're telling the system with your two
> > foreign keys. BTW, you can go to
> >  <http://this-db-really.does-not-exist.org/>
> > and enter the above DDL and an identity CONSTRUCT:
> >
> > CONSTRUCT {
> >  ?s ?p ?o .
> > } WHERE {
> >  ?s ?p ?o .
> > }
> >
> > to see this in action.
> >
> > As to having to do a join to get the values, I don't think it's worth
> > the added user burden to optimize scalar access to foreign key values.
> >
> > I've rolled the changes into a doc called
> >  http://www.w3.org/2001/sw/rdb2rdf/directMapping/explicitFK
> > and reverted EGP modulo
> >
> > Feel free to forward this to Marcelo, rdb2rdf-wg, the IRS or the
> > selective service.
> >
> >
> > > Juan Sequeda
> > > +1-575-SEQ-UEDA
> > > www.juansequeda.com
> >
> > --
> > -ericP
> >
>
>
Received on Wednesday, 10 August 2011 18:25:47 UTC