- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Tue, 2 Nov 2010 04:56:22 -0400
- To: Marcelo Arenas <marcelo.arenas1@gmail.com>, Juan Sequeda <juanfederico@gmail.com>
- Cc: public-rdb2rdf-wg@w3.org
Before you spend cycles on "2.3.3 The third step of the translation
process: Representing many-to-many tables"¹, I propose that the direct
graph not have a special case for them for the following reasons:
  - The definition for the many-to-many table is more complex than other
    constructs in the direct mapping. I believe that definition would
    be: a table which has exactly two foreign keys which are composed of
    distinct sets of columns, and which has no columns which are not in
    one or both of those foreign keys.
  - The same information is already captured in a different graph shape
    in the rules to generically map relations².
  - The monotonic addition of columns to the database results in
    non-monotonic changes to the direct graph, breaking existing queries
    and mapping rules.
  - It will be harder for folks to write papers and innovate soundly
    with a more complex model.
¹ http://www.w3.org/2001/sw/rdb2rdf/directGraph/alt#id0xa4a2a060
² http://www.w3.org/2001/sw/rdb2rdf/directGraph/#rules
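To make the first reason concrete, here is a minimal Python sketch of the test a mapper would need in order to recognise a many-to-many table under the definition I guessed at above. The function name and argument shapes are hypothetical, and I am reading "distinct sets of columns" as "not the same set", which is an assumption rather than anything taken from the draft.

```python
def is_many_to_many(columns, foreign_keys):
    """Return True if a table looks like a pure many-to-many link table.

    columns      -- iterable of all column names in the table
    foreign_keys -- list of tuples, each the column names of one foreign key
    (Both argument shapes are illustrative assumptions.)
    """
    # Exactly two foreign keys.
    if len(foreign_keys) != 2:
        return False
    fk_a, fk_b = (frozenset(fk) for fk in foreign_keys)
    # The two keys must be composed of distinct sets of columns.
    if fk_a == fk_b:
        return False
    # No column may fall outside both foreign keys.
    return set(columns) <= (fk_a | fk_b)


link = is_many_to_many(["person", "address"], [("person",), ("address",)])
widened = is_many_to_many(["person", "address", "primary"],
                          [("person",), ("address",)])
print(link, widened)
```

Every other construct in the direct mapping can be handled a row or a column at a time; this check has to look at the whole table schema before any triple is emitted.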
For example, consider a PersonAddress table which connects a Person to
an Address:
┌┤Person├────┐ ┌┤Address├───────┐ ┌┤PersonAddress├───┐
│ ID │ fname │ │ ID │ city      │ │ person │ address │
│  7 │ Bob   │ │ 18 │ Cambridge │ │      7 │      18 │
│  8 │ Sue   │ │ 19 │ Austin    │ │      7 │      19 │
└────┴───────┘ └────┴───────────┘ │      8 │      19 │
                                  └────────┴─────────┘
We can generate a direct graph for PersonAddress:
@base <http://db.example/ContactDB/> .
<PersonAddress/person.7_address.18#_>
    <PersonAddress#person> <Person/ID.7#_> ;
    <PersonAddress#address> <Address/ID.18#_> .
<PersonAddress/person.7_address.19#_>
    <PersonAddress#person> <Person/ID.7#_> ;
    <PersonAddress#address> <Address/ID.19#_> .
<PersonAddress/person.8_address.19#_>
    <PersonAddress#person> <Person/ID.8#_> ;
    <PersonAddress#address> <Address/ID.19#_> .
Or, as I believe you propose, we can generate repeated properties:
<Person/ID.7#_>
    <PersonAddress/person_address> <Address/ID.18#_> ;
    <PersonAddress/person_address> <Address/ID.19#_> .
<Person/ID.8#_>
    <PersonAddress/person_address> <Address/ID.19#_> .
This one is attractively terse, but the addition of a column to
the database:
┌┤PersonAddress├───┬─────────┐
│ person │ address │ primary │
│      7 │      18 │ true    │
│      7 │      19 │ false   │
│      8 │      19 │ true    │
└────────┴─────────┴─────────┘
*retracts* those repeated properties and generates instead a direct
graph for PersonAddress with the three additional primary predicates:
<PersonAddress/person.7_address.18#_>
    <PersonAddress#person> <Person/ID.7#_> ;
    <PersonAddress#address> <Address/ID.18#_> ;
    <PersonAddress#primary> "true"^^xsd:boolean .
# + 7_19 and 8_19
These retractions break queries and change the interface graph even
though the addition of the column does not change the interpretation
of any of the other columns in the database.
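The monotonicity argument can be checked mechanically. The following Python sketch builds both graph shapes for the PersonAddress example, before and after the primary column is added, and tests whether the old triples survive. The function names and the row/reference dictionaries are hypothetical, the IRI patterns are copied from the Turtle above, and literals are emitted without a datatype for brevity.

```python
TABLE = "PersonAddress"
# Foreign-key column -> referenced table (illustrative structure).
REFS = {"person": "Person", "address": "Address"}


def generic_triples(rows, refs):
    """Generic shape: one node per row, one triple per column."""
    triples = set()
    for row in rows:
        # Row node is named from the key columns, e.g. person.7_address.18
        key = "_".join(f"{col}.{row[col]}" for col in refs)
        subject = f"<{TABLE}/{key}#_>"
        for col, value in row.items():
            if col in refs:
                obj = f"<{refs[col]}/ID.{value}#_>"
            else:
                obj = f'"{value}"'  # plain literal; datatype omitted
            triples.add((subject, f"<{TABLE}#{col}>", obj))
    return triples


def special_case_triples(rows, refs):
    """Special case: pure link tables become repeated properties."""
    (src_col, src_table), (dst_col, dst_table) = refs.items()
    if all(set(row) == set(refs) for row in rows):
        return {(f"<{src_table}/ID.{row[src_col]}#_>",
                 f"<{TABLE}/{src_col}_{dst_col}>",
                 f"<{dst_table}/ID.{row[dst_col]}#_>") for row in rows}
    # Any extra column disqualifies the table, so fall back to the generic shape.
    return generic_triples(rows, refs)


before = [{"person": 7, "address": 18},
          {"person": 7, "address": 19},
          {"person": 8, "address": 19}]
after = [dict(row, primary=flag)
         for row, flag in zip(before, ["true", "false", "true"])]

# Monotonic means every triple in the old graph is still in the new graph.
generic_monotonic = generic_triples(before, REFS) <= generic_triples(after, REFS)
special_monotonic = (special_case_triples(before, REFS)
                     <= special_case_triples(after, REFS))
print(generic_monotonic, special_monotonic)  # True False
```

With the generic rules the six original triples are a subset of the nine new ones; with the special case none of the three repeated-property triples survive the schema change.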
-- 
-ericP
Received on Tuesday, 2 November 2010 08:56:59 UTC