Re: many to many tables in the direct mapping

On 11/3/2010 4:57 PM, Juan Sequeda wrote:
> mmmm... true.
>
> But there is a default way to do it.  I'm considering the following
> use-case:
>
> 1) I know my schema and I know that I have many-to-many tables.
> 2) I want to expose my relational data as RDF with the default mapping.
> 3) I'll run the default mapping with the many-to-many option enabled
> 4) voilà! I pushed a button and thanks to the default mapping, my whole
> rdb is in rdf
>
> If I did not include the many-to-many option in the default mapping, then I
> would have to run the default mapping, get an R2RML file, customize it
> and then run the system again.
>
> I don't see any harm in presenting the default many-to-many mapping and
> letting everybody know that it is an optional thing.

Parametrizing the default mapping adds complexity to what is otherwise a
clear, well-defined starting point for further R2RML customization.
Which R2RML capabilities would then become algorithmic parameters of the
default mapping? I don't think that complexity is worth
pursuing. Tools can (and hopefully will) provide this sort of global
option, making it easy to auto-interpret tables with two foreign
keys as many-to-many mappings (and to express the result in R2RML).
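
As a rough illustration of the kind of tool-side option I mean, here is a
minimal Python sketch. The Table class and both function names are
hypothetical, invented for this example; they are not part of R2RML or of
any existing tool. It assumes single-column foreign keys and treats a table
as a many-to-many candidate only when exactly two foreign keys cover every
column and the user has opted that table in.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical, minimal schema description (not an R2RML or SQL API)."""
    name: str
    columns: list[str]
    # Maps each foreign-key column to the name of the table it references.
    foreign_keys: dict[str, str] = field(default_factory=dict)


def looks_many_to_many(table: Table) -> bool:
    """Heuristic: exactly two foreign keys that together cover every column."""
    return (
        len(table.foreign_keys) == 2
        and set(table.columns) == set(table.foreign_keys)
    )


def many_to_many_candidates(schema: list[Table], opted_in: set[str]) -> list[str]:
    """Return the tables that match the heuristic AND that the user opted in."""
    return [t.name for t in schema if t.name in opted_in and looks_many_to_many(t)]


person_address = Table(
    "PersonAddress",
    ["person", "address"],
    {"person": "Person", "address": "Address"},
)
# Same structural shape, but (in most places) not a many-to-many relation.
marriage = Table(
    "Marriage",
    ["spouse1", "spouse2"],
    {"spouse1": "Person", "spouse2": "Person"},
)

selected = many_to_many_candidates([person_address, marriage], {"PersonAddress"})
```

Both tables satisfy the structural test, which is why the sketch requires an
explicit opt-in per table: only the user knows that Marriage should not be
treated the same way as PersonAddress.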

Lee

>
> Juan Sequeda
> +1-575-SEQ-UEDA
> www.juansequeda.com <http://www.juansequeda.com>
>
>
> On Wed, Nov 3, 2010 at 3:51 PM, Eric Prud'hommeaux <eric@w3.org
> <mailto:eric@w3.org>> wrote:
>
>     * Juan Sequeda <juanfederico@gmail.com
>     <mailto:juanfederico@gmail.com>> [2010-11-03 15:47-0500]
>      > On Wed, Nov 3, 2010 at 3:43 PM, Eric Prud'hommeaux <eric@w3.org
>     <mailto:eric@w3.org>> wrote:
>      >
>      > > Cc:+= Michael Stonebraker <stonebraker@csail.mit.edu
>     <mailto:stonebraker@csail.mit.edu>>
>      > >
>      > > * Richard Cyganiak <richard@cyganiak.de
>     <mailto:richard@cyganiak.de>> [2010-11-02 12:39+0000]
>      > > > Eric,
>      > > >
>      > > > On 2 Nov 2010, at 08:56, Eric Prud'hommeaux wrote:
>      > > > > The monotonic addition of columns to the database results in
>      > > > > non-monotonic changes to the direct graph, breaking
>     existing queries
>      > > > > and mapping rules.
>      > > >
>      > > > I don't find this reason compelling.
>      > > >
>      > > > Changes to the source database break queries and mapping
>     rules. That
>      > > > is sort of obvious. It will not come as a surprise to users
>     either
>      > > > -- changing your database messes up your SQL queries and
>     views too.
>      > > >
>      > > > Removing or renaming a column will break things no matter
>     what. Same
>      > > > for adding or modifying primary or foreign keys. If adding a
>     column
>      > > > to a many-to-many table breaks queries or mapping rules too,
>     then so
>      > > > what? I don't see what's so special about that operation.
>      > > >
>      > > > As a matter of principle, I think this WG should not
>     inconvenience a
>      > > > large number of users (everyone who has many-to-many joins in
>     their
>      > > > schema) in order to maintain some notion of theoretical purity
>      > > > (monotonicity).
>      > >
>      > > It's not some theoretical goal that motivates me; "monotonic" just
>      > > happens to aptly describe the set of changes I can make to a
>      > > relational structure and not have to revisit every piece of
>     code which
>      > > queries that structure.
>      > >
>      > > I chatted with Mike Stonebraker about this and he had similar
>      > > reservations about having a different direct mapping for tables
>     whose
>      > > attributes happen to be covered by exactly two primary keys.
>     While his
>      > > concerns were more about complexity, he did offer the counter
>     example
>      > > for the many-to-many detection scheme: A marriage table may
>     well have
>      > > exactly two spouses in it, but it's not (in most places) a
>     many-to-many
>      > > situation. I think that the many-to-many-ness must be opt-in,
>     and not
>      > > in the direct/default mapping.
>      > >
>      > >
>      > Exactly!! the many-to-many is optional!
>      >
>      > Step 1: Create Literal Triples
>      > Step 2: Create Reference Triples
>      >
>      > Optional (the user will select if they want this option or not
>     because the
>      > user knows the schema)
>      > Step 3: Create Triples from Many-to-Many relations
>      >
>      > Even though it is optional, it should still be in the Default Mapping
>      > document.
>      >
>      > What do you think?
>
>     I thought that was the point of r2rml; the place where the user got
>     control of the interface graph.
>
>      > >
>      > > > > It will be harder for folks to write papers and innovate
>     soundly
>      > > > > with a more complex model.
>      > > >
>      > > > The interests of users should outweigh the interests of folks who
>      > > > write papers too.
>      > > >
>      > > > Best,
>      > > > Richard
>      > > >
>      > > >
>      > > >
>      > > >
>      > > > >
>      > > > >¹ http://www.w3.org/2001/sw/rdb2rdf/directGraph/alt#id0xa4a2a060
>      > > > >² http://www.w3.org/2001/sw/rdb2rdf/directGraph/#rules
>      > > > >
>      > > > >For example, consider a PersonAddress table which connects a
>     Person to
>      > > > >an Address:
>      > > > >
>      > > > >┌┤Person├────┐  ┌┤Address├───────┐  ┌┤PersonAddress├───┐
>      > > > >│ ID │ fname │  │ ID │ city      │  │ person │ address │
>      > > > >│  7 │ Bob   │  │ 18 │ Cambridge │  │      7 │      18 │
>      > > > >│  8 │ Sue   │  │ 19 │ Austin    │  │      7 │      19 │
>      > > > >└────┴───────┘  └────┴───────────┘  │      8 │      19 │
>      > > > >                                    └────────┴─────────┘
>      > > > >We can generate a direct graph for PersonAddress
>      > > > >@base <http://db.example/ContactDB/> .
>      > > > >
>      > > > ><PersonAddress/person.7_address.18#_>
>      > > > > <PersonAddress#person> <Person/ID.7#_> ;
>      > > > > <PersonAddress#address> <Address/ID.18#_> .
>      > > > ><PersonAddress/person.7_address.19#_>
>      > > > > <PersonAddress#person> <Person/ID.7#_> ;
>      > > > > <PersonAddress#address> <Address/ID.19#_> .
>      > > > ><PersonAddress/person.8_address.19#_>
>      > > > > <PersonAddress#person> <Person/ID.8#_> ;
>      > > > > <PersonAddress#address> <Address/ID.19#_> .
>      > > > >
>      > > > >OR, as I believe you propose, we can generate repeated
>     properties:
>      > > > >
>      > > > ><Person/ID.7#_>
>      > > > > <PersonAddress/person_address> <Address/ID.18#_> ;
>      > > > > <PersonAddress/person_address> <Address/ID.19#_> .
>      > > > ><Person/ID.8#_>
>      > > > > <PersonAddress/person_address> <Address/ID.19#_> .
>      > > > >
>      > > > >This one is attractively more terse, but, the addition of a
>     column to
>      > > > >the database:
>      > > > >┌┤PersonAddress├───┬─────────┐
>      > > > >│ person │ address │ primary │
>      > > > >│      7 │      18 │ true    │
>      > > > >│      7 │      19 │ false   │
>      > > > >│      8 │      19 │ true    │
>      > > > >└────────┴─────────┴─────────┘
>      > > > >
>      > > > >*retracts* those repeated properties and generates instead a
>     direct
>      > > > >graph for PersonAddress with the three additional primary
>     predicates:
>      > > > >
>      > > > ><PersonAddress/person.7_address.18#_>
>      > > > > <PersonAddress#person> <Person/ID.7#_> ;
>      > > > > <PersonAddress#address> <Address/ID.18#_> ;
>      > > > > <PersonAddress#primary> "true"^^xsd:boolean .
>      > > > ># + 7_19 and 8_19
>      > > > >
>      > > > >These retractions break queries and change the interface
>     graph even
>      > > > >though the addition of the column does not change the
>     interpretation
>      > > > >of any of the other columns in the database.
>      > > > >--
>      > > > >-ericP
>      > > > >
>      > > >
>      > >
>      > > --
>      > > -ericP
>      > >
>
>     --
>     -ericP
>
>
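
The retraction described in the quoted PersonAddress example can be checked
mechanically. The following is a minimal Python sketch, not the working
group's algorithm: the function names are hypothetical, foreign keys are
assumed to be single-column references to an ID column, and triples are
modelled as plain string tuples. It builds both graphs before and after the
"primary" column is added and tests whether the old triples survive.

```python
def literal(value):
    """Render a non-key cell as a Turtle-style literal (booleans typed)."""
    if isinstance(value, bool):
        return f'"{str(value).lower()}"^^xsd:boolean'
    return f'"{value}"'


def direct_triples(table, rows, columns, refs):
    """Plain direct graph: one node per row, one triple per column."""
    triples = set()
    for row in rows:
        key = "_".join(f"{c}.{row[c]}" for c in refs)  # e.g. person.7_address.18
        subject = f"<{table}/{key}#_>"
        for c in columns:
            if c in refs:
                obj = f"<{refs[c]}/ID.{row[c]}#_>"
            else:
                obj = literal(row[c])
            triples.add((subject, f"<{table}#{c}>", obj))
    return triples


def shortcut_triples(table, rows, columns, refs):
    """Repeated-property form, used only when two foreign keys cover all columns."""
    if len(refs) != 2 or set(columns) != set(refs):
        return direct_triples(table, rows, columns, refs)
    left, right = refs  # the two foreign-key column names, in declaration order
    return {
        (
            f"<{refs[left]}/ID.{row[left]}#_>",
            f"<{table}/{left}_{right}>",
            f"<{refs[right]}/ID.{row[right]}#_>",
        )
        for row in rows
    }


REFS = {"person": "Person", "address": "Address"}
before = [
    {"person": 7, "address": 18},
    {"person": 7, "address": 19},
    {"person": 8, "address": 19},
]
# The same rows after the "primary" column is added.
after = [dict(r, primary=p) for r, p in zip(before, [True, False, True])]

old_direct = direct_triples("PersonAddress", before, ["person", "address"], REFS)
new_direct = direct_triples("PersonAddress", after, ["person", "address", "primary"], REFS)
old_short = shortcut_triples("PersonAddress", before, ["person", "address"], REFS)
new_short = shortcut_triples("PersonAddress", after, ["person", "address", "primary"], REFS)

# Monotonic means every triple from the old graph is still in the new graph.
direct_is_monotonic = old_direct <= new_direct
shortcut_is_monotonic = old_short <= new_short
```

Under these assumptions the plain direct graph only gains triples when the
column is added, while the repeated-property graph loses all of its original
triples, which is the breakage the quoted message describes.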

Received on Thursday, 4 November 2010 15:36:04 UTC