Re: Proposal for ISSUE-65 from Juan Sequeda on 2011-08-28 (public-rdb2rdf-wg@w3.org from August 2011)

From: Juan Sequeda <juanfederico@gmail.com>
Date: Sun, 28 Aug 2011 16:45:20 -0400
To: Richard Cyganiak <richard@cyganiak.de>
Cc: "Eric Prud'hommeaux" <eric@w3.org>, public-rdb2rdf-wg@w3.org
Message-ID: <CAMVTWDyJh+YFOTWua1S4y_HoFNgN6Gnxzagm-mNvGb7jeNmD4w@mail.gmail.com>
All,

After discussion with Eric, we came up with the following proposal:

[[
PROPOSAL: For a foreign key of any arity, the reference property IRI is of
the form <Table#ref-attr1-attr2-...-attrn>. This will address ISSUE-65.
]]
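
As a rough illustration of the proposed form (a sketch only, not part of the proposal text): the function name and the DEPT_NAME/DEPT_CITY columns below are made up for the example, and percent-encoding of table and column names is deliberately left out.

```python
def reference_property_iri(table, fk_columns):
    """Build a relative reference property IRI of the form
    <Table#ref-attr1-attr2-...-attrn> for a foreign key of any arity.

    Hypothetical helper: names are used as-is, with no percent-encoding.
    """
    if not fk_columns:
        raise ValueError("a foreign key needs at least one column")
    # Same pattern for unary and n-ary keys: join the column names with '-'.
    return "<" + table + "#ref-" + "-".join(fk_columns) + ">"


# Unary foreign key (single column).
unary = reference_property_iri("PERSON", ["ADDRESS"])

# Binary foreign key (two columns; column names are hypothetical).
binary = reference_property_iri("PERSON", ["DEPT_NAME", "DEPT_CITY"])

print(unary)
print(binary)
```

The point of the sketch is that unary and n-ary foreign keys go through the same code path, so the reference property never collides with the plain column property.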

Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com


On Sun, Aug 28, 2011 at 10:21 AM, Richard Cyganiak <richard@cyganiak.de> wrote:

> Hi Eric,
>
> On 27 Aug 2011, at 23:52, Eric Prud'hommeaux wrote:
> >> The case for Uniformity is stronger than that: All columns, always, are
> mapped in the same predictable way; with the single exception of unary
> foreign keys.
> >
> > Given that all values are currently extractable, I think this needs to be
> cast as a use case that includes a user and a learning curve:
> >
> >  User Joe understands the pattern of querying database columns, but
> doesn't know the unary foreign key exception.
> […]
>
> I'm more concerned about the case where Joe builds his application, and
> then Bob the DBA, who knows nothing of Joe's work, makes an implicit foreign
> key constraint explicit, and Joe's application breaks.
>
> I'm more concerned about the case where Joe uses some off-the-shelf
> front-end on top of the DM (such as Pubby or Linked Data Pages or some
> generic Linked Data browser), and that front-end just shows the directly
> attached properties when displaying a resource, and therefore the LOINC code
> of a disease outcome is not shown.
>
> I'm more concerned about the case where Joe *thinks* that performance
> suffers because his SPARQL query requires an extra triple pattern to access
> the LOINC code, which in his mind translates to an extra join (whatever the
> implementation *actually* does under the hood).
>
> I'm more concerned about Joe adding a dummy second column to his FKs just
> to avoid these problems, once he figures out that they only apply to
> single-column FKs.
>
> > An approach which syntactically distinguishes all predicates, while
> leaving them in the same namespace, would address this:
> >
> >  SELECT ?name ?city WHERE {
> >      ?person <PERSON#LNAME> ?name .
> >      ?person <PERSON#LADDRESS> ?aid .
> >      ?address <ADDRESS#LCITYNAME> ?city .
> >      ?address <ADDRESS#LID> ?aid .
> >  }
> > (Note the 'L's preceding the column names.)
> > This is sort of ugly and unpleasant. Maybe we'll find something more
> attractive, but ultimately, we'll have to sacrifice some simplicity if we
> want to eliminate the unary foreign key exception.
>
> You mean like the one I proposed?
>
> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Aug/0140.html
>
> > The DM is definitely supposed to be usable. The queries and rules we've
> used as examples were just as intuitive as the analogous SQL queries.
> Further, the DM is quite reasonably described in a small bit of RDFS and
> OWL, which are likely to be the languages that inform query builders and
> user-facing interactive browsers.
>
> I don't know what you mean to say here. The DM has not been described in
> RDFS or OWL anywhere as far as I can tell.
>
> > Any approach which requires a table with N foreign keys to have N+1
> schema documents appears to be poor Semantic Web practice.
>
> 1. The DM turns a DB into an RDF graph. It doesn't speak about schema
> documents at all.
> 2. You've never presented a use case that requires schema documents.
> 3. The proposal would require *1 or 2* schema documents per table, not N+1.
> 4. Why is minimizing the number of schema documents a design goal?
> 5. If a DB has 50 tables, then you're ok with a design that increases the
> number of schema documents from 1 to 50 (a factor of 50), but you're not ok
> with a design that takes it from 50 to somewhere around 100 (a factor of 2)?
>
> > The proposals for the many-to-many tables broke every query on those
> tables if something as innocuous as a timestamp were added to the table.
>
> So, it's ok that queries break when an implicit FK constraint is made
> explicit, but it's not ok if queries break when a column is added? By what
> general principle do you decide what kinds of changes are allowed to break
> queries?
>
> Also, with the proposed many-to-many design it would have been *possible*
> to write conservative queries that don't break, because it would have just
> added *additional* convenience triples while retaining the standard DM
> representation of the many-to-many table. Not so with the FK design.
Received on Sunday, 28 August 2011 20:46:08 UTC