Re: Proposal for ISSUE-65 from Richard Cyganiak on 2011-08-28 (public-rdb2rdf-wg@w3.org from August 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Sun, 28 Aug 2011 15:21:09 +0100
To: Eric Prud'hommeaux <eric@w3.org>
Cc: Juan Sequeda <juanfederico@gmail.com>, public-rdb2rdf-wg@w3.org
Message-Id: <BFE9100D-5D60-48F4-ACCC-66E659F285E6@cyganiak.de>

Hi Eric,

On 27 Aug 2011, at 23:52, Eric Prud'hommeaux wrote:
>> The case for Uniformity is stronger than that: All columns, always, are mapped in the same predictable way; with the single exception of unary foreign keys.
> 
> Given that all values are currently extractable, I think this needs to be cast as a use case that includes a user and a learning curve:
> 
>  User Joe understands the pattern of querying database columns, but doesn't know the unary foreign key exception.
[…]

I'm more concerned about the case where Joe builds his application, and then Bob the DBA, who knows nothing of Joe's work, makes an implicit foreign key constraint explicit, and Joe's application breaks.

I'm more concerned about the case where Joe uses some off-the-shelf front-end on top of the DM (such as Pubby or Linked Data Pages or some generic Linked Data browser), and that front-end just shows the directly attached properties when displaying a resource, and therefore the LOINC code of a disease outcome is not shown.

I'm more concerned about the case where Joe *thinks* that performance suffers because his SPARQL query requires an extra triple pattern to access the LOINC code, which in his mind translates to an extra join (whatever the implementation *actually* does under the hood).

I'm more concerned about Joe adding a dummy second column to his FKs just to avoid these problems, once he figures out that they only apply to single-column FKs.
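
A minimal Python sketch of the first scenario above, for illustration only. It is a toy model of the behaviour under discussion (a unary foreign key column yields a reference triple instead of a literal triple), not the Direct Mapping definition. The table and column names, the IRI shapes, the assumption that the referenced table's key is called ID, and the helpers map_row and joes_lookup are all hypothetical.

```python
# Toy model of the unary foreign key exception discussed in this thread.
# NOT the Direct Mapping spec: IRI shapes and names are assumptions.

def map_row(table, pk, row, unary_fks):
    """Return a set of (subject, predicate, object) triples for one row.

    unary_fks maps a column name to the table it references. A column
    listed there yields only a reference triple (an IRI object); every
    other column yields a literal triple.
    """
    subject = f"<{table}/{pk}={row[pk]}>"
    triples = set()
    for column, value in row.items():
        predicate = f"<{table}#{column}>"
        if column in unary_fks:
            # Assumption: the referenced table's key column is named ID.
            triples.add((subject, predicate, f"<{unary_fks[column]}/ID={value}>"))
        else:
            triples.add((subject, predicate, repr(value)))
    return triples


def joes_lookup(triples):
    """Joe's application: read PERSON.ADDRESS as a plain (non-IRI) value."""
    return [o for (_, p, o) in triples
            if p == "<PERSON#ADDRESS>" and not o.startswith("<")]


row = {"ID": 7, "NAME": "Joe", "ADDRESS": 18}

# Before: the foreign key is only implicit in the schema.
before = map_row("PERSON", "ID", row, unary_fks={})
# After: Bob declares ADDRESS as a foreign key to the ADDRESS table.
after = map_row("PERSON", "ID", row, unary_fks={"ADDRESS": "ADDRESS"})

print(joes_lookup(before))  # ['18']
print(joes_lookup(after))   # []
```

The data is unchanged between the two calls; only the declared constraint differs, and Joe's lookup goes from one result to none.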

> An approach which syntactically distinguishes all predicates, while leaving them in the same namespace, would address this:
> 
>  SELECT ?name ?city WHERE {
>      ?person <PERSON#LNAME> ?name .
>      ?person <PERSON#LADDRESS> ?aid .
>      ?address <ADDRESS#LCITYNAME> ?city .
>      ?address <ADDRESS#LID> ?aid .
>  }
> (Note the 'L's preceding the column names.)
> This is sort of ugly and unpleasant. Maybe we'll find something more attractive, but ultimately, we'll have to sacrifice some simplicity if we want to eliminate the unary foreign key exception.

You mean like the one I proposed?

http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Aug/0140.html

> The DM is definitely supposed to be usable. The queries and rules we've used as examples were just as intuitive as the analogous SQL queries. Further, the DM is quite reasonably described in a small bit of RDFS and OWL, which are likely to be the languages that inform query builders and user-facing interactive browsers.

I don't know what you mean to say here. The DM has not been described in RDFS or OWL anywhere as far as I can tell.

> Any approach which requires a table with N foreign keys to have N+1 schema documents appears to be poor Semantic Web practice.

1. The DM turns a DB into an RDF graph. It doesn't speak about schema documents at all.
2. You've never presented a use case that requires schema documents.
3. The proposal would require *1 or 2* schema documents per table, not N+1.
4. Why is minimizing the number of schema documents a design goal?
5. If a DB has 50 tables, then you're ok with a design that increases the number of schema documents from 1 to 50 (a factor of 50), but you're not ok with a design that takes it from 50 to somewhere around 100 (a factor of 2)?

> The proposals for the many-to-many tables broke every query on those tables if something as innocuous as a timestamp were added to the table.

So, it's ok that queries break when an implicit FK constraint is made explicit, but it's not ok if queries break when a column is added? By what general principle do you decide what kinds of changes are allowed to break queries?

Also, with the proposed many-to-many design it would have been *possible* to write conservative queries that don't break, because it would have just added *additional* convenience triples while retaining the standard DM representation of the many-to-many table. Not so with the FK design.
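
A rough Python sketch of what "conservative" means here, under stated assumptions. It models a link table whose rows always get the standard per-row triples, plus one extra shortcut triple only while the table consists of exactly two foreign key columns. The MEMBER table, its columns, the IRI shapes and the shortcut predicate are hypothetical, and the rule is a simplification of the many-to-many proposal, not a quote from it.

```python
# Hypothetical model: standard row triples plus an additional
# convenience triple for pure two-column link tables.

def map_link_row(table, row_id, row, fk_targets):
    """Map one row; fk_targets maps FK column name -> referenced table."""
    subject = f"<{table}/{row_id}>"
    triples = set()
    for col, val in row.items():
        if col in fk_targets:
            obj = f"<{fk_targets[col]}/{val}>"  # reference to the other row
        else:
            obj = str(val)                      # plain value
        triples.add((subject, f"<{table}#{col}>", obj))
    # Convenience triple only while the table is a pure link table.
    if len(row) == 2 and set(row) == set(fk_targets):
        (c1, t1), (c2, t2) = fk_targets.items()
        triples.add((f"<{t1}/{row[c1]}>", f"<{table}>", f"<{t2}/{row[c2]}>"))
    return triples


def conservative_projects(triples, person):
    """Query via the standard representation (goes through the row node)."""
    rows = {s for (s, p, o) in triples if p == "<MEMBER#PERSON>" and o == person}
    return sorted(o for (s, p, o) in triples
                  if s in rows and p == "<MEMBER#PROJECT>")


def shortcut_projects(triples, person):
    """Query via the convenience triple only."""
    return sorted(o for (s, p, o) in triples if s == person and p == "<MEMBER>")


fks = {"PERSON": "PERSON", "PROJECT": "PROJECT"}
plain = map_link_row("MEMBER", 1, {"PERSON": 7, "PROJECT": 3}, fks)
stamped = map_link_row(
    "MEMBER", 1, {"PERSON": 7, "PROJECT": 3, "ADDED": "2011-08-28"}, fks)

print(conservative_projects(plain, "<PERSON/7>"))    # ['<PROJECT/3>']
print(conservative_projects(stamped, "<PERSON/7>"))  # ['<PROJECT/3>']
print(shortcut_projects(plain, "<PERSON/7>"))        # ['<PROJECT/3>']
print(shortcut_projects(stamped, "<PERSON/7>"))      # []
```

In this model the query written against the standard representation returns the same answer after the timestamp column is added; only the query that relies on the shortcut stops matching.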
Received on Sunday, 28 August 2011 14:21:39 UTC