Re: Comments on Eric's Section 2 from Eric Prud'hommeaux on 2010-11-09 (public-rdb2rdf-wg@w3.org from November 2010)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Tue, 9 Nov 2010 09:02:46 -0500
To: Richard Cyganiak <richard@cyganiak.de>
Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Message-ID: <20101109140244.GA16107@w3.org>
* Richard Cyganiak <richard@cyganiak.de> [2010-11-09 14:35+0800]
...
* Richard Cyganiak <richard@cyganiak.de> [2010-11-09 14:46+0800]

enjoying your network?

> 
> On 9 Nov 2010, at 12:12, Eric Prud'hommeaux wrote:
> >I propose to get consensus on your FPWD proposal first, then address
> >the #_ issue in all of the examples. An alternative viewpoint came
> >from DanC in his review:
> >[[
> >2010-10-29T17:43:19Z <DanC> yay for #_
> >]]
> 
> Can you please state why you put the #_ there. You keep refusing to
> provide a rationale for the inclusion of this. If you're not
> prepared to provide rationale, then it should go. “Yay” does not
> count as rationale.

How about "cool" or "gnarly"?
[[
Issue (hash-vs-slash):

The direct graph may be offered as Linked Open Data, raising the issue
of distinguishing row identifiers from the information resources which
describe them. This edition of this document presumes hash
identifiers, allowing a GET on a row identifier to retrieve a small
resource (i.e. not all rows from the same table) and distinguish
between the retrieved resource People/ID=7 and the row
People/ID=7#_. The "slash" alternative would offer a direct graph with
identifiers like People/ID=7 but would demand the server respond to
GET /People/ID=7 with a 303 redirect to some other resource.
]]
plus a link to
  http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14 
which is ugly, but more informative than e.g.
  http://www.w3.org/2001/tag/group/track/issues/14
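
To make the hash case in that issue text concrete, here is a minimal
sketch in Python. The stem http://foo.example/DB/ is borrowed from the
predicate example elsewhere in the draft; it is only an assumption that
row identifiers would hang off the same stem.

```python
from urllib.parse import urldefrag

# Hypothetical row identifier in the hash style (stem assumed, see above).
row_iri = "http://foo.example/DB/People/ID=7#_"

# The fragment is not sent in an HTTP request, so a GET on the row
# identifier retrieves the fragment-less resource; the two stay distinct.
document_iri, fragment = urldefrag(row_iri)

print(document_iri)  # http://foo.example/DB/People/ID=7
print(fragment)      # _
```

With slash identifiers there is no fragment to strip, which is why the
server would have to answer with a 303 instead.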

> >>4. Inconsistency: Section 2.2 states that predicate IRIs have
> >>hashes, while all the examples have slashes.
> >
> >fixed (if we're speaking of the same place)
> 
> I see this in the current version:
> 
> [[
> IRIs in the predicate position are composed by concatenating:
>  • the stem
>  • '/'
>  • the url-encoded (per WSDL urlEncoded [WSDL]) table name
>  • '/'
>  • the url-encoded column names, separated by a '_'
> For example, a table called People with a foreign key with the
> column names fname, lname would produce the IRI
> <http://foo.example/DB/People#fname_lname>
> ]]
> 
> The rule would produce a different result from the example.

Ahh, that '#'.. fixed.
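
For reference, a minimal Python sketch of the concatenation rule exactly
as quoted above. It assumes the stem carries no trailing slash, and it
uses urllib's quote() as a stand-in for WSDL urlEncoded, which is not
guaranteed to behave identically; predicate_iri is a hypothetical helper
name.

```python
from urllib.parse import quote

def predicate_iri(stem, table, columns):
    """Concatenate stem, '/', table name, '/', and '_'-joined column names."""
    # quote() approximates WSDL urlEncoded here (an assumption).
    encoded_table = quote(table, safe="")
    encoded_columns = "_".join(quote(c, safe="") for c in columns)
    return f"{stem}/{encoded_table}/{encoded_columns}"

# Per the rule as quoted, this yields a '/' rather than a '#'
# before the column names.
print(predicate_iri("http://foo.example/DB", "People", ["fname", "lname"]))
```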

After today, I'd like to address relativizing the direct graph.

> >>8. In order to have an improved narrative in the section titles, I
> >>propose splitting 2.2 into one section “Identifiers for rows and
> >>columns” and one section “Row mapping rules”. (Not essential for
> >>FPWD)
> >
> >I believe the current version has more structure and bolding than what
> >you reviewed. Has this addressed your comment?
> 
> It's improved.
> 
> -1 to the subsubsections like 2.2.1. This is not a long document;
> you should be able to do this with just two levels of headings.

noted, but I'd like to address that after today.

> >>9. Section 2.5: “Hierarchies” can refer to many things in an SQL
> >>context, so it's a bit hard to figure out what the section refers
> >>to. The first sentence should perhaps talk about “hierarchies of
> >>tables that represent specializations of the same concept” or
> >>something similar.
> >
> >Is
> >[[
> >It is common to express specializations of some concept as multiple
> >tables sharing a common primary key.
> >]]
> >sufficient?
> 
> Yes, thanks.
> 
> >[[
> >As a counter example, a Wedding table may
> >have exactly two spouses but it's still not a many-to-many relation in
> >most places.
> >]]
> 
> I don't understand why this is a counter-example. The question
> whether a row in a marriage table is represented as one triple or as
> one resource with two triples is completely independent from the
> question whether the relationship is many-to-many or not. These are
> independent questions. In fact, the whole issue shouldn't be named
> “many-to-many” because it has nothing to do with questions of
> cardinality. It's about the question what to do with tables that
> represent relationships and not entities.

If you mean binary relationships then I think I see your point. Could
you supply some text? I don't think I'm capturing your intent here.

> You state there that detecting these tables is hard. That is not
> true. You have already provided a test in an earlier mail.

Our best litmus for relationship tables seems to be that they have
exactly two foreign keys which encompass all of the attributes. In
<http://www.w3.org/mid/20101103204332.GH18650@w3.org>, Mike offered
a counter example of a Marriage table which has two foreign keys,
but is not a one-way relationship between spouse1 and spouse2.
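
A rough sketch of that litmus in Python, assuming a table is described
by its column names and a list of foreign keys (each a tuple of column
names). The function name and the sample tables are hypothetical.

```python
def looks_like_relationship_table(columns, foreign_keys):
    """True if there are exactly two foreign keys and they cover every column."""
    if len(foreign_keys) != 2:
        return False
    covered = {col for fk in foreign_keys for col in fk}
    return covered == set(columns)

# Two foreign keys covering all columns: passes the structural test.
marriage = looks_like_relationship_table(
    ["spouse1", "spouse2"], [("spouse1",), ("spouse2",)]
)
# An extra non-key attribute: fails the test.
with_date = looks_like_relationship_table(
    ["spouse1", "spouse2", "date"], [("spouse1",), ("spouse2",)]
)
print(marriage, with_date)
```

The check is purely structural, so it cannot tell a directed
relationship from a symmetric one like the Marriage case.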

> >>11. See my comments to Juan and Marcelo asking for inclusion of
> >>table IRIs and of a triple that associates each row to its table.
> >>I'd really like to see a proposal for this in the FPWD, but at least
> >>an issue box would be essential. I note that the directGraph/alt
> >>version already has this.
> >
> >The foreign-key-is-candidate-key situation *appears* to imply that the
> >same node is defined across multiple tables; saying that it's an
> >Address or an Office won't give you the critical information which is
> >what predicates came from what table.
> 
> I don't have a requirement for knowing which columns come from what
> table.
> 
> >I propose instead:
> >
> ><Offices#building> rdb2rdf:inTable <#Offices> .
> >and maybe
> ><Offices> rdb2rdf:inDatabase <> .
> 
> These don't address what I asked for. I want to be able to query for
> all resources generated from a single table in an obvious way.
> 
> >That way you can separate which triples came from which tables.
> 
> I want to know this for resources, not for triples. Hierarchical
> tables are a rare corner case and I don't particularly care how that
> edge case is handled as long as the common case remains simple.

In <http://www.w3.org/2001/sw/rdb2rdf/directGraph/#hier-tabl>, what
table would I attach to <Addresses/ID=18#_>?

> >You
> >can get the type effect where you want it by asserting
> ><Offices#building> rdfs:domain <#Offices_row> .
> 
> What do you mean here? Do you suggest to make that triple part of
> the direct mapping and state that implementations MUST query the
> RDFS inference closure of the graph? That would address my
> requirement. Otherwise I don't know how I would “assert that
> triple”.

If you are actually querying the graph, then no, property2table
association would not require inference:

  SELECT ?office { ?p rdb2rdf:inTable <#Offices> . ?office ?p ?o }

If you're using, say, the Jena API,
  listSubjectsWithProperty(<Offices>)
then yes, you'd need to add the domain triple and traverse the closure.
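
To show that the query above is a plain two-pattern join with no
inference involved, here is a small Python sketch over an in-memory set
of triples. The triples are made up for illustration, the IRIs are
abbreviated as plain strings, and subjects_from_table is a hypothetical
helper, not part of any library.

```python
# Hypothetical triples, for illustration only.
triples = {
    ("Offices#building", "rdb2rdf:inTable", "#Offices"),
    ("People#fname", "rdb2rdf:inTable", "#People"),
    ("Offices/ID=1#_", "Offices#building", "32"),
    ("People/ID=7#_", "People#fname", "Bob"),
}

def subjects_from_table(triples, table):
    """Emulate: SELECT ?s { ?p rdb2rdf:inTable <table> . ?s ?p ?o }"""
    # First pattern: collect the predicates linked to the table.
    predicates = {s for s, p, o in triples if p == "rdb2rdf:inTable" and o == table}
    # Second pattern: collect subjects that use any of those predicates.
    return {s for s, p, o in triples if p in predicates}

print(subjects_from_table(triples, "#Offices"))  # {'Offices/ID=1#_'}
```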

I'm not sure what use case you have in mind here and whether it
demands that resources described by multiple tables be associated with
each table. I believe that the provenance of triples is a *very* important
use case, and that providing rdb2rdf:inTable links solves that and yours.


> Best,
> Richard

-- 
-ericP
Received on Tuesday, 9 November 2010 14:03:24 UTC