
Re: primary key that's already a uri?

From: Drew Perttula <drewp@bigasterisk.com>
Date: Sat, 20 Nov 2010 19:35:13 -0800
Message-ID: <4CE89371.1040106@bigasterisk.com>
To: public-rdb2rdf-comments@w3.org

On 11/19/2010 08:00 AM, Ted Thibodeau Jr wrote:
> Drew, all --
>
>
> On Nov 19, 2010, at 02:50 AM, Drew Perttula wrote:
>
>> http://www.w3.org/TR/2010/WD-rdb-direct-mapping-20101118/#row_iri
>>
>> "The IRI that identifies a row is created by  ..."
>>
>> Please consider the cases where people do have the freedom to use a complete URI as the primary key in their RDB. When I can build my RDB with RDF mapping in mind, that is the most natural thing to do. I would like to see an alternate row node scheme that simply takes the PK string and treats it as the row node.
>
> If nothing else, this option should not be part of the *Direct
> Mapping*.  The Direct Mapping *cannot* take into account *any* of
> the table content.

I have no objection to this kind of layered design, but it's not quite
accurate to say you're not using the table content. The PK values are in
the table. For all you know, they could be strings, and they could even
be strings starting with 'http://'. I realize that plain ints are the 
most popular PK type, but I suspect that RDF-inclined people use URIs 
for their PK more often than the general population :)

> Both RowIDs and Columns must be explicitly tied to their containing
> TableID, as there will always be multiple RowID=1, and there may
> as well always be multiple ColumnName='ID' ... and likewise, TableID
> must be explicitly tied to containing Schema/Owner, and Catalog, and
> ultimately DBMS-Instance, but this should be obvious to anyone
> spending more than a few minutes on the question, and exactly how
> the ties are made may be left as an implementation detail.

It sounds like you're focusing on the uniqueness requirement of the 
generated row nodes, but all that careful inclusion of 
row/table/schema/etc will be undone when I use the same base_IRI for two 
instances. This isn't broken; I'm just pointing out that you (the 
mapper) are always relying on me (the developer) for your final 
uniqueness by trusting me to make separate base_IRIs for different 
databases. If I say "the string in the PK column is already globally
unique", you can trust me on that too; in that case, continuing to
mandate that the TableID be included doesn't improve anything.
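
To make the request concrete, here is a minimal Python sketch of the two row node schemes side by side. It is an illustration only, not text from the working draft: the function name row_node, the flag pk_is_global_iri, and the simplified base/table/column=value layout are all assumptions made for this example.

```python
from urllib.parse import quote, urlsplit


def row_node(base_iri, table, pk_column, pk_value, pk_is_global_iri=False):
    """Return the IRI used as the row node (hypothetical helper).

    The default branch is a simplified stand-in for the generated scheme:
    base_IRI + table + "/" + column + "=" + value.
    """
    if pk_is_global_iri:
        # The developer has declared the PK string to be globally unique,
        # so it is used as the row node unchanged.
        parts = urlsplit(str(pk_value))
        if not (parts.scheme and parts.netloc):
            raise ValueError(f"PK value is not an absolute IRI: {pk_value!r}")
        return str(pk_value)
    # Otherwise uniqueness comes from base_IRI plus table, column and value.
    return (
        f"{base_iri}{quote(table, safe='')}/"
        f"{quote(pk_column, safe='')}={quote(str(pk_value), safe='')}"
    )


generated = row_node("http://example.com/db1/", "employee", "ID", 7)
passthrough = row_node(
    "http://example.com/db1/", "employee", "uri",
    "http://example.org/people/drew", pk_is_global_iri=True,
)
```

In both branches the mapper is trusting a statement from the developer: either that base_IRI is unique to this database, or that the PK strings are.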

Again, I don't mind if that kind of additional configuration is in 
another design layer on top of the basic mapper. It might be worth 
including the rules and limitations for base_IRI in the present doc, 
however. Stuff like:

"base_IRI is responsible for the uniqueness of the row nodes",

"poorly-chosen base_IRIs may jeopardize uniqueness, e.g. if database1's 
base_IRI is http://example.com/ and database2's base_IRI is 
http://example.com/employee, there could be a clash with the generated 
IRI for database1's 'employee' table"

etc.
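
The clash described above can be reproduced with a short Python sketch. It assumes the mapper builds IRIs by plain string concatenation onto base_IRI, which is a simplification of the draft's scheme; the helper names and the table names employee_address and _address are hypothetical and chosen only to trigger the collision.

```python
def table_iri(base_iri, table):
    # Assumed scheme: the table name is appended directly to base_IRI.
    return base_iri + table


def row_iri(base_iri, table, pk_column, pk_value):
    # Assumed scheme: table IRI + "/" + column + "=" + value.
    return f"{table_iri(base_iri, table)}/{pk_column}={pk_value}"


db1_base = "http://example.com/"
db2_base = "http://example.com/employee"

# database1's 'employee' table IRI is identical to database2's base_IRI.
table_clash = table_iri(db1_base, "employee") == db2_base

# Rows from two different databases can then receive the same IRI.
row_a = row_iri(db1_base, "employee_address", "ID", 1)
row_b = row_iri(db2_base, "_address", "ID", 1)
row_clash = row_a == row_b

# Bases that end in "/" and are not prefixes of each other avoid this.
safe_a = row_iri("http://example.com/db1/", "employee", "ID", 1)
safe_b = row_iri("http://example.com/db2/", "employee", "ID", 1)
```

Nothing in the generated path can repair the first pair of bases; only the choice of base_IRI can.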
Received on Sunday, 21 November 2010 03:35:47 GMT
