Re: ISSUE-9 Another question about Generate Blank Nodes from Sören Auer on 2011-02-01 (public-rdb2rdf-wg@w3.org from February 2011)

From: Sören Auer <auer@informatik.uni-leipzig.de>
Date: Wed, 02 Feb 2011 00:33:37 +0100
To: Alexandre Bertails <bertails@w3.org>
CC: Juan Sequeda <juanfederico@gmail.com>, "Eric Prud'hommeaux" <eric@w3.org>, RDB2RDF Working Group WG <public-rdb2rdf-wg@w3.org>
Message-ID: <4D489851.9090506@informatik.uni-leipzig.de>

On 01.02.2011 23:54, Alexandre Bertails wrote:
>> * if there is a candidate key use the candidate key,
>
> What if there are several? What if there is a NULL value? Does it
> really make sense to map a row to some arbitrary candidate key?

From my point of view it does, since you can then talk about (i.e. link
from the Data Web to) that row.
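
To illustrate what I mean by using a candidate key, here is a minimal sketch. The base IRI, the table and column names, and the `row_iri` helper are all made up for this example. Rejecting NULL key values is only one possible answer to the NULL question, not something we have agreed on.

```python
from urllib.parse import quote


def row_iri(base, table, key_columns, row):
    """Build an IRI for a row from the values of one chosen candidate key.

    `row` is a dict mapping column names to values. All names here are
    hypothetical; this is not the mapping defined by the working group.
    """
    # Assumption: a key containing NULL cannot identify the row, so the
    # caller has to fall back to another strategy.
    if any(row[c] is None for c in key_columns):
        raise ValueError("candidate key contains NULL")
    # Percent-encode column names and values so the result is a valid IRI path.
    parts = ";".join(
        f"{quote(c, safe='')}={quote(str(row[c]), safe='')}" for c in key_columns
    )
    return f"{base}{quote(table, safe='')}/{parts}"


iri = row_iri("http://example.org/db/", "people", ["id"], {"id": 7, "name": "Bob"})
```

With several candidate keys one would still have to pick one deterministically (for instance the primary key if declared); the sketch leaves that choice to the caller.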

>> * if there is no candidate key, but an internal row identifier (e.g.
>> Virtuoso always has one) use this row identifier,
>
> You can be tempted to use the row identifier but this one must remain
> hidden as it's not exposed in SQL. It's only accessible by the database
> vendor, not by the guys relying only on SQL, like in our prototype.

They might be available by means of a stored procedure or defined
function. If they are, why not use them?

>> * if neither one exists, generate an identifier using a hash function
>> over all values of the row + an incremented counter in case duplicate
>> rows exist
>
> "incremented counter" sounds like a side-effect to me :-) I'm interested
> to know how you will simply translate that in a mathematical function
> (ie. the output depends only on the input).

Why is that required? Since duplicate rows are not distinguishable
anyway, it also doesn't matter if their identifiers are permuted.
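
A rough sketch of the hash-plus-counter idea, under my own assumptions: SHA-256 as the hash, a length-prefixed encoding of the values, and a hypothetical `row_identifiers` helper. The counter is kept per hash value, so the set of identifiers produced for a table depends only on its (multiset of) rows, not on the order in which they are read.

```python
import hashlib
from collections import Counter


def row_identifiers(rows):
    """Return one identifier per row: hash of all values plus a duplicate counter.

    `rows` is a list of tuples. The encoding below is a simplification:
    values are compared by their string form, so 1 and "1" would collide.
    """
    seen = Counter()  # how often each digest has been emitted so far
    ids = []
    for row in rows:
        # Length-prefix each value so ("ab", "c") and ("a", "bc") differ;
        # NULL (None) gets its own marker.
        encoded = "".join(
            "N;" if v is None else f"{len(str(v))}:{v};" for v in row
        )
        digest = hashlib.sha256(encoded.encode("utf-8")).hexdigest()
        # Duplicate rows share a digest and are told apart by the counter only.
        ids.append(f"{digest}-{seen[digest]}")
        seen[digest] += 1
    return ids


rows = [("a", 1), ("b", 2), ("a", 1)]
ids = row_identifiers(rows)
```

Reading the same rows in a different order can swap which of two duplicates gets counter 0 and which gets 1, but the resulting set of identifiers is the same.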

> Anyway, the behaviour of such a URI mimics so much the semantics of a
> Blank Node that I really prefer to see a real Blank Node instead.

I have the impression that the semantics of blank nodes is rather unclear
and debated. From a practical point of view, the only difference between a
blank node and an IRI is that blank nodes are not globally unique and
thus not really usable with Linked Data (support for which is one of the
tasks in our charter).

Have a good night,

Sören

Received on Tuesday, 1 February 2011 23:34:10 UTC