Re: ISSUE-9 Another question about Generate Blank Nodes from Ivan Herman on 2011-02-02 (public-rdb2rdf-wg@w3.org from February 2011)

From: Ivan Herman <ivan@w3.org>
Date: Wed, 2 Feb 2011 10:57:46 +0100
To: Sören Auer <auer@informatik.uni-leipzig.de>
Cc: Juan Sequeda <juanfederico@gmail.com>, "Eric Prud'hommeaux" <eric@w3.org>, RDB2RDF Working Group WG <public-rdb2rdf-wg@w3.org>
Message-Id: <C43A565A-5B09-4D7B-B4B9-5DFDB8C3260C@w3.org>

Hi Sören,

I must admit that, much as I do understand the issues around bnodes, I am not convinced.

You might call it 'philosophical', but with my limited RDB knowledge and experience, if I have a table with no candidate key, then using an anonymous node seems to be the true representation of the table. Without going into the intricacies of the semantics, bnodes can be looked at as anonymous nodes. If an application wants to use real URIs, then the Direct Graph can be transformed (e.g. with a SPARQL CONSTRUCT) accordingly as a second step...

But I am also a bit afraid of the algorithm you provide. Again, I may be wrong about the way, say, Virtuoso works, but what if one inserts a row into a table? Wouldn't the internal row identifier change? If so, wouldn't that mean that, after several 'runs', you would get totally different triples for the same URI? Let alone the fact that these URIs may not be dereferenceable, this may create issues, mightn't it? After all, how would an application know that that particular URI is, shall we say, a fake bnode?

A much safer way would be to use some UUID-based URI scheme that somehow makes it explicit that (a) the URI represents an anonymous node and (b) each 'run' really would produce different results ('b' is secured by the UUID). But I am not sure we would gain too much.
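
As a rough illustration of such a UUID-based scheme, here is a minimal Python sketch. The namespace `http://example.org/anon/` and the names `mint_anonymous_iri` and `map_rows` are hypothetical and not anything the Working Group has defined; the only point is that the IRI itself signals "anonymous" and that every run yields different identifiers.

```python
import uuid

# Hypothetical namespace: the "/anon/" segment is an assumed convention
# for telling consumers that the IRI stands in for an anonymous node.
ANON_BASE = "http://example.org/anon/"


def mint_anonymous_iri(base: str = ANON_BASE) -> str:
    """Return a fresh IRI for a row that has no candidate key."""
    # uuid4() is random, so two calls practically never collide.
    return f"{base}{uuid.uuid4()}"


def map_rows(rows):
    """Pair every row with a newly minted IRI (one 'run' of the mapping)."""
    return [(mint_anonymous_iri(), row) for row in rows]


# Two identical rows in a table without a candidate key.
rows = [("Alice", 30), ("Alice", 30)]
run1 = map_rows(rows)
run2 = map_rows(rows)
```

Both duplicate rows get distinct IRIs, and `run1` and `run2` share no IRIs at all, which is the bnode-like behaviour described in (b).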

Ivan

On Feb 1, 2011, at 23:11 , Sören Auer wrote:

> Hi all,
> 
> In today's telco several people (including Souri and me) supported the idea of abandoning the use of blank nodes. Is there any fundamental reason (besides philosophical views) to use blank nodes?
> If not I suggest we just generate IRIs for all resources. Of course this does not yet solve the problem of how they should be created, but we could follow the following strategy:
> 
> * if there is a candidate key use the candidate key,
> * if there is no candidate key, but an internal row identifier (e.g. Virtuoso always has one) use this row identifier,
> * if neither one exists, generate an identifier using a hash function over all values of the row + an incremented counter in case duplicate rows exist
> 
> Wouldn't this be a simple and effective solution to the problem?
> 
> Best,
> 
> Sören
> 
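
The three-step strategy quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions: the base IRI, the `key/`, `rowid/` and `hash/` path segments, the choice of SHA-256, and the function name `row_iris` are all hypothetical, and rows are modelled as plain dictionaries rather than anything read from a real database.

```python
import hashlib
from collections import Counter
from urllib.parse import quote

BASE = "http://example.org/row/"  # hypothetical base IRI


def row_iris(rows, key_columns=None, row_ids=None):
    """Mint one IRI per row, following the quoted three-step strategy.

    rows        -- list of dicts mapping column name to value
    key_columns -- column names of a candidate key, if the table has one
    row_ids     -- internal row identifiers parallel to rows, if available
    """
    seen = Counter()  # how often each row hash has occurred so far
    iris = []
    for i, row in enumerate(rows):
        if key_columns:
            # Step 1: use the candidate key values.
            parts = (quote(str(row[c]), safe="") for c in key_columns)
            local = "key/" + "-".join(parts)
        elif row_ids is not None:
            # Step 2: fall back to the internal row identifier.
            local = f"rowid/{row_ids[i]}"
        else:
            # Step 3: hash all values; a counter separates duplicate rows.
            canonical = "\x1f".join(f"{c}={row[c]!r}" for c in sorted(row))
            digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
            seen[digest] += 1
            local = f"hash/{digest}-{seen[digest]}"
        iris.append(BASE + local)
    return iris


duplicates = [{"name": "Alice", "age": 30}, {"name": "Alice", "age": 30}]
hashed = row_iris(duplicates)
```

In this sketch the hash-based IRIs are stable across runs as long as the rows and their order stay the same, but the counter suffix depends on row order, and the `rowid/` variant is only as stable as the identifiers passed in.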

----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf


Received on Wednesday, 2 February 2011 09:57:32 UTC