ISSUE-9 Another question about Generate Blank Nodes

This may be something we have talked about, so sorry if I'm asking about
something that already has an answer.

We assume that a table that does not have a primary key will have a blank
node as the Row identifier for each tuple.

But what happens if the table does not have a primary key but does have one
or more candidate keys? Are we still generating a blank node as the Row
identifier for each tuple, or could we consider building an IRI from a
candidate key?

Consider the following example:

Schema
Projects(lead, name, deptName, deptCity) where UNIQUE(name, deptName,
deptCity)

Instances
Projects(8, pencil survey, accounting, cambridge)
Projects(8, eraser survey, accounting, cambridge)

For each tuple we could create a fresh blank node, or we could create a Row
IRI for each tuple using the candidate key:

<Projects/name=pencil survey,deptName=accounting,deptCity=cambridge>
<Projects/name=eraser survey,deptName=accounting,deptCity=cambridge>

These IRIs are unique because they come from unique keys.
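
As a rough sketch of the two options (not taken from the Direct Mapping
document): the function name row_identifier, the blank node labels, and the
percent-encoding of spaces are all assumptions made for illustration. The
IRIs above are written with literal spaces; the sketch encodes them so the
result is a syntactically valid IRI.

```python
from itertools import count
from urllib.parse import quote

# Hypothetical counter used only to label fresh blank nodes.
_bnode_ids = count(1)


def row_identifier(table, row, candidate_key=()):
    """Return a Row IRI built from a candidate key, else a fresh blank node.

    `row` maps column names to values; `candidate_key` lists the columns of
    one UNIQUE constraint (empty if the table has none).
    """
    if candidate_key:
        # Assumption: values are percent-encoded so the IRI stays valid.
        parts = ",".join(
            f"{col}={quote(str(row[col]), safe='')}" for col in candidate_key
        )
        return f"<{table}/{parts}>"
    # No key available: every call yields a new blank node label.
    return f"_:b{next(_bnode_ids)}"


key = ("name", "deptName", "deptCity")
pencil = {"lead": 8, "name": "pencil survey",
          "deptName": "accounting", "deptCity": "cambridge"}
eraser = {"lead": 8, "name": "eraser survey",
          "deptName": "accounting", "deptCity": "cambridge"}

print(row_identifier("Projects", pencil, key))
print(row_identifier("Projects", eraser, key))
print(row_identifier("Projects", pencil))  # no key: blank node
```

With the key, the same tuple always maps to the same IRI; without it, two
calls for the same tuple give two different blank nodes.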

What is the consensus here? I do not think this case is covered in the
current Direct Mapping document (right, Eric?).

Cheers

Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com


On Fri, Jan 21, 2011 at 2:41 PM, RDB2RDF Working Group Issue Tracker
<sysbot+tracker@w3.org> wrote:

>
> ISSUE-9 (bn_directmapping): Generate Blank Nodes for duplicate tuples
> [Direct Mapping]
>
> http://www.w3.org/2001/sw/rdb2rdf/track/issues/9
>
> Raised by: Juan Sequeda
> On product: Direct Mapping
>
> Given a table that does not have a primary key, which has duplicate tuples,
> a different blank node must be created for each tuple.
>
> In the Direct Mapping as rules section of the Direct Mapping document, we
> described this scenario by using all the values of the tuple to create the
> blank node [1] [2]. However, there is a bug, raised by Alexandre [3]. The
> issue is that datalog cannot deal with duplicates. Consequently, Marcelo
> raised the point that we can use simple versions of datalog that can deal
> with duplicate solutions.
>
> Possible solutions:
>
> 1) assume that each table implicitly has a row id which is part of its set
> of attributes. The row id is unique.
> 2) associate with each tuple an annotation that corresponds to the
> multiplicity of the tuple in the database. This annotation function
> corresponds to the function card in the definition of the semantics of
> SPARQL.
>
>
> [1]
> http://www.w3.org/TR/2010/WD-rdb-direct-mapping-20101118/#rules_table_triples_no_pk
> [2]
> http://www.w3.org/TR/2010/WD-rdb-direct-mapping-20101118/#rules_literal_triples_no_pk
> [3]
> http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011Jan/0044.html
>
>
>
>
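
For reference, solution 2 in the quoted issue could be sketched roughly as
follows. This is only an illustration under assumptions: the function name
blank_nodes_by_multiplicity and the blank node labels are made up, and a
Python Counter stands in for the multiplicity annotation (the role played by
card in the SPARQL semantics).

```python
from collections import Counter
from itertools import count


def blank_nodes_by_multiplicity(tuples):
    """Map each distinct tuple to one fresh blank node per occurrence."""
    # The Counter plays the role of the multiplicity annotation.
    multiplicity = Counter(tuples)
    ids = count(1)
    return {
        t: [f"_:b{next(ids)}" for _ in range(n)]
        for t, n in multiplicity.items()
    }


# A keyless table in which one tuple appears twice.
rows = [
    (8, "pencil survey", "accounting", "cambridge"),
    (8, "pencil survey", "accounting", "cambridge"),
    (8, "eraser survey", "accounting", "cambridge"),
]
for tup, nodes in blank_nodes_by_multiplicity(rows).items():
    print(tup, nodes)
```

The duplicated tuple receives two distinct blank nodes and the other tuple
receives one, so the duplicates stay distinguishable as the issue requires.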

Received on Monday, 31 January 2011 16:12:05 UTC