Re: New merged consolidated Direct Mapping version from Alexandre Bertails on 2010-11-13 (public-rdb2rdf-wg@w3.org from November 2010)

From: Alexandre Bertails <bertails@w3.org>
Date: Sat, 13 Nov 2010 15:34:35 -0500
To: Juan Sequeda <juanfederico@gmail.com>
Cc: public-rdb2rdf-wg@w3.org
Message-ID: <1289680475.9296.1.camel@simplet>
On Fri, 2010-11-12 at 09:17 -0600, Juan Sequeda wrote:
> Hi Everybody
> 
> 
> Just to remind everybody that the new merged consolidated document can
> be found here:
> 
> 
> http://www.w3.org/2001/sw/rdb2rdf/directMapping/

Looking at the roles of section 6, Direct Mapping as Rules, and
section 5, Direct Mapping Definition, I see an easy division between an
axiomatic semantics and an algebra which implements/conforms to that
semantics. As an example, section 6's generateColumnIRI declares a
binding between a list of column names and the corresponding RDF
predicate IRI. You can view generateColumnIRI without explicit
quantification (quoted from section 6):

  generateColumnIRI(x, y, z): Given a table name x and a non-empty list of columns y, it generates the Column IRI z

or with quantification:

  ∀ r ∈ Table, ∀ columns ∈ [ Column ], ∀ iri ∈ IRI, generateColumnIRI(r, columns, iri) ← nonempty(columns)

The generateColumnIRI rule is *realized* in Section 5's propertyIRI
mapping from a list of columns to an IRI:

  [32] propertyIRI(R, As) ≝ IRI(base + "/" + (join(',', UE(A.name)) ∣ A ∈ As ) "#" As.name)
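
To make the relation/function distinction concrete, here is a minimal
Python sketch. It is not the document's definition: the IRI shape
(base + "/" + table + "#" + comma-joined columns), the helper names, and
the example base IRI are all assumptions made for illustration only.

```python
from typing import Sequence
from urllib.parse import quote


def ue(name: str) -> str:
    """Percent-encode a name (a stand-in for the document's UE)."""
    return quote(name, safe="")


def property_iri(base: str, table: str, columns: Sequence[str]) -> str:
    """Functional reading: compute an IRI from a non-empty column list.

    The IRI shape used here is an assumption, not taken from the spec.
    """
    if not columns:
        raise ValueError("columns must be non-empty")
    return base + "/" + ue(table) + "#" + ",".join(ue(c) for c in columns)


def generate_column_iri(base: str, table: str,
                        columns: Sequence[str], iri: str) -> bool:
    """Relational reading: does (table, columns, iri) satisfy the rule?"""
    return bool(columns) and iri == property_iri(base, table, columns)
```

The predicate only says which bindings are acceptable; the function is
one way of producing a binding that the predicate accepts.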

More formally, given an axiomatic semantics
[[
  ∀ r ∈ Table, ∀ iri ∈ IRI, generateTableIRI(r, iri)
  ∀ r ∈ Table, ∀ columns ∈ [ Column ], ∀ iri ∈ IRI, generateColumnIRI(r, columns, iri) ← nonempty(columns)
  ∀ r ∈ Table, ∀ columns ∈ [ Column ], ∀ values ∈ [ value ], ∀ iri ∈ IRI,
    generateRowIRI(r, columns, values, iri) ← nonempty(columns), nonempty(values)
  ∀ r ∈ Table, ∀ values ∈ [ value ], ∀ bn ∈ BlankNode, generateRowBlankNode(r, values, bn) ← hasNoPrimaryKey(r)
  ∀ r ∈ Table, ∀ column ∈ Column, ∀ value ∈ value, getValue(r, column, value)
  ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cn ∈ Column, ∀ x1 ∈ value, ..., ∀ xn ∈ value,
    getListValue(r, [c1, ..., cn], [x1, ..., xn]) ← getValue(r, c1, x1), ..., getValue(r, cn, xn)
  (6.1.2 subsumes 6.1.1)
  ∀ s ∈ Subject, ∀ o ∈ Object, ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cm ∈ Column, ∀ pk ∈ [ Column ], ∀ |pk| ∈ [ value ],
    Triple(s, IRI("rdf:type"), o) ← r(c1, ..., cm),
                                    isPrimaryKey(r, pk),
                                    getListValue(r, pk, |pk|),
                                    generateRowIRI(r, pk, |pk|, s),
                                    generateTableIRI(r, o)
  (6.1.3)
  ∀ s ∈ Subject, ∀ o ∈ Object, ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cn ∈ Column,
    Triple(s, IRI("rdf:type"), o) ← r(c1, ..., cn),
                                    hasNoPrimaryKey(r),
                                    generateRowBlankNode(r, [c1, ..., cn], s),
                                    generateTableIRI(r, o)
  (6.2.2 subsumes 6.2.1)
    the 2 rules can be factorized as there is no reason to distinguish aj and bj (the conditions are the same)
    the "or" implies a split of the rule
  ∀ s ∈ Subject, ∀ p ∈ Predicate, ∀ xj ∈ value, ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cm ∈ Column,
                                                             ∀ c ∈ Column, ∀ x ∈ value,
                                                             ∀ pk ∈ [ Column ], ∀ |pk| ∈ [ value ],
    Triple(s, p, x) ← r(c1, ..., cm),
                      isPrimaryKey(r, pk),            // pk is the PK of r
                      in(c, pk),                      // c is a Column in pk
                      isNotForeignKey(r, [ c ]),      // c is not the only constituent of a foreign key of r
                      getListValue(r, pk, |pk|),
                      generateRowIRI(r, pk, |pk|, s),
                      generateColumnIRI(r, [ c ], p),
                      getValue(r, c, x)
    Triple(s, p, x) ← r(c1, ..., cm),
                      isPrimaryKey(r, pk),             // pk is the PK of r
                      in(c, pk),                       // c is a Column in pk
                      isForeignKey(r, [ c ]),          // c is the only constituent of a foreign key of r
                      references(r, [ c ], r', ck),    // c references a candidate key ck in another table
                      isPrimaryKey(r', ck),            // ck is the PK of this other table
                      getListValue(r, pk, |pk|),
                      generateRowIRI(r, pk, |pk|, s),
                      generateColumnIRI(r, [ c ], p),
                      getValue(r, c, x)
  (6.2.3)
  ∀ s ∈ Subject, ∀ p ∈ Predicate, ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cn ∈ Column, ∀ c ∈ Column, ∀ x ∈ value,
    Triple(s, p, x) ← r(c1, ..., cn),
                      hasNoPrimaryKey(r),
                      generateRowBlankNode(r, [c1, ..., cn], s),
                      in(c, [c1, ..., cn]),
                      generateColumnIRI(r, [ c ], p),
                      getValue(r, c, x)
]]

and an algebra:

[[
[1]    Database        ≝  { TableName → Table }
[2]    Table           ≝  ( Header, [CandidateKey], CandidateKey?, ForeignKeys, Body )
[3]    Header          ≝  { ColumnName → SQLDatatype }
[4]    CandidateKey    ≝  [ ColumnName ]
[5]    ForeignKeys     ≝  { [ColumnName] → ( Table, [ColumnName] ) }
[6]    SQLDatatype     ≝  { INT | FLOAT | DATE | TIME | TIMESTAMP | CHAR | VARCHAR | STRING }
[7]    Body            ≝  [ Tuple ]
[8]    Tuple           ≝  { ColumnName → CellValue }
[9]    CellValue       ≝  value | Null

[10]   Graph           ≝  { Triple }
[11]   Triple          ≝  ( Subject, Predicate, Object )
[12]   Subject         ≝  IRI | BlankNode
[13]   Predicate       ≝  IRI
[14]   Object          ≝  IRI | BlankNode | Literal
[15]   IRI             ≝  RDF URI-reference as subsequently restricted by SPARQL
[16]   BlankNode       ≝  RDF blank node
[17]   Literal         ≝  PlainLiteral | TypedLiteral
[18]   PlainLiteral    ≝  (lexicalForm) | (lexicalForm, languageTag)
[19]   TypedLiteral    ≝  (lexicalForm, IRI)
]]

one could show that the algebra fits the axiomatic semantics. In
"Data Exchange: Semantics and Query Answering", Fagin et al. focused
on separating the axiomatic semantics (which they call the "universal
solution") from their data exchange algorithms.
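
As a rough illustration of what "the algebra fits the axiomatic
semantics" could mean in practice, the Python sketch below generates
rdf:type triples (the algebra side) and then checks them against a loose
reading of rules 6.1.2 and 6.1.3 (the axiomatic side). The base IRI, the
row IRI shape, the blank node labels, and the much simplified Table type
are assumptions for illustration. The check is a test on one input, not
a proof.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple
from urllib.parse import quote

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.com/db"  # hypothetical base IRI

Triple = Tuple[str, str, str]


@dataclass
class Table:
    name: str
    primary_key: Optional[List[str]]  # None when the table has no PK
    body: List[Dict[str, object]]     # one dict per row


def table_iri(t: Table) -> str:
    return BASE + "/" + quote(t.name, safe="")


def row_subject(t: Table, index: int, row: Dict[str, object]) -> str:
    """Row IRI when there is a primary key, otherwise a blank node label."""
    if t.primary_key:
        pairs = ",".join(
            quote(c, safe="") + "." + quote(str(row[c]), safe="")
            for c in t.primary_key
        )
        return table_iri(t) + "/" + pairs + "#_"  # assumed IRI shape
    return "_:" + t.name + "_" + str(index)       # assumed label scheme


def type_triples(t: Table) -> Set[Triple]:
    """Algebra side: compute one rdf:type triple per row."""
    return {
        (row_subject(t, i, row), RDF_TYPE, table_iri(t))
        for i, row in enumerate(t.body)
    }


def conforms(t: Table, graph: Set[Triple]) -> bool:
    """Axiomatic side: one typed subject per row, and the subject is a
    blank node exactly when the table has no primary key."""
    typed = [s for (s, p, o) in graph if p == RDF_TYPE and o == table_iri(t)]
    want_blank = not t.primary_key
    return len(typed) == len(t.body) and all(
        s.startswith("_:") == want_blank for s in typed
    )
```

For example, conforms(t, type_triples(t)) should hold for any table t
with distinct primary key values, while conforms(t, set()) fails as soon
as t has at least one row.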

Alexandre.


> 
> 
> Old versions of the document are:
> 
> 
> http://www.w3.org/2001/sw/rdb2rdf/directGraph/
> http://www.w3.org/2001/sw/rdb2rdf/directGraph/alt
> 
> 
> 
> 
> Looking forward to your comments
> 
> 
> Juan Sequeda
> +1-575-SEQ-UEDA
> www.juansequeda.com
>
Received on Saturday, 13 November 2010 20:34:43 UTC