- From: Alexandre Bertails <bertails@w3.org>
- Date: Sat, 13 Nov 2010 16:37:25 -0500
- To: Juan Sequeda <juanfederico@gmail.com>
- Cc: "public-rdb2rdf-wg@w3.org" <public-rdb2rdf-wg@w3.org>
On Sat, 2010-11-13 at 14:47 -0600, Juan Sequeda wrote:
> I'd like to go through this thoroughly but I believe this looks a lot like:
>
> http://www.w3.org/2001/sw/rdb2rdf/wiki/Database-Instance-Only_and_Database-Instances-and-Schema_Mapping
>
> This was Marcelo and my proposal a longggg time ago.

Yes, Eric made me read it a longggg time ago :-)

But this is not the same approach (and I prefer the one you took in the merged document).

In the merged spec, you say things like [[ Assume that r(a, b1, ..., bn) is a table with columns a, b1, ..., bn ... ]]. It's not clear whether this means "I have a function from a relation r in RDB to a Datalog rule", or whether you are giving an axiomatic description of the truth in a particular case.

I understood it as an axiomatic description with universal quantification (over a universe of discourse, which is also missing in your rules). As there is no reason to keep two models of computation in the same spec, I assumed you were not competing with the mapping itself (which, I recall, is by definition a function) by proposing a new one. And if this were actually a function from RDB to Datalog, I would have expected to see the formal definition of a function with a clear domain and codomain.

So, to be sure I was understanding your rules, I spontaneously started to annotate the variables. Then, to get rid of the English (I always have trouble with descriptions in English, as they escape the formalism and hide the difficulty), I pushed the plain-text constraints into the rules, one by one.

I found it very pleasant to see that you actually use Higher Order Logic (the [[ Assume that ]] were the clue, but I did not get it right away). By putting more formalism into the rules, I really understood that you were giving a nice semantic framework for the Direct Mapping, more than giving a way to compute it. The icing on the cake is that you never have to say *how* you compute an IRI, for example. You just have to say that it exists!

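To make the "exists" versus "how" point concrete, here is a minimal sketch (mine, not taken from either section of the spec). The names `property_iri` and `generate_column_iri`, the base IRI, and the IRI shape are all hypothetical. The first function is one possible way to *compute* a column IRI; the second only *checks* whether a given (table, columns, iri) triple is in the relation that some realization induces, which is closer to the axiomatic reading.

```python
from typing import Callable, Sequence
from urllib.parse import quote

BASE = "http://example.org/db"  # hypothetical base IRI, for illustration only


def property_iri(table: str, columns: Sequence[str]) -> str:
    """One possible realization: compute an IRI from a table and its columns.

    The IRI shape used here is an assumption, not the spec's definition.
    """
    if not columns:
        raise ValueError("columns must be non-empty")
    encoded = ",".join(quote(c, safe="") for c in columns)
    return f"{BASE}/{quote(table, safe='')}#{encoded}"


def generate_column_iri(
    table: str,
    columns: Sequence[str],
    iri: str,
    realization: Callable[[str, Sequence[str]], str] = property_iri,
) -> bool:
    """Relational reading: does (table, columns, iri) hold?

    The relation is only defined for non-empty column lists; which IRI is
    related to them is delegated to the chosen realization.
    """
    return len(columns) > 0 and realization(table, columns) == iri
```

Swapping in a different `realization` changes how the IRI is built without touching the statement that a column IRI exists for every non-empty column list, which is roughly the separation discussed above.
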
The algebra tells you the "what" (the Abstract Models) and the "how" (the mapping functions), whereas your Axiomatic Semantics tells you the truth in the model.

May I suggest that the editors (Eric, that includes you) make clear the relation between the Direct Mapping (the algebra) and its Axiomatic Semantics?

Alexandre.

>
> Juan Sequeda
> www.juansequeda.com
>
> On Nov 13, 2010, at 2:34 PM, Alexandre Bertails <bertails@w3.org> wrote:
>
> > On Fri, 2010-11-12 at 09:17 -0600, Juan Sequeda wrote:
> >> Hi Everybody
> >>
> >> Just to remind everybody that the new merged consolidated document can
> >> be found here:
> >>
> >> http://www.w3.org/2001/sw/rdb2rdf/directMapping/
> >
> > Looking at the roles of section 6 Direct Mapping as Rules
> > and 5 Direct Mapping Definition, I see an easy division between an
> > axiomatic semantics and an algebra which implements/conforms to that
> > semantics. As an example, section 6's generateColumnIRI declares a
> > binding between a list of column names and the corresponding RDF
> > predicate IRI. You can view generateColumnIRI without explicit
> > quantification (quoted from section 6):
> >
> >   generateColumnIRI(x, y, z): Given a table name x and a non-empty list of columns y, it generates the Column IRI z
> >
> > or with quantification:
> >
> >   ∀ r ∈ Table, ∀ columns ∈ [ Column ], ∀ iri ∈ IRI, generateColumnIRI(r, columns, iri) ← nonempty(columns)
> >
> > The generateColumnIRI rule is *realized* in Section 5's propertyIRI
> > mapping from a list of columns to an IRI:
> >
> >   [32] propertyIRI(R, As) ≝ IRI(base + "/" + (join(',', UE(A.name)) ∣ A ∈ As ) "#" As.name)
> >
> > More formally, given an axiomatic semantics
> > [[
> > ∀ r ∈ Table, ∀ iri ∈ IRI, generateTableIRI(r, iri)
> > ∀ r ∈ Table, ∀ columns ∈ [ Column ], ∀ iri ∈ IRI, generateColumnIRI(r, columns, iri) ← nonempty(columns)
> > ∀ r ∈ Table, ∀ columns ∈ [ Column ], ∀ values ∈ [ value ], ∀ iri ∈ IRI,
> >   generateRowIRI(r, columns, values, iri) ← nonempty(columns), nonempty(values)
> > ∀ r ∈ Table, ∀ values ∈ [ value ], ∀ bn ∈ BlankNode, generateRowBlankNode(r, values, bn) ← hasNoPrimaryKey(r)
> > ∀ r ∈ Table, ∀ column ∈ Column, ∀ value ∈ value, getValue(r, column, value)
> > ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cn ∈ Column, ∀ x1 ∈ value, ..., ∀ xn ∈ value,
> >   getListValue(r, [c1, ..., cn], [x1, ..., xn]) ← getValue(r, c1, x1), ..., getValue(r, cn, xn)
> >
> > (6.1.2 subsumes 6.1.1)
> > ∀ s ∈ Subject, ∀ o ∈ Object, ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cm ∈ Column, ∀ pk ∈ [ Column ], ∀ |pk| ∈ [ value ],
> >   Triple(s, IRI("rdf:type"), o) ← r(c1, ..., cm),
> >                                   isPrimaryKey(r, pk),
> >                                   getListValue(r, pk, |pk|),
> >                                   generateRowIRI(r, pk, |pk|, s),
> >                                   generateTableIRI(r, o)
> >
> > (6.1.3)
> > ∀ s ∈ Subject, ∀ o ∈ Object, ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cn ∈ Column,
> >   Triple(s, IRI("rdf:type"), o) ← r(c1, ..., cn),
> >                                   hasNoPrimaryKey(r),
> >                                   generateRowBlankNode(r, [c1, ..., cn], s),
> >                                   generateTableIRI(r, o)
> >
> > (6.2.2 subsumes 6.2.1)
> > the 2 rules can be factorized as there is no reason to distinguish aj and bj (the conditions are the same)
> > the "or" implies a split of the rule
> > ∀ s ∈ Subject, ∀ p ∈ Predicate, ∀ xj ∈ value, ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cm ∈ Column,
> > ∀ c ∈ Column, ∀ x ∈ value,
> > ∀ pk ∈ [ Column ], ∀ |pk| ∈ [ value ],
> >   Triple(s, p, x) ← r(c1, ..., cm),
> >                     isPrimaryKey(r, pk),            // pk is the PK of r
> >                     in(c, pk),                      // c is a Column in pk
> >                     isNotForeignKey(r, [ c ]),      // c is not the only constituent of a foreign key of r
> >                     getListValue(r, pk, |pk|),
> >                     generateRowIRI(r, pk, |pk|, s),
> >                     generateColumnIRI(r, [ c ], p),
> >                     getValue(r, c, x)
> >   Triple(s, p, x) ← r(c1, ..., cm),
> >                     isPrimaryKey(r, pk),            // pk is the PK of r
> >                     in(c, pk),                      // c is a Column in pk
> >                     isForeignKey(r, [ c ]),         // c is the only constituent of a foreign key of r
> >                     references(r, [ c ], r', ck),   // c references a candidate key ck in another table
> >                     isPrimaryKey(r', ck),           // ck is the PK of this other table
> >                     getListValue(r, pk, |pk|),
> >                     generateRowIRI(r, pk, |pk|, s),
> >                     generateColumnIRI(r, [ c ], p),
> >                     getValue(r, c, x)
> >
> > (6.2.3)
> > ∀ s ∈ Subject, ∀ p ∈ Predicate, ∀ r ∈ Table, ∀ c1 ∈ Column, ..., ∀ cn ∈ Column, ∀ c ∈ Column, ∀ x ∈ value,
> >   Triple(s, p, x) ← r(c1, ..., cn),
> >                     hasNoPrimaryKey(r),
> >                     generateRowBlankNode(r, [c1, ..., cn], s),
> >                     in(c, [c1, ..., cn]),
> >                     generateColumnIRI(r, [ c ], p),
> >                     getValue(r, c, x)
> > ]]
> >
> > and an algebra:
> >
> > [[
> > [1] Database ≝ { TableName → Table }
> > [2] Table ≝ ( Header, [CandidateKey], CandidateKey?, ForeignKeys, Body )
> > [3] Header ≝ { ColumnName → SQLDatatype }
> > [4] CandidateKey ≝ [ ColumnName ]
> > [5] ForeignKeys ≝ { [ColumnName] → ( Table, [ColumnName] ) }
> > [6] SQLDatatype ≝ { INT | FLOAT | DATE | TIME | TIMESTAMP | CHAR | VARCHAR | STRING }
> > [7] Body ≝ [ Tuple ]
> > [8] Tuple ≝ { ColumnName → CellValue }
> > [9] CellValue ≝ value | Null
> >
> > [10] Graph ≝ { Triple }
> > [11] Triple ≝ ( Subject, Predicate, Object )
> > [12] Subject ≝ IRI | BlankNode
> > [13] Predicate ≝ IRI
> > [14] Object ≝ IRI | BlankNode | Literal
> > [15] IRI ≝ RDF URI-reference as subsequently restricted by SPARQL
> > [16] BlankNode ≝ RDF blank node
> > [17] Literal ≝ PlainLiteral | TypedLiteral
> > [18] PlainLiteral ≝ (lexicalForm) | (lexicalForm, langageTag)
> > [19] TypedLiteral ≝ (lexicalForm, IRI)
> > ]]
> >
> > , one could show that the algebra fits the axiomatic semantics. In
> > "Data Exchange: Semantics and Query Answering", Fagin et al. focused
> > on separating the axiomatic semantics (which they call the "universal
> > solution") from their data exchange algorithms.
> >
> > Alexandre.
> >
> >>
> >> Old versions of the document are:
> >>
> >> http://www.w3.org/2001/sw/rdb2rdf/directGraph/
> >> http://www.w3.org/2001/sw/rdb2rdf/directGraph/alt
> >>
> >> Looking forward to your comments
> >>
> >> Juan Sequeda
> >> +1-575-SEQ-UEDA
> >> www.juansequeda.com
Received on Saturday, 13 November 2010 21:37:33 UTC