- From: Juan Sequeda <juanfederico@gmail.com>
- Date: Wed, 18 May 2011 07:32:54 -0500
- To: Alexandre Bertails <bertails@w3.org>
- Cc: Richard Cyganiak <richard@cyganiak.de>, public-rdb2rdf-wg@w3.org
- Message-ID: <BANLkTinc8JYCOmhNapbxhAyVRRhtaRERxg@mail.gmail.com>
Alexandre,

Please see [1] for an example.

[1] http://lists.w3.org/Archives/Public/public-rdb2rdf-wg/2011May/0049.html

Juan Sequeda
+1-575-SEQ-UEDA
www.juansequeda.com

On Wed, May 18, 2011 at 6:51 AM, Alexandre Bertails <bertails@w3.org> wrote:

> On Wed, 2011-05-18 at 12:07 +0100, Richard Cyganiak wrote:
> > Hi Juan,
> >
> > On 18 May 2011, at 05:44, Juan Sequeda wrote:
> > > IF the direct mapping has knowledge of the schema then translating NULLs is not necessary for information preserving.
> >
> > Yes.
>
> What do you guys mean by "the direct mapping has knowledge of the schema"?
>
> Alexandre.
>
> > > However, the direct mapping as it is in its current version does not consider the schema at all.
> >
> > Correct.
> >
> > > It would be information preserving as-is, if we were to also translate NULLs.
> >
> > And this is wrong. For the direct mapping to be information preserving, we'd have to be able to reconstruct the schema of an EMPTY TABLE after the table is translated to RDF via the direct mapping. But an empty table produces NO TRIPLES, and from no triples you cannot reconstruct the original relational table!
> >
> > > My proposal would be to extend the direct mapping to consider the schema and translate it to RDFS/OWL. But I would like to know what others think.
> >
> > But can you capture all of the semantics of the SQL model? PKs, FKs, data types, nullability, multiset semantics and so on? Or are you suggesting to do just the minimal RDFS domain/range thing?
> >
> > Best,
> > Richard
> >
> > > Best,
> > > Richard
> > >
> > > On 17 May 2011, at 19:01, Juan Sequeda wrote:
> > >
> > > > Group,
> > > >
> > > > By information preserving, I mean that given the RDF data, I can reconstruct the relational table with all its values.
> > > > Informally, given an identity SQL query (a query that outputs the whole table: SELECT * FROM table), there exists a SPARQL query which is executed on the RDF data and will return the same results as the identity SQL query.
> > > >
> > > > There are two cases for information preserving
> > > >
> > > > 1) We have knowledge of the schema
> > > >
> > > > If the relational schema is directly mapped to RDFS/OWL, then we DO NOT need to translate nulls in order to preserve information. For example, consider the table R with attributes A and B and instances:
> > > >
> > > > R(Bob, NULL)
> > > > R(Alice, 25)
> > > >
> > > > The ontology from this schema is
> > > >
> > > > <R> <type> <class>
> > > > <A> <type> <property>
> > > > <A> <domain> <R>
> > > > <A> <range> <whatever datatype>
> > > > <B> <type> <property>
> > > > <B> <domain> <R>
> > > > <B> <range> <whatever datatype>
> > > >
> > > > And the RDF data, without translating nulls:
> > > >
> > > > <row1> <R#A> "Bob"
> > > > <row2> <R#A> "Alice"
> > > > <row2> <R#B> "25"
> > > >
> > > > The identity SQL query is
> > > >
> > > > SELECT A, B FROM R
> > > >
> > > > Given that we know the schema, we can construct a SPARQL query:
> > > >
> > > > SELECT ?a ?b
> > > > WHERE{
> > > >   ?x <R#A> ?a
> > > >   OPTIONAL{
> > > >     ?x <R#B> ?b
> > > >   }
> > > > }
> > > >
> > > > There we go... with that SPARQL query, we can reconstruct the original relational table. No need of nulls. If we generated triples for NULL values, then the SPARQL query wouldn't need OPTIONAL. The point here is that we don't need triples for NULL values.
> > > >
> > > > 2) We don't have knowledge of the schema
> > > >
> > > > If we do not have knowledge of the schema, then we can't create a SPARQL query like the previous example. Just imagine that you can only look at the RDF data.
> > > > For example, consider the following RDF:
> > > >
> > > > <row1> <R#A> "Bob"
> > > > <row2> <R#A> "Alice"
> > > > <row2> <R#B> "25"
> > > >
> > > > Given that row 2 has <R#B> and row 1 doesn't, I could guess that the value of row 1 for attribute B is null. But what if the original table has a column C and every single row has a NULL value for that column? In this case, it would be necessary to explicitly translate NULL values into an RDF triple. Otherwise, the mapping would not be information preserving.
> > > >
> > > > CONCLUSION:
> > > >
> > > > - At this moment, neither the Direct Mapping nor R2RML considers the schema; therefore, in order for the mappings to be information preserving, we must explicitly translate NULL values to an RDF triple.
> > > > - We need to figure out how this triple is going to show up.
> > > > - From a theoretical side, if we do not generate triples for NULL values, the mapping is monotonic. On the other hand, generating triples for NULL values will make the mapping non-monotonic. Do we care? Not really. But implementation and performance-wise, there can be some overhead when dealing with non-monotonicity.
> > > >
> > > > Juan Sequeda
> > > > +1-575-SEQ-UEDA
> > > > www.juansequeda.com
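
The two cases discussed in the quoted thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Direct Mapping or R2RML algorithm: triples are plain tuples, and the helper names `direct_map`, `columns_seen` and `reconstruct` are hypothetical. The table R is extended with the all-NULL column C from case 2.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]   # column name -> value, None stands for SQL NULL
Triple = Tuple[str, str, str]    # (subject, predicate, object)


def direct_map(table: str, rows: List[Row]) -> List[Triple]:
    """Hypothetical mapping: one triple per non-NULL cell, nothing for NULL."""
    triples = []
    for i, row in enumerate(rows, start=1):
        for col, val in row.items():
            if val is not None:
                triples.append((f"row{i}", f"{table}#{col}", val))
    return triples


def columns_seen(triples: List[Triple]) -> List[str]:
    """Columns that can be discovered from the RDF data alone (no schema)."""
    return sorted({p.split("#", 1)[1] for _, p, _ in triples})


def reconstruct(triples: List[Triple], columns: List[str]) -> List[Row]:
    """Rebuild rows; a missing triple is read as NULL, as OPTIONAL would do."""
    by_subject: Dict[str, Dict[str, str]] = {}
    for s, p, o in triples:
        by_subject.setdefault(s, {})[p.split("#", 1)[1]] = o
    # Rows that were entirely NULL produced no triples and cannot reappear.
    return [{c: cells.get(c) for c in columns} for cells in by_subject.values()]


rows = [
    {"A": "Bob", "B": None, "C": None},
    {"A": "Alice", "B": "25", "C": None},
]
triples = direct_map("R", rows)

# Case 1: the schema (column list) is known, so the table is recovered.
with_schema = reconstruct(triples, ["A", "B", "C"])

# Case 2: only the RDF data is available, so column C is never seen.
without_schema = reconstruct(triples, columns_seen(triples))
```

With the column list supplied, `with_schema` equals the original `rows`. Without it, `without_schema` has no column C at all, which is the loss of information that case 2 describes.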
Received on Wednesday, 18 May 2011 12:33:52 UTC