- From: Michael Hausenblas <michael.hausenblas@deri.org>
- Date: Tue, 17 May 2011 19:15:52 +0100
- To: Juan Sequeda <juanfederico@gmail.com>
- Cc: public-rdb2rdf-wg@w3.org
Juan, All,

Just a procedural comment: if you post (especially when you open a new
thread), please mention the related issue somewhere (subject or text or
both) so that the tracker can, well, keep track of it ;)

Tracker, this is related to ISSUE-41.

Cheers,
Michael
--
Dr. Michael Hausenblas, Research Fellow
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html

On 17 May 2011, at 19:01, Juan Sequeda wrote:

> Group,
>
> By information preserving, I mean that given the RDF data, I can
> reconstruct the relational table with all its values. Informally,
> given an identity SQL query (a query that outputs the whole table:
> SELECT * FROM table), there exists a SPARQL query which, executed
> on the RDF data, returns the same results as the identity SQL
> query.
>
> There are two cases for information preserving:
>
> 1) We have knowledge of the schema
>
> If the relational schema is directly mapped to RDFS/OWL, then we DO
> NOT need to translate nulls in order to preserve information. For
> example, consider the table R with attributes A and B and instances:
>
> R(Bob, NULL)
> R(Alice, 25)
>
> The ontology from this schema is
>
> <R> <type> <class>
> <A> <type> <property>
> <A> <domain> <R>
> <A> <range> <whatever datatype>
> <B> <type> <property>
> <B> <domain> <R>
> <B> <range> <whatever datatype>
>
> And the RDF data, without translating nulls:
>
> <row1> <R#A> "Bob"
> <row2> <R#A> "Alice"
> <row2> <R#B> "25"
>
> The identity SQL query is
>
> SELECT A, B FROM R
>
> Given that we know the schema, we can construct a SPARQL query:
>
> SELECT ?a ?b
> WHERE{
>   ?x <R#A> ?a
>   OPTIONAL{
>     ?x <R#B> ?b
>   }
> }
>
> There we go... with that SPARQL query, we can reconstruct the
> original relational table. No need for nulls. If we generated triples
> for NULL values, then the SPARQL query wouldn't need OPTIONALs. The
> point here is that we don't need triples for NULL values.
>
> 2) We don't have knowledge of the schema
>
> If we do not have knowledge of the schema, then we can't create a
> SPARQL query like the previous example. Just imagine that you can
> only look at the RDF data. For example, consider the following RDF:
>
> <row1> <R#A> "Bob"
> <row2> <R#A> "Alice"
> <row2> <R#B> "25"
>
> Given that row 2 has <R#B> and row 1 doesn't, I could guess that the
> value of row 1 for attribute B is null. But what if the original
> table has a column C and every single row has a NULL value for that
> column? Then no triple mentions C at all, and nothing in the RDF
> reveals that the column exists. In this case, it would be necessary
> to explicitly translate NULL values into an RDF triple. Otherwise,
> the mapping would not be information preserving.
>
>
> CONCLUSION:
>
> - At this moment, neither the Direct Mapping nor R2RML considers the
> schema; therefore, in order for the mappings to be Information
> Preserving, we must explicitly translate NULL values to an RDF triple.
> - We need to figure out how this triple is going to show up.
> - From a theoretical side, if we do not generate triples for NULL
> values, the mapping is monotonic. On the other hand, generating
> triples for NULL values will make the mapping non-monotonic. Do we
> care? Not really. But implementation- and performance-wise, there can
> be some overhead when dealing with non-monotonicity.
>
>
> Juan Sequeda
> +1-575-SEQ-UEDA
> www.juansequeda.com
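
The two cases in the quoted message can be illustrated with a small sketch. This is only an illustration and is not part of the Direct Mapping or R2RML: the triples are modelled as plain Python tuples, the function names are made up for this example, and column C is the hypothetical all-NULL column from case 2.

```python
from collections import defaultdict

# RDF data from the example, modelled as (subject, predicate, object) tuples.
# No triples are emitted for NULL values.
TRIPLES = [
    ("row1", "R#A", "Bob"),
    ("row2", "R#A", "Alice"),
    ("row2", "R#B", "25"),
]


def group_by_row(triples):
    """Collect the column values of each row subject (hypothetical helper)."""
    rows = defaultdict(dict)
    for subject, predicate, obj in triples:
        column = predicate.split("#", 1)[1]  # "R#A" -> "A"
        rows[subject][column] = obj
    return rows


def reconstruct_with_schema(triples, columns):
    """Case 1: the column list is known, so a missing triple reads as NULL.

    This plays the role of the OPTIONAL patterns in the SPARQL query.
    """
    rows = group_by_row(triples)
    return [tuple(rows[s].get(c) for c in columns) for s in sorted(rows)]


def reconstruct_without_schema(triples):
    """Case 2: columns can only be guessed from predicates that occur."""
    rows = group_by_row(triples)
    columns = sorted({c for values in rows.values() for c in values})
    table = [tuple(rows[s].get(c) for c in columns) for s in sorted(rows)]
    return columns, table


# With the schema (A, B, C), the all-NULL column C is recovered.
with_schema = reconstruct_with_schema(TRIPLES, ["A", "B", "C"])

# Without the schema, column C never appears in the data and is lost.
guessed_columns, without_schema = reconstruct_without_schema(TRIPLES)
```

Under these assumptions, `with_schema` contains three values per row, with `None` standing in for NULL, while `guessed_columns` contains only A and B, which is the loss of information described in case 2.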
Received on Tuesday, 17 May 2011 18:16:20 UTC