Re: [db2triples] Implementation of R2RML and Direct Mapping : questions and comments

Dear Julien

Thanks for your R2RML implementation, db2triples!

On Apr 9, 2012, at 10:42 PM, Julien Homo wrote:

> Dear RDB2RDF Working Group members,
> 
> I'm currently working on the new version of db2triples, an implementation of the two standards R2RML and Direct Mapping that is registered with the W3C.
> 
> This version supports the two W3C Candidate Recommendations of 23 February 2012, and I'm trying to test it using the official W3C test cases.
> The vast majority of them are validated on my local machine, and I want to send you my technical report in accordance with the procedure described at http://www.w3.org/2001/sw/rdb2rdf/wiki/Submitting_Test_Results.
> 
> However, I am having some difficulties, particularly with Phase 4, "Run the TH software against your Test Results". I know the TH software is available only for the Direct Mapping, but it doesn't work on my DM tests either.
> I have modified my ts.ttl file to point to the location where the Test Suite and the test results are stored.
> In this folder, all the subdirectories (with names like "DXXX-xxx") contain a Turtle file, directGraph-db2triples.ttl, but I get this error message:
> 
> "java.lang.NullPointerException
>     at org.rdb2rdf.testcase.th.TCScanner.scan(TCScanner.java:128)
>     at org.rdb2rdf.testcase.th.TCScanner.main(TCScanner.java:199)
> java.lang.NullPointerException
>     at org.rdb2rdf.testcase.th.model.RDB2RDFTC.saveEarlModel(RDB2RDFTC.java:332)
>     at org.rdb2rdf.testcase.th.TCScanner.scan(TCScanner.java:143)
>     at org.rdb2rdf.testcase.th.TCScanner.main(TCScanner.java:199)"
> 
> Do you have any idea what causes this problem?

There seems to be a problem with the folder you provided in the ts.ttl file. I have updated the library [1]; it now shows the folder you are working with when an error occurs. Would you please try again? It would also help if you could send me a zipped copy of your folder.


> 
> About my tests and the Test Cases: I have executed all the SQL scripts against a MySQL and a PostgreSQL database, and I have some questions and comments too:
> 
> - My R2RMLTC0009c and R2RMLTC0009d tests fail because the mapping succeeded with an unnamed column on both DBMSs: why must any column in the SELECT list derived by projecting an expression, such as one using the COUNT keyword, be named? Is it to conform to Core SQL 2008?

This is what the R2RML spec [2] says: "… The result of the query execution must not have duplicate column names. Any columns in the SELECT list derived by projecting an expression must be named …"
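
To illustrate the rule quoted above, here is a minimal Python sketch using the standard sqlite3 module (which is not one of the DBMSs discussed in this thread). The Student table, its rows, and the helper names are made up for illustration; only the SPORTCOUNT alias comes from the test case. It lists the column names the engine reports for a query with and without an AS alias. What an engine reports for an unaliased expression is DBMS-specific, which may be one practical reason for requiring a name.

```python
import sqlite3
from collections import Counter


def result_column_names(conn, query):
    """Return the column names the DBMS reports for a query's result set."""
    cur = conn.execute(query)
    return [d[0] for d in cur.description]


def duplicate_column_names(names):
    """Return the names that occur more than once (exact string comparison)."""
    counts = Counter(names)
    return sorted(n for n, c in counts.items() if c > 1)


# Hypothetical table, loosely modelled on the test case input.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Name VARCHAR(50), Sport VARCHAR(50))")
conn.executemany(
    "INSERT INTO Student VALUES (?, ?)",
    [("Venus", "Tennis"), ("Venus", "Golf")],
)

# Without AS, the name of the COUNT column is left to the DBMS.
unnamed = result_column_names(
    conn, "SELECT Name, COUNT(Sport) FROM Student GROUP BY Name"
)
# With AS, the mapping can refer to the column reliably as SPORTCOUNT.
named = result_column_names(
    conn, "SELECT Name, COUNT(Sport) AS SPORTCOUNT FROM Student GROUP BY Name"
)

print(unnamed)
print(named)
print(duplicate_column_names(named))
```

A check like duplicate_column_names could be run on the result metadata before generating triples, so that a mapping with an invalid query is reported as an error instead of silently succeeding.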


> Besides, in the second test, an implicit integer datatype is associated with the SPORTCOUNT column, but this is not the case in your result. Is a simple cast to string required for this column?

This is something I have to check; we'll discuss this within the WG.

> 
> - The R2RMLTC0016e and DirectGraphTC0016 tests fail: I have to modify the SQL input file so that PostgreSQL does not raise a syntax error. On the one hand, BINARY VARYING is an unknown datatype for PostgreSQL (the bytea data type allows storage of binary strings, http://www.postgresql.org/docs/9.1/static/datatype-binary.html). On the other hand, character encodings like "\ux2F" instead of "/" are not recognized. Is it necessary to adhere strictly to the syntax of the SQL query to validate this test, or can these queries be adapted?

We'll discuss this within the WG. AFAIR, it is possible to adapt the SQL queries to a particular DBMS.

> 
> - Finally I spotted some misprints in the Test Cases :
> 
> * R2RMLTC0014c : in the expected result, the datatypes of these generated literals have to be switched :
> 
> <http://example.com/emp/7369> <http://example.com/emp#deptNum> "10"^^<http://www.w3.org/2001/XMLSchema#positiveInteger>
> <http://example.com/emp/7369> <http://example.com/dept#deptno> "10"^^<http://www.w3.org/2001/XMLSchema#integer>

I'm going to fix this … thanks!

> 
> * R2RMLTC0016b and DirectGraphTC0016: You use the canonical RDF lexical form for double datatypes, like "80.25E0", but it is "8.025E1" that appears in DirectGraphTC0016. The R2RML CR indicates that the choice of lexical form is implementation-dependent, but my test fails because these results are not consistent. Can you confirm this expected result?

We'll discuss this within the WG.
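
Until this is settled, a test harness can compare doubles by value instead of by string. The sketch below is only an illustration, not code from the test harness: canonical_xsd_double is a hypothetical helper that formats a Python float with one non-zero digit before the decimal point and an exponent without leading zeros, which is my understanding of the XML Schema canonical representation for xsd:double. Under that assumption, "80.25E0" and "8.025E1" normalize to the same string.

```python
import math
from decimal import Decimal


def canonical_xsd_double(value: float) -> str:
    """Format a float as mantissa 'E' exponent, e.g. 80.25 -> '8.025E1'."""
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "INF" if value > 0 else "-INF"
    if value == 0:
        # Assumption: negative zero keeps its sign in the canonical form.
        return "-0.0E0" if math.copysign(1.0, value) < 0 else "0.0E0"
    # repr() yields the shortest decimal string that round-trips to the float.
    sign, digits, exponent = Decimal(repr(value)).as_tuple()
    sci_exponent = len(digits) - 1 + exponent
    text = "".join(map(str, digits)).rstrip("0")
    mantissa = text[0] + "." + (text[1:] or "0")
    return ("-" if sign else "") + mantissa + "E" + str(sci_exponent)


def same_double(lexical_a: str, lexical_b: str) -> bool:
    """Compare two finite xsd:double lexical forms by value, not by spelling."""
    return float(lexical_a) == float(lexical_b)


print(canonical_xsd_double(float("80.25E0")))  # prints 8.025E1
print(same_double("80.25E0", "8.025E1"))       # prints True
```

Comparing by value keeps a test from failing only because two implementations chose different, equally valid spellings of the same number.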

> 
> * R2RMLTC0014b: the inverseExpression contains a delimited identifier in double quotes, "deptId", whereas no quotes are required (and the RDF parser crashes).

I'm going to fix this … thanks!

> 
> * R2RMLTC0016e: the IRI built from binary data does not seem to be base64 encoded ("<data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P5//6/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==>"), contrary to the DirectGraphTC0016 test.

I'm going to fix this … thanks!

> 
> Thank you very much for any help you can give me. I am available for any other questions.

Thank you very much for your comments.  I'm going to fix the misprints in the TCs.

We will discuss the rest of the issues within the Working Group and get back to you. 


Thanks again

Boris


[1] http://mccarthy.dia.fi.upm.es/rdb2rdf/tc/th/rdb2rdf-th.zip
[2] http://www.w3.org/2001/sw/rdb2rdf/r2rml/#dfn-sql-query

> 
> Best regards,
> 
> Julien Homo
> -- 
> ____________________________________________
> 
> Julien Homo ( @julien_homo) - Antidot 
> Development Engineer / Technical consultant
> Mail : jhomo@antidot.net - Phone : (+33 / 0)4.72.76.31.45
> ____________________________________________
> 

Received on Wednesday, 11 April 2012 11:19:51 UTC