Detailed review of R2RML test cases, parts 0000-0006

Hi Boris,

I've started to read the test cases in detail. I have to say there are a lot of them, which is great :-) There are still a couple of problems though. I've read everything from 0000 to 0006a so far, and will do the rest as I find the time. I'll start with some general comments (some of them already mentioned in the call today) and then comment on individual test cases.


== General comments ==

1. As said in the call, many test cases use rr:constant where it's unnecessary. Example:

  rr:predicateMap [ rr:constant ex:firstName ]; 

This can be written as:

  rr:predicate ex:firstName

which is better practice. Just have one test case that exercises the long form and use the short one by default.


2. Similar comment for the use of rr:datatype on all integer columns. This actually overrides the natural mapping, which is not good practice most of the time. Just drop the rr:datatype. It will produce the same result, as seen here:

http://www.w3.org/TR/r2rml/#natural-mapping
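
To illustrate, here is a rough sketch of what I mean. The table "Student", the column "ID" and the ex: namespace are placeholders I made up, not taken from a particular test case:

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/ns#> .

# Hypothetical triples map; "Student" and "ID" are placeholder names.
ex:StudentMap
    rr:logicalTable [ rr:tableName "\"Student\"" ];
    rr:subjectMap [ rr:template "http://example.com/students/{\"ID\"}" ];
    rr:predicateObjectMap [
        rr:predicate ex:id;
        # No rr:datatype here: for an INTEGER column the natural mapping
        # already produces an xsd:integer literal.
        rr:objectMap [ rr:column "\"ID\"" ]
    ] .
```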


3. As said in the call, the base IRI is an additional input into the mapping process, besides the database. It is *not* the value of @base in the mapping file.


4. Test cases that are expected to fail should say so explicitly in the test cases document. Currently they can be recognized only by the lack of specified RDF output. It could say: “Expected result: invalid R2RML mapping” or similar.


5. Personally I find that too many tests produce blank nodes, which is considered bad practice. I'd recommend just having a few that exercise this feature specifically. The default subject map that is used in most test cases should be a simple rr:template, like "students/{\"ID\"}".
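
A default triples map along these lines would do. This is only a sketch; the table and column names and the ex: namespace are assumptions for illustration. Note that the template is relative, so the generated subjects depend on the base IRI that is supplied as input (see point 3):

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/ns#> .

# Hypothetical default mapping; "Student", "ID" and "Name" are placeholders.
ex:StudentMap
    rr:logicalTable [ rr:tableName "\"Student\"" ];
    # Relative template: the generated IRIs are made absolute with the base
    # IRI passed in as input. For example, base IRI http://example.com/base/
    # and ID 10 would give <http://example.com/base/students/10>.
    rr:subjectMap [ rr:template "students/{\"ID\"}" ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "\"Name\"" ]
    ] .
```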


== R2RMLTC0000 ==

Ok


== R2RMLTC0001 ==

Ok


== R2RMLTC0002a ==

This test case violates a SHOULD conformance criterion:
http://www.w3.org/2001/sw/rdb2rdf/r2rml/#dfn-string-template

[[
If a template contains multiple pairs of unescaped curly braces, then any pair SHOULD be separated from the next one by a character or string that does not occur anywhere in the data values of either referenced column.
]]

I recommend replacing the IRI template
    http://example.com/{"ID"}{"Name"}
with this one:
    http://example.com/{"ID"}/{"Name"}
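
To show why the separator matters, here is a small sketch with made-up rows (they are not from the test database). Without a separator, two different rows can end up with the same IRI:

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/ns#> .

# Made-up rows: (ID=1, Name="2Bob") and (ID=12, Name="Bob").
# Template http://example.com/{"ID"}{"Name"} gives
#   http://example.com/12Bob for both rows.
# Template http://example.com/{"ID"}/{"Name"} gives
#   http://example.com/1/2Bob and http://example.com/12/Bob.
ex:StudentMap
    rr:logicalTable [ rr:tableName "\"Student\"" ];
    rr:subjectMap [ rr:template "http://example.com/{\"ID\"}/{\"Name\"}" ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "\"Name\"" ]
    ] .
```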


== R2RMLTC0002b ==

The table name in the rr:sqlQuery needs to be double-quoted. (StudentId is not double-quoted either, but that's fine.)


== R2RMLTC0002c ==

See comment for R2RMLTC0002a. Recommend simplifying the test case to *just* exercise presence of unknown column. Recommend editorial changes – the SQL identifier isn't invalid, it's just undefined.


== R2RMLTC0002d ==

Seems redundant. What is this exercising? Seems to be covered by R2RMLTC0002b.


== R2RMLTC0002e ==

See comment for R2RMLTC0002a. Recommend simplifying the test case to *just* exercise presence of unknown table. Recommend editorial changes – the SQL identifier isn't invalid, it's just undefined.


== R2RMLTC0002f ==

Description says something about schema-qualified names, but there are none in the R2RML?


== R2RMLTC0002g ==

Recommend simpler test case that just exercises invalid SQL query. Recommend making the SQL query more explicitly invalid, e.g., rr:sqlQuery "THIS IS NOT A VALID SQL QUERY"
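
A stripped-down version could look roughly like this. It is only a sketch; the subject template, the column names and the ex: namespace are placeholders of mine:

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/ns#> .

# Hypothetical minimal test mapping: everything except the query is
# kept as plain as possible, so the invalid SQL is the only problem.
ex:InvalidQueryMap
    rr:logicalTable [ rr:sqlQuery "THIS IS NOT A VALID SQL QUERY" ];
    rr:subjectMap [ rr:template "http://example.com/students/{\"ID\"}" ];
    rr:predicateObjectMap [
        rr:predicate ex:name;
        rr:objectMap [ rr:column "\"Name\"" ]
    ] .
```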


== R2RMLTC0002h ==

Recommend simpler test case that just exercises duplicate column name in SELECT clause.


== R2RMLTC0002i ==

Recommend simpler test case that just exercises presence of rr:sqlVersion


== R2RMLTC0002j ==

Recommend simpler test case that just exercises qualified column names in SQL query. See comment for R2RMLTC0002a.


== R2RMLTC0003a ==

Description says “concatenation of two column values” but I can't see where two columns are concatenated in the R2RML. Also see comment for R2RMLTC0002a, use an rr:template that doesn't stick the substitution patterns directly together (or perhaps *just* use "ID" in the template).

Non-existing properties foaf:firstName and foaf:lastName – use ex: prefix instead


== R2RMLTC0003b ==

Recommend simpler test case to only exercise concatenation of columns to produce literal


== R2RMLTC0003c ==

Not sure what this one is exercising, as rr:termType doesn't do anything here; the default is rr:Literal anyway. Perhaps replace the first name and last name maps with a single map that uses
    rr:template "{\"FirstName\"} {\"LastName\"}"; rr:termType rr:Literal;
because this would exercise something new.
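
In context, that could look something like the following sketch. The table name, the subject template and the ex:fullName property are assumptions for illustration:

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/ns#> .

# Hypothetical mapping; "Student" and ex:fullName are placeholder names.
ex:StudentMap
    rr:logicalTable [ rr:tableName "\"Student\"" ];
    rr:subjectMap [ rr:template "http://example.com/students/{\"ID\"}" ];
    rr:predicateObjectMap [
        rr:predicate ex:fullName;
        # Template-valued term maps default to rr:IRI, so the explicit
        # rr:termType rr:Literal actually changes the result here.
        rr:objectMap [
            rr:template "{\"FirstName\"} {\"LastName\"}";
            rr:termType rr:Literal
        ]
    ] .
```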

Non-existing properties foaf:firstName and foaf:lastName – use ex: prefix instead


== R2RMLTC0003d ==

Again this exercises just the defaults. Use rr:termType rr:IRI on a column-valued object map to exercise something new?
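
For example, something like this sketch, where the "Homepage" column and the ex:homepage property are invented for illustration and are not in the current test database:

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix ex: <http://example.com/ns#> .

# Hypothetical mapping; assumes a "Homepage" column that holds IRI strings.
ex:StudentMap
    rr:logicalTable [ rr:tableName "\"Student\"" ];
    rr:subjectMap [ rr:template "http://example.com/students/{\"ID\"}" ];
    rr:predicateObjectMap [
        rr:predicate ex:homepage;
        # Column-valued term maps default to rr:Literal, so rr:termType
        # rr:IRI overrides the default instead of restating it.
        rr:objectMap [ rr:column "\"Homepage\""; rr:termType rr:IRI ]
    ] .
```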


== R2RMLTC0004a ==

Ok


== R2RMLTC0004b ==

The subjects should be _:StudentVenus and _:SportTennis


== R2RMLTC0005a ==

Should produce only four triples – RDF graphs are sets and cannot logically contain the same triple multiple times, see last paragraph in 11.1:
http://www.w3.org/TR/r2rml/#generated-triples

The DDL definition D005-1table3columns3rows2duplicates uses FLOAT but the table above it uses DOUBLE.

See initial comments – drop the rr:datatype and just let the natural mapping take care of the typing. The output datatype should be xsd:double (the natural RDF datatype for any floating-point SQL type). The literals should be "3.0E1"^^xsd:double and "2.0E1"^^xsd:double because that's the canonical form. (Same for the equivalent Direct Mapping test case!)
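
In the expected output, the literals would then look like this. The subjects and the predicate below are placeholders of mine; only the form of the literals matters:

```turtle
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.com/ns#> .

# Placeholder subjects and predicate, not the ones from the test case.
ex:row1 ex:amount "3.0E1"^^xsd:double .
ex:row2 ex:amount "2.0E1"^^xsd:double .
```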


== R2RMLTC0006a ==

What is this exercising? Doesn't seem to test anything that isn't already covered in the 0001 family of tests

Received on Tuesday, 6 March 2012 19:59:47 UTC