On relative IRIs and rr:template

In the call we discussed what happens when templates produce relative IRIs. So imagine a table:

  CREATE TABLE STUDENT (NAME VARCHAR(50));
  INSERT INTO STUDENT VALUES ('http://company.com/Alice');
  INSERT INTO STUDENT VALUES ('Bob');
  INSERT INTO STUDENT VALUES ('Bob/Charles');
  INSERT INTO STUDENT VALUES ('path/../Danny');
  INSERT INTO STUDENT VALUES ('Emily Smith');

With a mapping that has:

  rr:template "{NAME}"; rr:termType rr:IRI;

and we run this with a base IRI of

  <http://example.com/base/>

For the five values, this would produce the following IRIs:

  <http://example.com/base/http%3A%2F%2Fcompany.com%2FAlice>
  <http://example.com/base/Bob>
  <http://example.com/base/Bob%2FCharles>
  <http://example.com/base/path%2F..%2FDanny>
  <http://example.com/base/Emily%20Smith>

Note that %-encoding is applied first, and base IRI resolution then happens if the result is not a valid absolute IRI (in other words, always, because the encoding escapes ":" and "/").


On the other hand, if we had a mapping that uses rr:column instead of rr:template:

  rr:column "NAME"; rr:termType rr:IRI;

then we'd get:

  <http://company.com/Alice>
  <http://example.com/base/Bob>
  <http://example.com/base/Bob/Charles>
  <http://example.com/base/path/../Danny>
  (data error)

So, no %-encoding is applied to the column value in this case, and the base IRI is *not* prepended if the column already contains a valid absolute IRI. But since we don't do %-encoding, the result can be an invalid IRI (as with the space in "Emily Smith"), leading to data errors whenever any of these values would have to be returned in a query.

Both for rr:template and rr:column, base IRI resolution is simple concatenation with the base IRI. Stuff like "../" doesn't get resolved.
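
To make the "../" point concrete, here is a small Python sketch contrasting plain concatenation (what the mapping does) with generic relative-reference resolution (what it does not do). Python's urllib.parse.urljoin is only a stand-in for a generic resolver here; it is not part of R2RML.

```python
from urllib.parse import urljoin

BASE = "http://example.com/base/"
value = "path/../Danny"

# R2RML term generation: the base IRI is simply prepended.
concatenated = BASE + value

# A generic resolver, by contrast, removes the dot segments.
resolved = urljoin(BASE, value)

print(concatenated)
print(resolved)
```

The first line printed keeps "path/../" intact, matching the rr:column result above; the second collapses it.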

Maybe the examples above are useful for creating additional test cases?


The details are in 7.3 and 11.2.

Section 7.3 explains the “template value” that you get from instantiating a template:

[[
The template value of the term map for a given logical table row is determined as follows:

1. Let result be the template string
2. For each pair of unescaped curly braces in result:
   1. Let value be the data value of the column whose name is enclosed in the curly braces
   2. If value is NULL, then return NULL
   3. Let value be the natural RDF lexical form corresponding to value
   4. If the term type is rr:IRI, then replace the pair of
      curly braces with an IRI-safe version of value; otherwise,
      replace the pair of curly braces with value
3. Return result
]]
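
The steps quoted above can be sketched roughly as follows in Python. This is an illustration under assumptions, not a reference implementation: the names iri_safe and template_value are made up, str() stands in for the natural RDF lexical form, escaped curly braces are not handled, and urllib.parse.quote is only an ASCII approximation of "IRI-safe" (it also encodes non-ASCII characters that the spec would leave alone).

```python
import re
from urllib.parse import quote

# Matches one {COLUMN} placeholder; escaped braces are not handled (assumption).
PLACEHOLDER = re.compile(r"\{([^{}]+)\}")

def iri_safe(value):
    # Keep letters, digits and "-._~"; percent-encode everything else.
    return quote(value, safe="")

def template_value(template, row, is_iri=True):
    pieces = []
    pos = 0
    for match in PLACEHOLDER.finditer(template):
        value = row[match.group(1)]
        if value is None:          # step 2.2: NULL makes the whole result NULL
            return None
        lexical = str(value)       # step 2.3, simplified
        pieces.append(template[pos:match.start()])
        pieces.append(iri_safe(lexical) if is_iri else lexical)  # step 2.4
        pos = match.end()
    pieces.append(template[pos:])
    return "".join(pieces)

for name in ["http://company.com/Alice", "Bob/Charles", "Emily Smith"]:
    print(template_value("{NAME}", {"NAME": name}))
```

The printed values are the template values only; the base IRI is added later by the term generation rules in 11.2.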

Note that it uses the “IRI-safe” version when producing IRIs. This refers to %-encoding.

Section 11.2 explains how RDF terms are generated from the values produced in term maps:
http://www.w3.org/2001/sw/rdb2rdf/r2rml/#generated-rdf-term

The relevant details here are:

[[
• If the term map is a column-valued term map, then the generated RDF term
  is determined by applying the term generation rules to its column value.
• If the term map is a template-valued term map, then the generated RDF term
  is determined by applying the term generation rules to its template value.
]]

And:

[[
[I]f the term map's term type is rr:IRI:
   1. Let value be the natural RDF lexical form corresponding to value.
   2. If value is a valid absolute IRI [RFC3987], then return an
      IRI generated from value.
   3. Otherwise, prepend value with the base IRI. If the result is a
      valid absolute IRI [RFC3987], then return an IRI generated from
      the result.
   4. Otherwise, raise a data error.
]]
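
As a rough sketch of these four steps in Python: the function name generate_iri, the DataError class, and especially the regular expression are assumptions for illustration. The regex only checks for a scheme and rejects a few characters that can never appear in an IRI; it is much looser than the real RFC 3987 grammar.

```python
import re

BASE_IRI = "http://example.com/base/"

# Crude stand-in for "valid absolute IRI": scheme, ":", then no whitespace
# and none of the characters < > " { } | \ ^ `
ABSOLUTE_IRI = re.compile(r'[A-Za-z][A-Za-z0-9+.\-]*:[^\s<>"{}|\\^`]*')

class DataError(Exception):
    pass

def generate_iri(value, base=BASE_IRI):
    # Step 2: already a valid absolute IRI, so use it unchanged.
    if ABSOLUTE_IRI.fullmatch(value):
        return value
    # Step 3: otherwise prepend the base IRI (plain concatenation).
    candidate = base + value
    if ABSOLUTE_IRI.fullmatch(candidate):
        return candidate
    # Step 4: still not valid.
    raise DataError("not a valid IRI: " + candidate)

for name in ["http://company.com/Alice", "Bob", "Bob/Charles", "path/../Danny"]:
    print(generate_iri(name))
```

Calling generate_iri("Emily Smith") raises DataError under this check, because the space survives concatenation; that corresponds to the "(data error)" line in the rr:column example.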

Best,
Richard

Received on Tuesday, 6 March 2012 18:44:54 UTC