Re: DM+R2RML implementation feedback: DM cannot be implemented as an R2RML mapping

Hi Eric,

On 24 Apr 2012, at 14:49, Eric Prud'hommeaux wrote:
>> The justification for the non-standard escaping mechanism in the Direct Mapping was something about being able to abbreviate more URIs as prefixed names in legacy versions of SPARQL and Turtle (it's a non-issue for the most recent drafts of these specs).
> The issue is parsability of the generated IRI. '.'s serve as a separator character between pairs of attribute values, e.g. <table/attr1-value1.attr2-value2>.

Right. I'm (again!) questioning the choice of the period character as a separator.

> How does R2RML enable the construction of IRIs which can be parsed without ambiguity?

It says, about rr:templates:

   If a template contains multiple pairs of unescaped curly braces, then any pair SHOULD be separated from the next one by a character or string that does not occur anywhere in the data values of either referenced column.

(It's a SHOULD because there are use cases where you don't care about unambiguous parseability, e.g., when you know that you only want to dump the DB.)

There's room for improvement in this sentence. When a template generates IRIs, the values inserted into the template are first %-encoded. This means that any single character that is legal in IRIs but is not "IRI-safe" in R2RML's sense is a safe separator, because it will be escaped if it occurs in a value. The period is not such a character: it is IRI-safe, so it is left unescaped in values. The sentence should point that out. In particular, all the RFC 3987 sub-delimiters are safe separators:

   sub-delims     = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="
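
To make the argument concrete, here is a minimal sketch in Python. The helper names expand() and parse() are hypothetical, and urllib's quote() only approximates R2RML's IRI-safe encoding (it also escapes non-ASCII characters, which R2RML leaves alone). That difference doesn't affect the separator question.

```python
from urllib.parse import quote, unquote

# Hypothetical helpers, not taken from any spec or implementation.
# quote(v, safe="") leaves only letters, digits and "-", ".", "_", "~"
# unescaped, which approximates the ASCII part of R2RML's IRI-safe set.

def expand(values, sep):
    """Join %-encoded column values with the given separator."""
    return sep.join(quote(v, safe="") for v in values)

def parse(key, sep):
    """Split on the separator and undo the %-encoding."""
    return [unquote(part) for part in key.split(sep)]

values = ["a.b", "c;d"]

# "." is left unescaped in values, so the round trip is ambiguous.
dotted = expand(values, ".")
print(dotted, parse(dotted, "."))

# ";" is escaped inside values, so the round trip is exact.
semi = expand(values, ";")
print(semi, parse(semi, ";"))
```

With "." the first value is split in two on the way back; with ";" (or any other sub-delim) the original values are recovered.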

I suggest that one of them should be chosen as a field separator in the DM, to ensure that DMs can be expressed in R2RML, and R2RML processors can be used as DM implementations without requiring extensions.

Is the choice of "." as separator in the DM still sensible, given recent changes to the grammar of Turtle and SPARQL 1.1?


> The related issue is 67 <>.
>> I do not believe that usability concerns only applying to old versions of these specs are more important than maintaining compatibility with RFC 3987 and between the different specs produced by this WG and hence propose that the DM uses an escaping mechanism that is compatible with R2RML and RFC 3987.
>> Best,
>> Richard
> -- 
> -ericP

Received on Tuesday, 24 April 2012 20:55:19 UTC