Re: Addressing ISSUE-47 (invalid and relative IRIs)


> Having thought more about this, I think a better option might be to state
> that a conforming R2RML processor MUST check for data errors at startup
> time.

Since the spec is generally agnostic about batch vs. on-demand processing,
why should data validation be treated differently? That is, why would we
mandate that the validation be performed at startup?

For what it's worth, the applications that use our product generally
preclude checking all of the data for errors at startup, so our deployments
would generally be non-conforming. That seems a bit concerning, but maybe it
would have no practical consequence?

> I'm not vehemently opposed to this, although I'd strongly prefer if the
> default were to do the encoding.

That seems fine to me.

> And I'd strongly prefer if the other user choice would be to just *reduce*
> the set of characters that would be encoded, similar to URI templates,
> rather than turning it off completely. I don't see the use case for turning
> it off completely.

Maybe I just don't understand the details of the encoding well enough, but I
am not sure what reduced set of characters would make sense. For example, if
the data in a column is "people/John%20Smith", would the reduced encoding
pass this data through unchanged?
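
To make the question concrete, here is a rough Python sketch of the two
behaviours as I imagine them. The "reduced" set used here (leave "/" and "%"
alone) is purely my assumption for illustration, not something taken from the
draft, and urllib's quote() only stands in for whatever encoding rule the spec
ends up defining.

```python
from urllib.parse import quote

value = "people/John%20Smith"

# Full encoding: escape everything outside the unreserved characters,
# including "/" and the "%" that is already in the data.
full = quote(value, safe="")

# Hypothetical reduced encoding: leave "/" and "%" untouched (an assumed
# reduced set, chosen only to illustrate the question).
reduced = quote(value, safe="/%")

print(full)     # people%2FJohn%2520Smith
print(reduced)  # people/John%20Smith
```

Under that assumed reduced set the value would indeed pass through unchanged,
while the full encoding would escape both the slash and the existing percent
sign.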

> Now if you query for that IRI, the R2RML processor would have to figure out
> that this IRI could have been produced from the rr:template above, and would
> perform the reversal. First it would figure out that the "John%20Smith" part
> of the IRI could have come from {name}. Then it would percent-decode that,
> yielding "John Smith".

Thanks for the explanation, I think I understand now.
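
Just to check my understanding, the reversal described above might be sketched
roughly like this in Python. The template string, the function name
reverse_template, and the rule that a column value never spans a "/" are all
my own assumptions for illustration; a real processor would presumably work
from its mapping definitions rather than a regular expression.

```python
import re
from urllib.parse import unquote

# Hypothetical template, standing in for the rr:template discussed earlier.
TEMPLATE = "http://example.com/people/{name}"

def reverse_template(template, iri):
    """Return {column: decoded value} if the IRI could come from the template."""
    # Splitting on {column} yields literal text at even indexes and
    # column names at odd indexes.
    parts = re.split(r"\{([^{}]+)\}", template)
    literals, columns = parts[0::2], parts[1::2]

    # Build a pattern: literal text must match exactly; each column matches a
    # run of non-"/" characters (assumes "/" in the data was percent-encoded).
    pattern = re.escape(literals[0])
    for literal in literals[1:]:
        pattern += "([^/]*)" + re.escape(literal)

    match = re.fullmatch(pattern, iri)
    if match is None:
        return None  # this IRI cannot have been produced by the template

    # Percent-decode each captured part to recover the column value.
    return {col: unquote(raw) for col, raw in zip(columns, match.groups())}

print(reverse_template(TEMPLATE, "http://example.com/people/John%20Smith"))
# {'name': 'John Smith'}
```

If I have this right, it also suggests why encoding by default helps: when
slashes in the data are always escaped, the part of the IRI that came from
{name} can be picked out without ambiguity.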


Received on Monday, 11 July 2011 18:27:54 UTC