Re: Addressing ISSUE-47 (invalid and relative IRIs)

Richard - Thanks for writing this up. It seems like we are going in a useful
direction.

On Tue, Jul 12, 2011 at 4:06 PM, Richard Cyganiak <richard@cyganiak.de> wrote:

> [[
> A DATA ERROR is a condition of the data in the input database that would
> lead to the generation of an invalid RDF term, such as an invalid IRI or an
> ill-typed literal.
>
> When providing access to the output dataset, an R2RML processor MUST abort
> any operation that requires inspecting or returning an RDF term whose
> generation would give rise to a data error, and report an error to the agent
> invoking the operation. A conforming R2RML processor MAY however allow other
> operations that do not require inspecting or returning these RDF terms, and
> thus MAY provide partial access to an output dataset that contains data
> errors. Nevertheless, an R2RML processor SHOULD report data errors as early
> as possible.
>
> The presence of data errors does not make an R2RML mapping non-conforming.
>
> Informative note: Data errors cannot generally be detected by analyzing the
> table schema of the database, but only by scanning the data in the tables.
> For large and rapidly changing databases, this can be an expensive or even
> impossible operation. Therefore, R2RML processors are allowed to answer
> queries that do not “touch” a data error, and the behavior of such
> operations is well-defined. For the same reason, the conformance of R2RML
> mappings is defined regardless of the presence of data errors.
>

This all sounds good.
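
To check my reading of the "partial access" paragraph, here is a rough Python sketch, not proposed spec text. It assumes a processor that generates terms lazily, one row at a time. The names `DataError`, `make_iri` and `subject_iris`, the `homepage` column, and the sample rows are all made up for illustration, and the regular expression is only a crude stand-in for a real IRI check (it handles absolute IRIs only and ignores base IRI resolution).

```python
import re


class DataError(Exception):
    """Raised when generating an RDF term would yield an invalid term."""


# Crude approximation of an absolute-IRI check (assumption, not RFC 3987):
# a scheme, a colon, then no whitespace and none of a few forbidden characters.
_IRI_RE = re.compile(r'[A-Za-z][A-Za-z0-9+.-]*:[^\s<>"{}|\\^`]*')


def make_iri(value):
    """Return value unchanged if it looks like a valid IRI, else raise."""
    if not _IRI_RE.fullmatch(value):
        raise DataError(f"invalid IRI: {value!r}")
    return value


def subject_iris(rows, column):
    """Lazily generate one IRI per row from a column-valued term map."""
    for row in rows:
        yield make_iri(row[column])


# Hypothetical input data: the third row would give rise to a data error.
rows = [
    {"homepage": "http://example.com/alice"},
    {"homepage": "http://example.com/bob"},
    {"homepage": "not an iri"},
]

gen = subject_iris(rows, "homepage")
first_two = [next(gen), next(gen)]  # these operations never touch the bad row

try:
    next(gen)  # this operation needs the bad term, so it must abort
    aborted = False
except DataError:
    aborted = True
```

The first two operations succeed because they never inspect the offending term; only the third is aborted and reported. That seems to match the MUST and the MAY in the proposed text.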


> An R2RML DATA VALIDATOR is a system that takes as its input an R2RML
> mapping, a base IRI, and a SQL connection to an input database, and checks
> for the presence of data errors. When checking the input database, a data
> validator MUST report any DATA ERRORS that are raised in the process of
> generating the output dataset.
>

It seems to me that providing a DATA VALIDATOR would be optional for an
R2RML implementor.


> If the value generated from a term map with term type “IRI” is not a valid
> IRI, then a DATA ERROR is generated.
>
> If the value generated from a term map with term type “Literal” is a typed
> literal whose datatype IRI is a supported datatype, and whose lexical form
> is not in the lexical space of the datatype, then a DATA ERROR is generated.
> ]]


I think it would make sense to list all of the errors in one place and then
state that those errors can occur when either the R2RML processor or the
DATA VALIDATOR is running.
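
To make the suggestion concrete, here is a minimal Python sketch of a single error list shared by both components. It is only an illustration under stated assumptions: `DataErrorKind`, `check_term` and `validate` are hypothetical names, the IRI pattern is a rough approximation rather than a full RFC 3987 check, and only two XSD datatypes are treated as "supported".

```python
import re
from enum import Enum


class DataErrorKind(Enum):
    """The one place where all data errors are listed (hypothetical names)."""
    INVALID_IRI = "generated value is not a valid IRI"
    ILL_TYPED_LITERAL = "lexical form is not in the datatype's lexical space"


XSD = "http://www.w3.org/2001/XMLSchema#"

# Rough approximation of an absolute-IRI check (assumption, not RFC 3987).
_IRI_RE = re.compile(r'[A-Za-z][A-Za-z0-9+.-]*:[^\s<>"{}|\\^`]*')

# Lexical spaces for the datatypes this sketch treats as supported.
_LEXICAL = {
    XSD + "integer": re.compile(r"[+-]?[0-9]+"),
    XSD + "boolean": re.compile(r"true|false|1|0"),
}


def check_term(term_type, value, datatype=None):
    """Return the DataErrorKind for a generated term, or None if it is fine."""
    if term_type == "IRI" and not _IRI_RE.fullmatch(value):
        return DataErrorKind.INVALID_IRI
    if term_type == "Literal" and datatype in _LEXICAL:
        if not _LEXICAL[datatype].fullmatch(value):
            return DataErrorKind.ILL_TYPED_LITERAL
    # Literals with an unsupported datatype are not checked at all.
    return None


def validate(terms):
    """Validator side: scan every term and collect every error found."""
    errors = []
    for term in terms:
        kind = check_term(*term)
        if kind is not None:
            errors.append((term, kind))
    return errors
```

A processor would call `check_term` for each term it is about to inspect or return and abort the operation on a non-None result, while a validator would run `validate` over the whole output dataset. Both would then report from the same list of errors.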

-David

Received on Wednesday, 13 July 2011 14:43:09 UTC