Re: Addressing ISSUE-47 (invalid and relative IRIs)

On 11 Jul 2011, at 19:27, David McNeil wrote:
> Since the spec is generally agnostic about batch vs on-demand processing, why would the data validation be different? What I mean is why would we mandate that the validation be performed at startup?

Otherwise we penalise users for the negligence of sloppy mapping authors. The onus should really be on mapping authors to ensure that their mappings work with all of the data. By query time it's generally too late. I think throwing an error at the user because the mapping author didn't properly escape their data is the least helpful option.

> For what it's worth, the applications that our product is being used in generally preclude checking all of the data for errors on startup.

How do you ensure that mappings work over the entire contents of the database? You generally can't ensure that just by looking at the mapping and the schema. Do you just hope that the mapping author was careful?

> This would mean that our usages would generally be non-conforming. Which seems a bit concerning, but maybe there would be no practical consequence of this?

Hard to say. I don't understand the use case where it's ok to fail at query time.

> Maybe I just don't understand the details of the encoding well enough, but I am not sure what reduced set of characters makes sense.

Me neither. This would require further research.

> For example if the data in a column is "people/John%20Smith", would the reduced encoding pass this data through unchanged? 

Yes, that would be the intention. "John Smith", on the other hand, should still become "John%20Smith" when encoded with the reduced set of characters.
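
As a rough sketch of the behaviour I have in mind (not a proposal for the spec text): the function name reduced_encode and the choice of safe characters below are my own assumptions, since the actual reduced set is still undecided. The idea is to leave existing percent-encoded triplets and "/" alone, and percent-encode everything else outside the unreserved characters.

```python
import re
from urllib.parse import quote

# Matches an already percent-encoded triplet such as "%20".
_PCT_TRIPLET = re.compile(r"(%[0-9A-Fa-f]{2})")


def reduced_encode(value: str) -> str:
    """Hypothetical 'reduced' encoding: keep existing %XX triplets and '/'
    unchanged, percent-encode the rest. The safe set is an assumption."""
    # Splitting on a capturing group keeps the triplets in the result,
    # at the odd positions of the list.
    parts = _PCT_TRIPLET.split(value)
    encoded = []
    for index, part in enumerate(parts):
        if index % 2 == 1:
            # An existing triplet: pass it through untouched.
            encoded.append(part)
        else:
            # Ordinary text: encode everything except unreserved chars and "/".
            encoded.append(quote(part, safe="/"))
    return "".join(encoded)


print(reduced_encode("people/John%20Smith"))
print(reduced_encode("John Smith"))
```

Note that under this sketch a stray "%" that is not followed by two hex digits still gets encoded as "%25", whereas a literal "%20" in the data cannot be told apart from an already-encoded space. That ambiguity is part of what would need further research.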

Best,
Richard

Received on Tuesday, 12 July 2011 12:21:11 UTC