- From: Richard Cyganiak <richard@cyganiak.de>
- Date: Thu, 7 Jul 2011 18:57:11 +0100
- To: David McNeil <dmcneil@revelytix.com>
- Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
Hi David,

Thanks for the comments.

On 7 Jul 2011, at 16:26, David McNeil wrote:
> On Mon, Jul 4, 2011 at 6:14 PM, Richard Cyganiak <richard@cyganiak.de> wrote:
>> 3. Invalid IRIs (e.g., anything containing spaces and so on) are skipped, and if any triple would include such an IRI then that triple is skipped
>
> This worries me. I am uncomfortable with rows of data silently disappearing based on their contents.

The question is, what's the alternative? The only other workable option I can think of is to make this an error. I don't like that option much, because this error cannot be determined based on the schema, and cannot be detected at startup time but only at query time. So it would be a “runtime error”, a new concept. I'd prefer if the validity of a mapping depended only on the schema of a DB, but not on its contents, so that you could do validation at startup time.

(Thinking more about it, another reasonable option might be to make it a blank node.)

We'll run into similar questions with typed literals. What should happen when the mapping produces "aaa"^^xsd:integer? I'm tempted to say that these should also simply not be generated.

>> 4. rr:template is changed so that it %-encodes most characters. This means that rr:column "person/{NAME}" will work even if the name contains spaces, the result will be "http://base.uri/person/Alice%20Smith"
>
> A couple of thoughts on this:
>
> * I don't think we yet had group consensus that R2RML should perform automatic %-encoding.

That's right. I can't remember the issue being discussed?

> * I think a consequence of what you are proposing is that the following two R2RML snippets would behave differently with respect to encoding:
>     rr:subject [ rr:column "Name" ]
>     rr:subject [ rr:template "{Name}" ]
>
> I think it would be less surprising to users if these two constructs had the same behavior.

You are right, it's a bit surprising, but I think it's easy enough to learn and remember that "{name}" performs %-encoding when generating IRIs while "name" doesn't.

I don't see how we could reasonably make both behave the same. Both can't be %-encoding, because then rr:column would %-encode already valid IRIs, making them invalid. If both are non-%-encoding, then we need some other mechanism for %-encoding, so we'd need a proposal for that. And to be honest, I believe that rr:column will be rarely used for generating IRIs.

If you'd prefer to see some other behaviour here, could you please open an issue (or make a change proposal)?

Best,
Richard
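
For illustration, here is a minimal Python sketch of the distinction discussed in this message. It is not R2RML and does not describe any implementation's behaviour. The names `expand_template` and `column_iri`, the `HOMEPAGE` column, and the crude validity check are all assumptions made up for the example; only the base IRI and the "Alice Smith" case come from the thread. It shows a template-style term %-encoding column values, a column-style term using the value as-is and skipping it when it cannot be an IRI, and what would go wrong if column values were %-encoded too.

```python
import re
from urllib.parse import quote

BASE = "http://base.uri/"  # base IRI from the example in the thread

# Crude stand-in for IRI validation (assumption): rejects whitespace and a few
# characters that are never allowed unencoded. Not a full IRI grammar check.
_FORBIDDEN = re.compile(r'[\s<>"{}|\\^`]')


def expand_template(template, row):
    """Template-style term: %-encode each column value, then substitute it."""
    def substitute(match):
        return quote(str(row[match.group(1)]), safe="")
    return BASE + re.sub(r"\{([^{}]+)\}", substitute, template)


def column_iri(column, row):
    """Column-style term: use the value unchanged; None means 'skip this term'."""
    value = str(row[column])
    return None if _FORBIDDEN.search(value) else value


# Hypothetical row; HOMEPAGE already holds a valid, already-encoded IRI.
row = {"NAME": "Alice Smith", "HOMEPAGE": "http://example.com/a%20b"}

print(expand_template("person/{NAME}", row))  # space becomes %20
print(column_iri("HOMEPAGE", row))            # valid IRI passes through untouched
print(column_iri("NAME", row))                # contains a space, so it is skipped
print(quote(row["HOMEPAGE"], safe=""))        # encoding a valid IRI again mangles it
```

The last line is the reason given above for not %-encoding in rr:column: the scheme separator, the slashes and the existing `%20` all get escaped, so a value that was a usable IRI no longer is.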
Received on Thursday, 7 July 2011 17:57:39 UTC