- From: David McNeil <dmcneil@revelytix.com>
- Date: Fri, 8 Jul 2011 08:30:16 -0500
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: RDB2RDF WG <public-rdb2rdf-wg@w3.org>
- Message-ID: <CA+8VvdxBP9p9M1mFrFY0kbhN0C4rbjeNtw1PYjgZzn6rBqeZwA@mail.gmail.com>
Richard - I thought about these questions more and my thoughts are inline below.

On Thu, Jul 7, 2011 at 12:57 PM, Richard Cyganiak <richard@cyganiak.de> wrote:

> On 7 Jul 2011, at 16:26, David McNeil wrote:
> > On Mon, Jul 4, 2011 at 6:14 PM, Richard Cyganiak <richard@cyganiak.de> wrote:
> >> 3. Invalid IRIs (e.g., anything containing spaces and so on) are
> >> skipped, and if any triple would include such an IRI then that
> >> triple is skipped
> >>
> >
> > This worries me. I am uncomfortable with rows of data silently
> > disappearing based on their contents.
>
> The question is, what's the alternative. The only workable other option I
> can think of is to make this an error.

I see two different perspectives on the mapping issue.

1) A relatively casual user wants to expose a relational database as RDF and wants it to "just work". In this mode I can see that it could make sense to silently ignore rows that might cause trouble (e.g. rows with null values, rows that produce IRIs with spaces, or rows that produce text values that claim to be numbers).

2) A software developer is building an application that includes mapping a relational database to RDF. In this mode I think it is very troublesome for rows to silently disappear from the output. This is like software silently swallowing exceptions (typically a bad practice that makes debugging much more difficult).

Because of my background and the product I am working on, I am more concerned with the second use case. Driven by this, I would say that for ISSUE-47 and ISSUE-51 the R2RML implementation should simply generate these triples and pass them downstream. This thinking also causes me to reconsider silently suppressing rows with null values in template expressions.

> >> 4. rr:template is changed so that it %-encodes most characters. This
> >> means that rr:column "person/{NAME}" will work even if the name
> >> contains spaces, the result will be
> >> "http://base.uri/person/Alice%20Smith"
> >>
> >
> > * I think a consequence of what you are proposing is that the following
> > two R2RML snippets would behave differently with respect to encoding:
> >     rr:subject [ rr:column "Name" ]
> >     rr:subject [ rr:template "{Name}" ]
> >
> > I think it would be less surprising to users if these two constructs had
> > the same behavior.
>
> You are right, it's a bit surprising, but I think it's easy enough to learn
> and remember that "{name}" performs %-encoding when generating IRIs while
> "name" doesn't.
>
> I don't see how we could reasonably make both behave the same. Both can't
> be %-encoding, because then rr:column would %-encode already valid IRIs,
> making them invalid. If both are non-%-encoding, then we need some other
> mechanism for %-encoding, so we'd need a proposal for that.

What if we made the %-encoding optional in templates? So, for example, this would not perform %-encoding:

    rr:template "{Name}"

But this would:

    rr:template "{%Name}"

On the last telecon we discussed defining functions for the user to invoke to perform %-encoding, but there was some concern that this would make the mapping more difficult to parse. A solution like {%Name} seems like it would accomplish the objective without making the templates significantly harder to parse.

On a related issue, I think that as we introduce %-encoding on the mapping side, we need to define how the inverse operation is performed in inverse expressions.

-David
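
For illustration only, here is a minimal Python sketch of how the optional {%Name} marker might be expanded. It is not taken from any R2RML draft or implementation: the function name `expand_template`, the dict-based row, the choice to %-encode every reserved character, and the choice to raise on a NULL value are all assumptions made for this example.

```python
import re
from urllib.parse import quote

# Hypothetical placeholder syntax from the proposal above:
#   {Name}  inserts the column value unchanged
#   {%Name} inserts the %-encoded column value
_PLACEHOLDER = re.compile(r"\{(%?)([^{}%]+)\}")


def expand_template(template, row):
    """Expand a template string against one row (dict of column name -> value)."""

    def substitute(match):
        encode_flag, column = match.group(1), match.group(2)
        value = row[column]
        if value is None:
            # Assumption for this sketch: surface NULLs as an error
            # instead of silently dropping the row.
            raise ValueError("NULL value in column %r" % column)
        text = str(value)
        # safe="" means "/" and other reserved characters are encoded too.
        return quote(text, safe="") if encode_flag else text

    return _PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    row = {"NAME": "Alice Smith", "HOMEPAGE": "http://example.com/alice"}
    print(expand_template("http://base.uri/person/{%NAME}", row))
    print(expand_template("{HOMEPAGE}", row))
```

With this shape, a column that already holds a valid IRI can be used as "{HOMEPAGE}" without being mangled, while free-text columns opt in to encoding with "{%NAME}". Parsing stays a single regular expression, which is the point of the proposal. The inverse operation would have to %-decode only the values that were captured from {%...} placeholders.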
Received on Friday, 8 July 2011 13:30:45 UTC