- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Thu, 9 Jun 2011 15:47:39 -0700
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: David McNeil <dmcneil@revelytix.com>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
* Richard Cyganiak <richard@cyganiak.de> [2011-06-09 21:08+0100]
> On 9 Jun 2011, at 17:24, Eric Prud'hommeaux wrote:
> > * Richard Cyganiak <richard@cyganiak.de> [2011-06-09 13:43+0100]
> >> Hi David,
> >>
> >> There's a wiki page that talks quite a bit about this.
> >> http://www.w3.org/2001/sw/rdb2rdf/wiki/Identifier_re-use#Entity_alignment
> >>
> >> It also describes D2RQ's capabilities in this regard, which I would like to see added to R2RML.
> >>
> >> In particular, a number of template functions:
> >>
> >> rr:template "http://example.com/people/{NAME|urlencode}";
> >>
> >> and defining mapping tables as part of the mapping.
> >
> > I note the usual tension between readable and reversible.
> > With |urlencode,
> > SELECT ?dob { <http://example.com/people/John%20Smith> <HR#dob> ?dob }
> > can be trivially turned into
> > SELECT HR.dob FROM HR WHERE HR.name="John Smith"
> > , which can exploit an index. Conversely, |urlify's
> > SELECT ?dob { <http://example.com/people/John_Smith> <HR#dob> ?dob }
> > is harder to reliably reverse (unless '_'s in native data are doubled):
> > SELECT HR.dob FROM HR WHERE urlify(HR.name)="http://example.com/people/John_Smith"
>
> Spaces are converted to “_”, underscores and all other non-URI characters in the DB are %-encoded. This is reversible.

ahh, same as urlencode, but using '_' in place of '+'. yep, reversible, but a little hard on non-ascii scripts. i guess such a function can:
  target concatenations with only one field of arbitrary text.
  or predict likely delimiters and escape them.
  or parameterize the escape characters.

> Best,
> Richard

-- 
-ericP
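For illustration only, here is a minimal Python sketch of the reversible scheme described above: spaces become '_', while literal underscores and all other non-URI characters are %-encoded. The function names `urlify` and `deurlify` are hypothetical and are not taken from D2RQ or R2RML; the sketch also assumes UTF-8 for non-ASCII text, which is why such names expand to several %-escapes.

```python
from urllib.parse import quote, unquote


def urlify(value: str) -> str:
    """Encode a column value for use in a URI template (hypothetical helper)."""
    # Percent-encode everything except unreserved characters.
    # Spaces come out as %20; literal '%' comes out as %25.
    encoded = quote(value, safe="")
    # quote() never escapes '_', so escape literal underscores by hand.
    encoded = encoded.replace("_", "%5F")
    # Every remaining '_' slot is free, so use it for spaces.
    return encoded.replace("%20", "_")


def deurlify(segment: str) -> str:
    """Recover the original column value from a urlify'd URI segment."""
    # Any '_' in the segment stands for a space; real underscores are %5F.
    return unquote(segment.replace("_", "%20"))


if __name__ == "__main__":
    for name in ["John Smith", "John_Smith", "50% off", "Zoë"]:
        segment = urlify(name)
        print(name, "->", segment, "->", deurlify(segment))
```

Because `deurlify` recovers the exact column value, a query rewriter could turn the constant IRI back into a plain `HR.name = ...` comparison and still exploit an index, rather than applying the function to every row.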
Received on Thursday, 9 June 2011 22:48:09 UTC