- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Thu, 9 Jun 2011 15:47:39 -0700
- To: Richard Cyganiak <richard@cyganiak.de>
- Cc: David McNeil <dmcneil@revelytix.com>, RDB2RDF WG <public-rdb2rdf-wg@w3.org>
* Richard Cyganiak <richard@cyganiak.de> [2011-06-09 21:08+0100]
> On 9 Jun 2011, at 17:24, Eric Prud'hommeaux wrote:
> > * Richard Cyganiak <richard@cyganiak.de> [2011-06-09 13:43+0100]
> >> Hi David,
> >>
> >> There's a wiki page that talks quite a bit about this.
> >> http://www.w3.org/2001/sw/rdb2rdf/wiki/Identifier_re-use#Entity_alignment
> >>
> >> It also describes D2RQ's capabilities in this regard, which I would like to see added to R2RML.
> >>
> >> In particular, a number of template functions:
> >>
> >> rr:template "http://example.com/people/{NAME|urlencode}";
> >>
> >> and defining mapping tables as part of the mapping.
> >
> > I note the usual tension between readable and reversible.
> > With |urlencode,
> > SELECT ?dob { <http://example.com/people/John%20Smith> <HR#dob> ?dob }
> > can be trivially turned into
> > SELECT HR.dob FROM HR WHERE HR.name="John Smith"
> > , which can exploit an index. Conversely, |urlify's
> > SELECT ?dob { <http://example.com/people/John_Smith> <HR#dob> ?dob }
> > is harder to reliably reverse (unless '_'s in native data are doubled):
> > SELECT HR.dob FROM HR WHERE urlify(HR.name)="http://example.com/people/John_Smith"
>
> Spaces are converted to “_”, underscores and all other non-URI characters in the DB are %-encoded. This is reversible.
ahh, same as urlencode, but using '_' in place of '+'.
yep, reversible, but a little hard on non-ascii scripts.
i guess such a function can:
target concatenations with only one field of arbitrary text.
or
predict likely delimiters and escape them.
or
parameterize the escape characters.
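
A rough Python sketch of the scheme Richard describes (spaces become '_', literal underscores and all other non-URI characters are %-encoded), with the space stand-in parameterized as in the last option above. The names urlify/unurlify and the space_char parameter are made up for illustration; this is not D2RQ's or R2RML's actual API, and it assumes values are encoded as UTF-8.

```python
from urllib.parse import quote, unquote

# Punctuation that quote() leaves unescaped; only these can stand in for a space here.
UNRESERVED_PUNCT = "_.-~"


def urlify(value: str, space_char: str = "_") -> str:
    """Encode one column value for use inside a URI template (hypothetical helper)."""
    if len(space_char) != 1 or space_char not in UNRESERVED_PUNCT:
        raise ValueError("space_char must be one of: " + UNRESERVED_PUNCT)
    # Percent-encode everything except letters, digits and "_.-~".
    encoded = quote(value, safe="")
    # Escape literal occurrences of the stand-in so it can only ever mean "space".
    encoded = encoded.replace(space_char, "%{:02X}".format(ord(space_char)))
    # A "%" in the output always starts an escape, so "%20" can only come from a space.
    return encoded.replace("%20", space_char)


def unurlify(segment: str, space_char: str = "_") -> str:
    """Reverse urlify(): recover the original column value from a URI segment."""
    return unquote(segment.replace(space_char, "%20"))


if __name__ == "__main__":
    print(urlify("John Smith"))      # John_Smith
    print(urlify("John_Smith"))      # John%5FSmith
    print(unurlify("John_Smith"))    # John Smith
    print(urlify("José"))            # Jos%C3%A9
```

Because the decoded value is a plain string, a query translator could compare it directly against HR.name rather than applying the function to every row, which is what keeps an index usable. The last call shows the cost for non-ascii scripts: every such character expands to several %-escapes.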
> Best,
> Richard
--
-ericP
Received on Thursday, 9 June 2011 22:48:09 UTC