Re: PLUS SIGN character in value of a pkey column

From: Richard Cyganiak <richard@cyganiak.de>
Date: Wed, 2 Nov 2011 14:54:23 +0000
Cc: Public-Rdb2rdf-Wg <public-rdb2rdf-wg@w3.org>, public-rdb2rdf-comments@w3.org
Message-Id: <390C26B3-4281-41D9-98CE-C4612F486ECD@cyganiak.de>
To: Souripriya Das <souripriya.das@oracle.com>
On 1 Nov 2011, at 18:33, Souripriya Das wrote:
> In Section 3 [1] of Direct Mapping LC Working draft (reproduced below), do we need to replace PLUS SIGN character in the value of a key column with its percent encoding?
> 
> ---------------
> Definition percent-encode: (a subset of HTML5 form dataset encoding):
> 
>    * Replace each PERCENT SIGN character ('%', U+0025) with the string "%25".
>    * For table names, replace each NUMBER SIGN character ('#', U+0023) with the string "%23".
>    * For table names, replace each SOLIDUS character ('/', U+002f) with the string "%2f".
>    * For attribute names, replace each HYPHEN-MINUS character ('-', U+003d) with the string "%3D".
>    * For attribute values, replace each FULL STOP character ('.', U+002e) with the string "%2E".
>    * Replace each SPACE character (U+0020) with the PLUS SIGN character (+, U+002B).
> -----------------
> 
> [1] http://www.w3.org/TR/rdb-direct-mapping/#definition

Hm, this definition has at least two problems:

1. It is lossy because, as you correctly note, the strings " " and "+" end up with the same encoded representation, making it impossible to reconstruct the original string.

2. It doesn't escape many characters that are forbidden in IRIs, making the results potentially violate the IRI (and RDF) specs.
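Both problems can be reproduced with a few lines of Python. This is only a sketch: `percent_encode_value` is a hypothetical helper, not part of any spec or library. It assumes the quoted rules are applied in the order listed and that only the rules covering attribute values are relevant.

```python
def percent_encode_value(value: str) -> str:
    """Apply the quoted draft rules that cover attribute values.

    Hypothetical helper for illustration; the rule order is an assumption.
    """
    # Escape '%' first so the escapes introduced below are not re-encoded.
    value = value.replace("%", "%25")
    # Rule for attribute values: FULL STOP becomes "%2E".
    value = value.replace(".", "%2E")
    # Rule for all strings: SPACE becomes PLUS SIGN; an existing '+' is left alone.
    value = value.replace(" ", "+")
    return value


# Problem 1: two different key values collapse to the same encoded form.
print(percent_encode_value("a b"))
print(percent_encode_value("a+b"))

# Problem 2: characters that are not allowed in IRIs pass through unchanged.
print(percent_encode_value("<x>"))
```

Because a literal "+" is never escaped, a decoder that sees "+" cannot tell whether the original value contained a space or a plus sign.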

To me it's also not clear why HTML form-encoding is used here instead of %-encoding as defined in RFC 3986. In other words, why are space characters encoded as + and not as %20? Form-encoding would clearly be appropriate if the URIs had the form ...?foo=this&bar=that, but since this is not the case, normal %-encoding seems to make more sense.
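For comparison, here is a sketch of what RFC 3986 style %-encoding would give, using Python's standard `urllib.parse.quote`. The wrapper name `rfc3986_encode` and the choice of `safe=""` (escape everything except unreserved characters) are assumptions made for the illustration, not something the draft prescribes.

```python
from urllib.parse import quote, quote_plus


def rfc3986_encode(value: str) -> str:
    """Percent-encode everything except RFC 3986 unreserved characters.

    Hypothetical wrapper; safe="" also escapes "/", which quote() keeps by default.
    """
    return quote(value, safe="")


# Space and plus sign get distinct encodings, so the mapping is reversible.
print(rfc3986_encode("a b"))
print(rfc3986_encode("a+b"))

# Full form-encoding turns space into '+' but also escapes a literal '+'.
print(quote_plus("a b"))
print(quote_plus("a+b"))
```

Note that full form-encoding, as implemented by `quote_plus`, is not lossy either, because it escapes a literal "+" as "%2B". The collision comes from the draft using only a subset of the form-encoding rules.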

Best,
Richard
Received on Wednesday, 2 November 2011 14:55:01 GMT