Re: SPARQL Protocol and Unicode characters

From: Dan Connolly <connolly@w3.org>
Date: Mon, 14 Mar 2005 12:28:57 -0600
To: public-rdf-dawg-comments@w3.org, Eric Prud'hommeaux <eric@w3.org>
Cc: Arjohn Kampman <arjohn.kampman@aduna.biz>
Message-Id: <1110824937.18619.193.camel@localhost>

Eric, the draft I put together a week or so ago handled this...
or rather, I found the relevant part of XForms that handles
it...

"Encoding Queries
A service target IRI[IRI] t is combined with a SPARQL query string qs
using the conventional encoding, using query as the parameter name. The
exact algorithm for encoding the query is given in section 11.6
Serialization as application/x-www-form-urlencoded of [XForms] and the
algorithm for combining the encoded parameters with t is given in
section 11.9 The get Submit Method.
"
 -- http://www.w3.org/2001/sw/DataAccess/prot26
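
As a rough illustration of the quoted text (a sketch, not the normative
XForms algorithm): the helper name build_query_url, the endpoint
http://example.org/sparql and the sample query are all made up for this
example, and the sketch assumes t has no query component of its own.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_query_url(t: str, qs: str) -> str:
    """Combine service target t with SPARQL query string qs for a GET request.

    Hypothetical helper: uses "query" as the parameter name and assumes
    t does not already contain a "?".
    """
    # urlencode() percent-encodes the UTF-8 bytes of the value by default
    # and writes spaces as "+", as form encoding does.
    return t + "?" + urlencode({"query": qs})


qs = 'SELECT ?x WHERE { ?x ?p "caf\u00e9" }'
url = build_query_url("http://example.org/sparql", qs)

# Decoding the URL's query component gives back the original query string.
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

The non-ASCII character in the sample query travels as the two escapes
%C3%A9, and decoding on the receiving side restores it.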

On Thu, Feb 03, 2005 at 04:10:58PM +0100, Arjohn Kampman wrote:
> Dear all,
> 
> The SPARQL Protocol as described at [1] suggests that SPARQL queries are 
> going to be sent over the line as simple www-urlencoded strings. I would
> like to point out that we have tried this approach in Sesame and that it
> fails to handle multi-byte characters properly [2]. Main reason for this
> is that the used %xx patterns cannot encode any byte values larger than
> 255.

The algorithm is, in a nutshell:
  1. encode in utf-8
  2. %xx-escape each resulting byte

So it does handle all Unicode characters: each character becomes
one to four UTF-8 bytes, and every byte fits in a %xx escape.
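
The two steps above can be sketched as follows. This is a simplified
illustration, not the full XForms section 11.6 algorithm: the function
name percent_encode is made up here, and the sketch escapes every byte
outside the RFC 3986 unreserved set (so a space comes out as %20 rather
than "+").

```python
import string

# Bytes left unescaped in this sketch: ASCII letters, digits and "-_.~".
UNRESERVED = (string.ascii_letters + string.digits + "-_.~").encode("ascii")


def percent_encode(text: str) -> str:
    """Hypothetical helper: UTF-8 encode, then %XX-escape each byte."""
    # Step 1: encode in UTF-8 (one to four bytes per character).
    data = text.encode("utf-8")
    # Step 2: write each byte outside the unreserved set as %XX.
    return "".join(
        chr(b) if b in UNRESERVED else "%{:02X}".format(b) for b in data
    )


two_bytes = percent_encode("\u00e9")        # U+00E9, e with acute accent
three_bytes = percent_encode("\u20ac")      # U+20AC, euro sign
four_bytes = percent_encode("\U0001d11e")   # U+1D11E, musical G clef
```

Here two_bytes is "%C3%A9", three_bytes is "%E2%82%AC" and four_bytes is
"%F0%9D%84%9E": code points well above 255 are covered because the
escapes apply to UTF-8 bytes, not to the code points themselves.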

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E
Received on Monday, 14 March 2005 18:28:58 GMT