Re: SPARQL Protocol and Unicode characters

Eric, the draft I put together a week or so ago handled this...
or rather, I found the relevant part of XForms that handles
it...

"Encoding Queries
A service target IRI[IRI] t is combined with a SPARQL query string qs
using the conventional encoding, using query as the parameter name. The
exact algorithm for encoding the query is given in section 11.6
Serialization as application/x-www-form-urlencoded of [XForms] and the
algorithm for combining the encoded parameters with t is given in
section 11.9 The get Submit Method.
"
 -- http://www.w3.org/2001/sw/DataAccess/prot26

On Thu, Feb 03, 2005 at 04:10:58PM +0100, Arjohn Kampman wrote:
> Dear all,
> 
> The SPARQL Protocol as described at [1] suggests that SPARQL queries are 
> going to be sent over the line as simple www-urlencoded strings. I would
> like to point out that we have tried this approach in Sesame and that it
> fails to handle multi-byte characters properly [2]. Main reason for this
> is that the used %xx patterns cannot encode any byte values larger than
> 255.

The algorithm is, in a nutshell:
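  1. encode in UTF-8
  2. %xx-escape each resulting byte

So it does handle all Unicode characters: each %xx only ever carries
one UTF-8 byte, so no value larger than 255 needs to be encoded.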
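
A minimal sketch of those two steps in Python, in case it helps. The
function names, the example.org service address, and the choice of
which ASCII characters to leave unescaped are my assumptions for
illustration, not text from the draft or from XForms. In particular,
XForms section 11.6 has its own rules for spaces and reserved
characters; this sketch simply escapes every byte outside a
conservative safe set.

```python
# Characters left as-is in this sketch (an assumption; see XForms 11.6
# for the exact rules that apply to application/x-www-form-urlencoded).
SAFE_BYTES = (
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    b"abcdefghijklmnopqrstuvwxyz"
    b"0123456789-_.~"
)


def percent_encode_utf8(text: str) -> str:
    """Encode text as UTF-8, then %xx-escape each byte outside SAFE_BYTES."""
    raw = text.encode("utf-8")  # step 1: characters -> bytes (0..255 each)
    pieces = []
    for byte in raw:            # step 2: one %XX per byte that needs escaping
        if byte in SAFE_BYTES:
            pieces.append(chr(byte))
        else:
            pieces.append("%{:02X}".format(byte))
    return "".join(pieces)


def build_query_url(service: str, query: str) -> str:
    """Hypothetical helper: append the encoded query as the 'query' parameter."""
    return service + "?query=" + percent_encode_utf8(query)


if __name__ == "__main__":
    # U+20AC (euro sign) is code point 8364, well above 255, yet it
    # becomes three ordinary bytes and therefore three %XX escapes.
    print(percent_encode_utf8("\u20ac"))
    print(build_query_url("http://example.org/sparql", "caf\u00e9"))
```

The point of the sketch is only that the character's code point is
never written into a %xx escape directly; the UTF-8 bytes are.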

-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
D3C2 887B 0F92 6005 C541  0875 0F91 96DE 6E52 C29E

Received on Monday, 14 March 2005 18:28:58 UTC