- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sun, 13 Mar 2005 22:18:07 -0500
- To: Arjohn Kampman <arjohn.kampman@aduna.biz>
- Cc: public-rdf-dawg-comments@w3.org
- Message-ID: <20050314031807.GA10717@w3.org>
Clarification and notes -- this response was not considered by the DAWG:
On Thu, Feb 03, 2005 at 04:10:58PM +0100, Arjohn Kampman wrote:
>
> Dear all,
>
> The SPARQL Protocol as described at [1] suggests that SPARQL queries are
> going to be sent over the line as simple www-urlencoded strings. I would
> like to point out that we have tried this approach in Sesame and that it
> fails to handle multi-byte characters properly [2]. Main reason for this
> is that the used %xx patterns cannot encode any byte values larger than
> 255.
>
> In Sesame, we "solved" this issue by switching to multipart/form-data
> encoded POST requests.
I presume you are using the charset parameter
[[ [2388]
Each part of a multipart/form-data is supposed to have a content-
type. In the case where a field element is text, the charset
parameter for the text indicates the character encoding used.
]]
and that the clients tend to encode the characters in charsets that
the servers tend to understand.
I phrase it this way because I'm looking at the trade-offs between:
- transaction-specified encoding.
- transaction-specified encoding with mandatory support for at
least one common encoding.
- fixed-encoding (eg. utf-8), the only one used by the protocol.
What encodings do your RDQL servers support?
noting related RFCs ('cause I need to write it down somewhere):
[2045] MIME Part One: Format of Internet Message Bodies:
transfer encodings interacting with character encodings.
[2046] MIME Part Two: Media Types
4.1.2. Charset Parameter
5.1. Multipart Media Type
[2388] Returning Values from Forms: multipart/form-data
4.5 Charset of text in form data
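As a rough illustration of the transaction-specified option, here is a
minimal sketch of a multipart/form-data body in which the text part
declares its own charset, as [2388] describes. The field name "query",
the boundary string, and the helper names are assumptions made for the
example; nothing here is taken from Sesame or from the protocol draft.

```python
def build_form_part(boundary: str, name: str, text: str, charset: str) -> bytes:
    """Build one text part whose Content-Type carries a charset parameter."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{name}"\r\n'
        f"Content-Type: text/plain; charset={charset}\r\n"
        "\r\n"
    ).encode("ascii")
    # The part body is the text encoded in the declared charset.
    return head + text.encode(charset) + b"\r\n"


def build_form_body(boundary: str, fields: list[tuple[str, str]], charset: str) -> bytes:
    """Concatenate the parts and append the closing boundary."""
    parts = b"".join(build_form_part(boundary, n, v, charset) for n, v in fields)
    return parts + f"--{boundary}--\r\n".encode("ascii")


# Hypothetical field name and boundary, for illustration only.
body = build_form_body("XyZ", [("query", 'SELECT ?s WHERE { ?s ?p "é" }')], "utf-8")
```

The server reads the charset from each part's header, so it only works
when the client picks an encoding the server actually implements.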
> Main drawback of this solution is that we use
> POST-requests all the time, even when GET-requests would be more
> natural.
The DAWG's Use Cases and Requirements [UC&R] has Addressable Query
Results as a design objective. This was motivated by a TAG finding [GET].
[[
"Use GET if:
* The interaction is more like a question (i.e., it is a safe
operation such as a query, read operation, or lookup)."
]]
> Another option would be to enforce an UTF-8 characters-to-
> octets mapping to the query before adding it as a parameter value.
We could also include the charset in the GET, but I'm hoping that the
simplest approach (which I take to be fixed-encoding) will suffice.
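A minimal sketch of the fixed-encoding approach, assuming UTF-8 is the
one encoding: the query is first mapped to UTF-8 octets, and each octet
is then written as a %xx escape, so a multi-byte character becomes
several escapes. The endpoint URL, the parameter name "query", and the
function names are assumptions made for the example.

```python
from urllib.parse import quote, unquote


def encode_query_param(query: str) -> str:
    """Map characters to UTF-8 octets, then percent-encode every octet."""
    return quote(query, safe="", encoding="utf-8")


def decode_query_param(value: str) -> str:
    """Reverse the mapping; reject octet sequences that are not valid UTF-8."""
    return unquote(value, encoding="utf-8", errors="strict")


# Hypothetical query and endpoint, for illustration only.
query = 'SELECT ?s WHERE { ?s ?p "日本語" }'
encoded = encode_query_param(query)
url = "http://example.org/sparql?query=" + encoded

# Both ends agree on UTF-8, so the round trip is lossless.
assert decode_query_param(encoded) == query
```

Because the encoding is fixed, the GET URL needs no charset parameter
and remains addressable as a plain link.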
> Hope you can use this feedback to improve the protocol.
>
> Regards,
>
> Arjohn Kampman
>
>
> [1] http://www.w3.org/TR/rdf-sparql-protocol/
> [2] http://www.openrdf.org/issues/secure/ViewIssue.jspa?key=SES-84
[2045] http://www.faqs.org/rfcs/rfc2045.html
[2046] http://www.faqs.org/rfcs/rfc2046.html
[2388] http://www.faqs.org/rfcs/rfc2388.html
[UC&R] http://www.w3.org/TR/2004/WD-rdf-dawg-uc-20041012/
[GET] http://www.w3.org/2001/tag/doc/whenToUseGet.html
--
-eric
office: +81.466.49.1170 W3C, Keio Research Institute at SFC,
Shonan Fujisawa Campus, Keio University,
5322 Endo, Fujisawa, Kanagawa 252-8520
JAPAN
+1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell: +81.90.6533.3882
(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Monday, 14 March 2005 03:18:07 UTC