Security issue in SPARQL string escaping in SPARQL grammar

The grammar section of the SPARQL 1.1 Query working draft (http://www.w3.org/TR/2011/WD-sparql11-query-20110512/#grammar) says that the SPARQL WG is considering the following change:

> • The escape processing model for \u escapes changes to be an additional escape for like \" or \t, not a replacement done before grammar parsing.

I strongly support this change.

With the addition of SPARQL UPDATE, the current design, in which \u escapes are replaced before the grammar is applied, becomes a serious security risk.

Consider the following query, which deletes the contents of a store:

  PREFIX : <> DELETE { ?s ?p ?o } #> SELECT { ?s ?p ?o }

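This potentially harmful query can be obfuscated using \u escapes, resulting in a harmless-looking query. Because the escapes are replaced before parsing, the escaped ">" closes the IRI early, and the escaped "#" turns the rest of the line, including the visible SELECT, into a comment: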

  PREFIX : <\u003E\u0020\u0044\u0045\u004C\u0045\u0054\u0045\u0020\u007B\u0020\u003F\u0073\u0020\u003F\u0070\u0020\u003F\u006F\u0020\u007D\u0020\u0023> SELECT { ?s ?p ?o }

The risk is twofold: a) users can be tricked into running harmful queries, and b) software that uses heuristics to detect queries with potential security impact becomes less reliable, because it has to apply the same escape replacement before it can inspect anything.
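
To make both points concrete, here is a minimal Python sketch. It assumes the pre-parse replacement model described in the draft. The names decode_codepoint_escapes and looks_like_update are hypothetical helpers written for this mail and are not taken from any SPARQL implementation. The keyword check is a deliberately naive stand-in for whatever heuristics real software might use.

```python
import re

# Matches \uXXXX and \UXXXXXXXX codepoint escapes.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")

# Naive keyword heuristic (assumption: a filter that scans the raw query text).
_UPDATE_KEYWORDS = re.compile(
    r"\b(DELETE|INSERT|DROP|CLEAR|LOAD|CREATE)\b", re.IGNORECASE
)


def decode_codepoint_escapes(query: str) -> str:
    """Replace every codepoint escape, as done before grammar parsing."""
    return _ESCAPE.sub(
        lambda m: chr(int(m.group(1) or m.group(2), 16)), query
    )


def looks_like_update(query: str) -> bool:
    """Return True if the text contains an update keyword."""
    return _UPDATE_KEYWORDS.search(query) is not None


# The obfuscated query from this mail, as a raw string so that Python
# itself does not interpret the \u sequences.
obfuscated = (
    r"PREFIX : <\u003E\u0020\u0044\u0045\u004C\u0045\u0054\u0045"
    r"\u0020\u007B\u0020\u003F\u0073\u0020\u003F\u0070\u0020\u003F"
    r"\u006F\u0020\u007D\u0020\u0023> SELECT { ?s ?p ?o }"
)

decoded = decode_codepoint_escapes(obfuscated)

print(looks_like_update(obfuscated))  # heuristic applied to the text as written
print(decoded)                        # what the parser actually sees
print(looks_like_update(decoded))     # heuristic applied after replacement
```

A filter that inspects the query as written finds no update keyword. Only after replacing the escapes does the DELETE become visible, so any such filter has to replicate the escape processing exactly to be of any use.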

This may have been acceptable in SPARQL 1.0, which could only read data, but with the addition of SPARQL UPDATE it is an unacceptable risk.

I am surprised that the security issues arising from obfuscation through string escaping are not stated in the Security Considerations sections of SPARQL Query and SPARQL Update.

The WG is also considering the following change:

> • As part of the changes to the escape processing model for \u escapes, additional characters (e.g. "=", ",") would be allowed, in \u escaped form, in prefixed names.

I oppose this change, as there is no use case for it. Prefixed names are a convenience for authors to make long IRIs easier to write and read. Escapes like \u003D and \u002C are neither easy to write nor easy to read, so they defeat the purpose of prefixed names. IRIs that include such characters can simply be written out as absolute or relative IRIs.

Best,
Richard

Received on Monday, 15 August 2011 10:25:07 UTC