Re: Security issue in SPARQL string escaping in SPARQL grammar from Richard Cyganiak on 2011-12-05 (public-rdf-dawg-comments@w3.org from December 2011)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Mon, 5 Dec 2011 23:12:39 +0000
To: Paul Gearon <gearon@ieee.org>
Cc: public-rdf-dawg-comments@w3.org
Message-Id: <702B4C03-0D29-4050-A851-7EC3CEC51188@cyganiak.de>

Hi Paul, hi Andy,

My comment has not been addressed. See below.

On 5 Dec 2011, at 21:42, Paul Gearon wrote:
>>> • The escape processing model for \u escapes changes to be an additional escape like \" or \t, not
>>> a replacement done before grammar parsing.
>> 
>> I strongly support this change.
> 
> Thank you for saying this. Positive and negative feedback are both useful.
> 
> 
>> With the addition of SPARQL UPDATE, the current design becomes a serious security risk.
>> 
>> Consider the following query, which deletes the contents of a store:
>> 
>> PREFIX : <> DELETE { ?s ?p ?o } #> SELECT { ?s ?p ?o }
>> 
>> This potentially harmful query can be obfuscated using string escapes, resulting in a harmless-looking
>> query:
>> 
>> PREFIX : <\u003E\u0020\u0044\u0045\u004C\u0045\u0054\u0045\u0020\u007B\u0020\u003F\u0073\u0020\u003F\u0070\u0020\u003F\u006F\u0020\u007D\u0020\u0023> SELECT { ?s ?p ?o }
>> 
>> The risk is that a) users can be tricked into running harmful queries, and b) software that uses
>> heuristics to detect queries with potential security impact will be less likely to work.
>> 
>> This may have been ok in SPARQL 1.0, but with the addition of SPARQL UPDATE this is an
>> unacceptable risk.
>> 
>> I am surprised that the security issues arising from obfuscation through string escaping are not stated
>> in the Security Considerations sections of SPARQL Query and SPARQL Update.
> 
> The SPARQL query language and SPARQL update language are separate
> languages. While they share a lot, the top level production for query
> does not include update commands and the top level production for
> update does not include query.
> 
> Therefore it is not a security consideration for SPARQL Query because
> DELETE, INSERT etc are not part of the language at all.
> 
> Security is one reason for making this distinction. A complete and
> compliant implementation of SPARQL Query offered at an endpoint will
> reject updates (whether this escape change is made or not) because
> they do not parse as queries. A service is free to offer both at the
> same endpoint but it is the service's responsibility and having two
> languages makes that clear. Common practice is to have different
> endpoints for query and update, which can have different security
> setups.
> 
> SPARQL Update does not return any results except the HTTP status code.

This comment does not address my concern at all. In fact, it misses the point.

The security issue arises not from the possibility of having Query and Update at the same endpoint. The security issue arises from the ability to obfuscate commands by masquerading a harmful command as a harmless one. Users can be tricked into running a harmless-looking command that actually does something malicious. This is a problem regardless of whether the possible harmless-looking commands include SELECT or only update commands.

The SPARQL 1.0 design, if applied to SPARQL Update, makes it possible for an attacker to trick a user into running an update command that corrupts or deletes the store, by making it look like a harmless maintenance command.
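
To make the obfuscation concrete, here is a minimal Python sketch of the SPARQL 1.0 model, in which \u escapes are replaced before grammar parsing. The helper names (preprocess_escapes, looks_like_update) and the keyword list are illustrative assumptions, not taken from any SPARQL implementation, and only four-digit \u escapes are handled.

```python
import re

# Matches a four-digit \uXXXX escape; \UXXXXXXXX is left out for brevity.
UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})")


def preprocess_escapes(text):
    # SPARQL 1.0 style: replace every escape in the raw text, before parsing.
    return UCHAR.sub(lambda m: chr(int(m.group(1), 16)), text)


def looks_like_update(text):
    # Hypothetical naive heuristic: flag text that contains an update keyword.
    pattern = r"\b(DELETE|INSERT|DROP|CLEAR)\b"
    return re.search(pattern, text, re.IGNORECASE) is not None


# The harmless-looking query quoted earlier in this message.
obfuscated = (
    r"PREFIX : <\u003E\u0020\u0044\u0045\u004C\u0045\u0054\u0045\u0020\u007B"
    r"\u0020\u003F\u0073\u0020\u003F\u0070\u0020\u003F\u006F\u0020\u007D"
    r"\u0020\u0023> SELECT { ?s ?p ?o }"
)

decoded = preprocess_escapes(obfuscated)

print(looks_like_update(obfuscated))  # heuristic applied to what the user sees
print(decoded)                        # what the parser sees after replacement
print(looks_like_update(decoded))
```

Under these assumptions the heuristic finds nothing suspicious in the text as written, while the text handed to the parser contains the DELETE and the trailing SELECT has become part of a comment.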

>> The WG also considers the following change:
>> 
>>> • As part of the changes to the escape processing model for \u escapes, additional characters (e.g.
>>> "=", ",") would be allowed, in \u escaped form, in prefixed names.
> 
> The RDF-WG has resolved to add character escapes to prefix names for
> characters that are allowed in IRIs but not currently in the local part
> of a prefix name.
> 
> http://www.w3.org/2011/rdf-wg/meeting/2011-11-30#resolution_1
> 
> The set of characters is ~.-!$&'()*+,;=:/?#@%_

I supported this change.
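
As an illustration of the \-escapes in question, the following Python sketch escapes and unescapes the local part of a prefixed name using exactly the character set quoted above. The function names are hypothetical, and the sketch escapes every character in the set even where a final grammar might allow some of them unescaped.

```python
# Character set quoted from the RDF-WG resolution in this message.
ESCAPABLE = set("~.-!$&'()*+,;=:/?#@%_")


def escape_local(local):
    # Put a backslash in front of every character from the quoted set.
    return "".join("\\" + ch if ch in ESCAPABLE else ch for ch in local)


def unescape_local(escaped):
    # Reverse the escaping; a backslash must be followed by a set member.
    out = []
    i = 0
    while i < len(escaped):
        ch = escaped[i]
        if ch == "\\":
            if i + 1 >= len(escaped) or escaped[i + 1] not in ESCAPABLE:
                raise ValueError("backslash not followed by an escapable character")
            out.append(escaped[i + 1])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


print("ex:" + escape_local("a=b,c"))
```

With this, a local part such as a=b,c is written with two readable backslash escapes rather than \u003D and \u002C.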

> The SPARQL-WG has decided to also add this (as an "at-risk" feature)
> to SPARQL 1.1,

I think this is a good decision.

> also to have unescaped %-encoded sequences,

I have no opinion on this one.

> and to
> leave the unicode escape processing model as it is in SPARQL 1.0.

I think this part is a very poor decision and I ask the SPARQL WG to reconsider.

This creates a major security issue in SPARQL Update.

It also goes against the stated goal of aligning Turtle and SPARQL. It is very unlikely that Turtle is going to change its current escape processing model, where escapes are only allowed in places where they syntactically make sense. For Turtle to adopt the SPARQL 1.0 model would introduce security issues into Turtle as well.

> While the use of unicode escape sequences is still possible, the WG
> believes it will not be commonly used. Authors do not need to use
> unicode escape sequences to get characters into the local part of a
> prefix name.

I agree with these statements. Given that the SPARQL WG expects unicode escape sequences to be rarely used, the WG should restrict the use of these sequences to places where they do not create a security risk and are arguably somewhat useful – that is, places where unicode characters are actually allowed, such as in string literals and IRIs. Just like in Turtle (or XML or JSON or SQL).
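
The difference between the two models can be sketched in Python as follows. The token pattern is a deliberately tiny assumption and nowhere near the real SPARQL or Turtle grammar, and all names are hypothetical. The point is only that when the raw text is tokenized first and escapes are decoded inside IRI tokens afterwards, an escape cannot change where a token ends. The final check uses the characters that the IRI reference production excludes; rejecting them after decoding is an optional stricter choice, not something this message claims any parser does.

```python
import re

UCHAR = re.compile(r"\\u([0-9A-Fa-f]{4})")

# Simplified tokens: IRI reference, string, comment, variable, bare word.
TOKEN = re.compile(
    r'(?P<iri><[^<>\s]*>)'
    r'|(?P<string>"(?:[^"\\]|\\.)*")'
    r'|(?P<comment>#[^\n]*)'
    r'|(?P<var>[?$]\w+)'
    r'|(?P<word>[A-Za-z]+)'
)


def decode(text):
    # Replace four-digit \uXXXX escapes with the characters they denote.
    return UCHAR.sub(lambda m: chr(int(m.group(1), 16)), text)


def keywords(text):
    # Tokenize the text as given and return only the bare words.
    return [m.group("word").upper()
            for m in TOKEN.finditer(text) if m.lastgroup == "word"]


def iri_value(token):
    # Decode escapes inside one IRI token, after tokenizing.
    value = decode(token[1:-1])
    if any(ch in '<>"{}|^`\\' or ord(ch) <= 0x20 for ch in value):
        raise ValueError("escape decodes to a character not allowed in an IRI")
    return value


obfuscated = (
    r"PREFIX : <\u003E\u0020\u0044\u0045\u004C\u0045\u0054\u0045\u0020\u007B"
    r"\u0020\u003F\u0073\u0020\u003F\u0070\u0020\u003F\u006F\u0020\u007D"
    r"\u0020\u0023> SELECT { ?s ?p ?o }"
)

print(keywords(obfuscated))          # escapes handled per token
print(keywords(decode(obfuscated)))  # escapes replaced before tokenizing
```

In this sketch the per-token model sees PREFIX and SELECT, exactly what the reader sees, while replacing escapes before tokenizing turns the same text into PREFIX and DELETE.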

>> I oppose this change, as there is no use case for it. Prefixed names are a convenience for authors to
>> make long IRIs easier to write and read. Escapes like \u003D and \u002C are neither easy to write nor
>> easy to read, so they defeat the purpose of prefixed names. IRIs that include such characters just have
>> to be written as absolute or relative IRIs.

(For the record: With the addition of \-escapes to prefixed names, I no longer oppose unicode escapes in prefixed names, nor would I oppose them anywhere else where unicode characters are allowed in SPARQL. My concern was that the proposed design would encourage authoring of queries that are hard to read, because the only way to prefix-abbreviate certain URIs would have been by using unicode escapes. With the addition of \-escapes, authors now have a convenient and readable way of abbreviating these IRIs to prefixed names. As a result, there is no incentive to use unicode escapes in prefixed names, and hence allowing them does no harm. This entire question is a separate issue from the security question.)

> We would be grateful if you would acknowledge that your comment has
> been answered by sending a reply to this mailing list.

My comment has not been answered. You have not explained why the SPARQL WG believes that the ability to obfuscate commands in SPARQL Update is not a security concern.

Given that the SPARQL WG believes that unicode escapes will be rarely used anyway, there is no excuse for applying the SPARQL 1.0 design to SPARQL Update.

Best,
Richard



> 
> Andy and Paul (on behalf of the SPARQL WG)
Received on Monday, 5 December 2011 23:13:10 UTC