Re: Security issue in SPARQL string escaping in SPARQL grammar from Steve Harris on 2011-12-06 (public-rdf-dawg-comments@w3.org from December 2011)

From: Steve Harris <steve.harris@garlik.com>
Date: Tue, 6 Dec 2011 10:27:28 +0000
To: Paul Gearon <gearon@ieee.org>
Cc: Richard Cyganiak <richard@cyganiak.de>, public-rdf-dawg-comments@w3.org
Message-Id: <8940BA64-9998-43F2-84FD-38CAE5AF5951@garlik.com>
On 2011-12-05, at 21:42, Paul Gearon wrote:

> On Mon, Aug 15, 2011 at 6:24 AM, Richard Cyganiak <richard@cyganiak.de> wrote:
>> In http://www.w3.org/TR/2011/WD-sparql11-query-20110512/#grammar it says that the SPARQL WG
>> considers the following change:
>> 
>>> • The escape processing model for \u escapes changes to be an additional escape form like \" or \t, not
>>> a replacement done before grammar parsing.
>> 
>> I strongly support this change.
> 
> Thank you for saying this. Positive and negative feedback are both useful.
> 
> 
>> With the addition of SPARQL UPDATE, the current design becomes a serious security risk.
>> 
>> Consider the following query, which deletes the contents of a store:
>> 
>>  PREFIX : <> DELETE { ?s ?p ?o } #> SELECT { ?s ?p ?o }
>> 
>> This potentially harmful query can be obfuscated using string escapes, resulting in a harmless-looking
>> query:
>> 
>>  PREFIX : <\u003E\u0020\u0044\u0045\u004C\u0045\u0054\u0045\u0020\u007B\u0020\u003F\u0073\u0020\u003F\u0070\u0020\u003F\u006F\u0020\u007D\u0020\u0023> SELECT { ?s ?p ?o }
>> 
>> The risk is that a) users can be tricked into running harmful queries, and b) software that uses
>> heuristics to detect queries with potential security impact will be less likely to work.
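
To make the obfuscation concrete, here is a minimal Python sketch that resolves \uXXXX and \UXXXXXXXX codepoint escapes the way a SPARQL 1.0 processor does before grammar parsing. The function name decode_codepoint_escapes is made up for illustration; it is not taken from any SPARQL implementation.

```python
import re

# Matches \uXXXX (4 hex digits) or \UXXXXXXXX (8 hex digits).
_CODEPOINT_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})")


def decode_codepoint_escapes(query: str) -> str:
    """Replace every codepoint escape with the character it denotes.

    Hypothetical helper: it mimics the 'replace before parsing' model
    and does nothing else (no grammar awareness at all).
    """
    def _replace(match: re.Match) -> str:
        hex_digits = match.group(1) or match.group(2)
        return chr(int(hex_digits, 16))

    return _CODEPOINT_ESCAPE.sub(_replace, query)


# The harmless-looking query from the message (raw string, so Python
# itself leaves the backslashes alone).
obfuscated = (
    r"PREFIX : <\u003E\u0020\u0044\u0045\u004C\u0045\u0054\u0045\u0020"
    r"\u007B\u0020\u003F\u0073\u0020\u003F\u0070\u0020\u003F\u006F\u0020"
    r"\u007D\u0020\u0023> SELECT { ?s ?p ?o }"
)

print(decode_codepoint_escapes(obfuscated))
# PREFIX : <> DELETE { ?s ?p ?o } #> SELECT { ?s ?p ?o }
```

Any tool that inspects the raw text without this step sees only a PREFIX declaration followed by a SELECT.
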
>> 
>> This may have been ok in SPARQL 1.0, but with the addition of SPARQL UPDATE this is an
>> unacceptable risk.
>> 
>> I am surprised that the security issues arising from obfuscation through string escaping are not stated
>> in the Security Considerations sections of SPARQL Query and SPARQL Update.
> 
> The SPARQL query language and SPARQL update language are separate
> languages. While they share a lot, the top level production for query
> does not include update commands and the top level production for
> update does not include query.
> 
> Therefore it is not a security consideration for SPARQL Query because
> DELETE, INSERT etc are not part of the language at all.

[this is Garlik's view, not an official response]

This is a common, but extremely dangerous misconception.

Security considerations aren't just about changing the contents of the database; there is also the issue of being able to make the SPARQL endpoint perform requests on behalf of the attacker.

SERVICE makes this obvious/easy in SPARQL 1.1, but even 1.0's FROM is a possible source of attacks.

For example:

SELECT *
FROM <http://host1/something/only/DMZ/can/access>
FROM <http://host2/something/huge>
FROM <http://host3/some/REST/service>
FROM <http://host4/...>
FROM <http://host5/...>
FROM <http://host6/...>
WHERE { ?x ?y ?z }

This will cause some SPARQL endpoints to issue 6 HTTP requests *from inside the host's security zone*. That is both an escalation attack (1 request triggers 6) and an indirection attack.

If clients requested the FROM'd documents themselves and sent the data over, it would not be an issue, but that's not how it's specified.

The easiest thing to do with this is a DoS, but multiple indirection attacks have been used to breach US gov't secure systems (not SPARQL I hasten to add), and they're very hard to protect against.

This is why 4store will not perform any indirect requests unless the user explicitly enables the feature.
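
As a rough illustration of a deny-by-default policy (not 4store's actual code; the names FROM_IRI, indirect_targets and allows_query are invented for this sketch), an endpoint could refuse any query whose FROM / FROM NAMED clauses point at a host that has not been explicitly enabled. The regular expression is only a text heuristic: a real implementation would look at the parsed query, since in SPARQL 1.0 codepoint escapes are resolved before parsing and can hide keywords from a pattern match on the raw text.

```python
import re
from urllib.parse import urlparse

# Naive pattern for FROM <iri> and FROM NAMED <iri>; an assumption for
# illustration only, a parser would be the robust way to find these.
FROM_IRI = re.compile(r"\bFROM\s+(?:NAMED\s+)?<([^>]*)>", re.IGNORECASE)


def indirect_targets(query: str) -> list[str]:
    """Return the IRIs the endpoint would be asked to fetch."""
    return FROM_IRI.findall(query)


def allows_query(query: str, allowed_hosts: frozenset[str] = frozenset()) -> bool:
    """Deny by default: every fetched host must be explicitly enabled."""
    return all(
        urlparse(iri).hostname in allowed_hosts
        for iri in indirect_targets(query)
    )


query = """
SELECT *
FROM <http://host1/something/only/DMZ/can/access>
FROM <http://host2/something/huge>
FROM <http://host3/some/REST/service>
WHERE { ?x ?y ?z }
"""

print(len(indirect_targets(query)))   # 3
print(allows_query(query))            # False
```

With the default empty allowed_hosts, a query with no FROM clause passes and any query that would trigger an outbound request is rejected.
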

- Steve

> Security is one reason for making this distinction. A complete and
> compliant implementation of SPARQL Query offered at an endpoint will
> reject updates (whether this escape change is made or not) because
> they do not parse as queries. A service is free to offer both at the
> same endpoint but it is the service's responsibility and having two
> languages makes that clear. Common practice is to have different
> endpoints for query and update, which can have different security
> setups.
> 
> SPARQL Update does not return any results except the HTTP status code.
> 
> 
>> The WG also considers the following change:
>> 
>>> • As part of the changes to the escape processing model for \u escapes, additional characters (e.g.
>>> "=", ",") would be allowed, in \u escaped form, in prefixed names.
> 
> The RDF-WG has resolved to add character escapes to prefix names for
> characters that are allowed in IRIs but not currently in the local part
> of a prefix name.
> 
> http://www.w3.org/2011/rdf-wg/meeting/2011-11-30#resolution_1
> 
> The set of characters is ~.-!$&'()*+,;=:/?#@%_
> 
> The SPARQL-WG has decided to also add this (as an "at-risk" feature)
> to SPARQL 1.1, also to have unescaped %-encoded sequences, and to
> leave the unicode escape processing model as it is in SPARQL 1.0.
> While the use of unicode escape sequences is still possible, the WG
> believes it will not be commonly used. Authors do not need to use
> unicode escape sequences to get characters into the local part of a
> prefix name.
> 
> 
>> I oppose this change, as there is no use case for it. Prefixed names are a convenience for authors to
>> make long IRIs easier to write and read. Escapes like \u003D and \u002C are neither easy to write nor
>> easy to read, so they defeat the purpose of prefixed names. IRIs that include such characters just have
>> to be written as absolute or relative IRIs.
>> 
>> Best,
>> Richard
> 
> We would be grateful if you would acknowledge that your comment has
> been answered by sending a reply to this mailing list.
> 
> Andy and Paul (on behalf of the SPARQL WG)
> 

-- 
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 535 7233 VAT # 849 0517 11
Registered office: Thames House, Portsmouth Road, Esher, Surrey, KT10 9AD
Received on Tuesday, 6 December 2011 10:28:16 UTC