Re: Security concerns for N3 + built-ins? from Gregg Kellogg on 2022-04-12 (public-n3-dev@w3.org from April 2022)

From: Gregg Kellogg <gregg@greggkellogg.net>
Date: Tue, 12 Apr 2022 08:09:51 -1000
To: William Van Woensel <william.vanwoensel@gmail.com>
Cc: Patrick Hochstenbach <Patrick.Hochstenbach@ugent.be>, public-n3-dev@w3.org, Jos De Roo <Jos.DeRoo@ugent.be>
Message-Id: <169AAD82-756C-4583-AB66-EA7A2A9EE8F4@greggkellogg.net>
On Apr 12, 2022, at 7:09 AM, William Van Woensel <william.vanwoensel@gmail.com> wrote:
> 
> Hi Patrick
> 
> Thanks for your question; these are very valid concerns with third-party N3 documents.
> 
> AFAIK, unless the N3 reasoner offers a feature for arbitrary code execution in the underlying implementation language (e.g., Prolog, Java), I don’t think third-party documents would be able to execute arbitrary code (if I understand “arbitrary” correctly here). (Of course, barring cases where implementation bugs could open the door for this.) For instance, I believe Eye has an e:derive feature to directly interpret Prolog predicates, which may open up this particular can of worms. You’d have to ask Jos De Roo for more about how to disallow this feature (and, e.g., disable builtins that load remote content). The jen3 reasoner (experimental) has a builtins file where all supported builtins and their input constraints are listed; one can easily edit that file to include only the desired builtins, or to put additional constraints on their input.
> 
> But indeed, a ‘strict’ mode could be recommended by the N3 spec that only allows standard builtins; and a ‘sandbox’ mode to disallow access to networked/file resources, as you suggest.

There are still attack vectors. A “man in the middle” could intercept the load from log:semantics and return arbitrary N3, which would then be reasoned over in the source container. Or some improper code could potentially be used to create a variable which, when used with an operator that loads remote N3, could introduce something unwanted.

One way for the spec to deal with this securely might be something similar to the JSON-LD document loader, which can be used to whitelist or otherwise restrict which external documents can be accessed. But this would require defining some kind of API, which we haven’t done.
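
As a rough illustration of the document-loader idea, here is a minimal Python sketch. It is not part of any N3 spec or reasoner API: the names AllowlistLoader, BlockedDocumentError, fetch and max_loads are hypothetical, and the assumption is that a reasoner would route every log:semantics / log:content dereference through load(). Restricting schemes to https narrows, but does not remove, the man-in-the-middle concern.

```python
from urllib.parse import urlsplit


class BlockedDocumentError(Exception):
    """Raised when the loader policy rejects a document IRI."""


class AllowlistLoader:
    """Hypothetical loader that only dereferences approved IRIs.

    `fetch` is a caller-supplied function (IRI -> document text); this
    sketch performs no network access itself.
    """

    def __init__(self, allowed_hosts, fetch,
                 allowed_schemes=("https",), max_loads=10):
        self.allowed_hosts = {h.lower() for h in allowed_hosts}
        self.allowed_schemes = {s.lower() for s in allowed_schemes}
        self.fetch = fetch
        self.max_loads = max_loads  # caps chained loads from fetched N3
        self.loads = 0

    def check(self, iri):
        """Raise BlockedDocumentError unless scheme and host are allowed."""
        parts = urlsplit(iri)
        if parts.scheme.lower() not in self.allowed_schemes:
            raise BlockedDocumentError(f"scheme not allowed: {iri}")
        host = (parts.hostname or "").lower()
        if host not in self.allowed_hosts:
            raise BlockedDocumentError(f"host not allowed: {iri}")

    def load(self, iri):
        """Check the policy and the load budget, then fetch the document."""
        self.check(iri)
        if self.loads >= self.max_loads:
            raise BlockedDocumentError("load budget exhausted")
        self.loads += 1
        return self.fetch(iri)
```

With such a loader, a request for file:///etc/passwd or for a host outside the list is refused before anything is fetched, and the load budget bounds how many further documents a fetched document can pull in.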

At some point, N3 may require a review by security experts. 

Gregg

> Regards,
> 
> William
> 
>> On Apr 12, 2022, at 11:55 AM, Patrick Hochstenbach <Patrick.Hochstenbach@ugent.be> wrote:
>> 
>> Hi William
>>  
>> What I mean is a way to run any N3 processor in a very strict mode that includes:
>>  
>> - Only built-ins as provided by the spec
>> - In extremis, disallowing N3 rules having access to networked/file resources
>>  
>> In this way, when my servers process N3 documents from third-party authors, the documents are restricted to the N3 data in memory and have no capability to execute arbitrary code.
>>  
>> Patrick
>>  
>> From: William Van Woensel <william.vanwoensel@gmail.com>
>> Date: Tuesday, 12 April 2022 at 15:39
>> To: Patrick Hochstenbach <Patrick.Hochstenbach@UGent.be>
>> Cc: public-n3-dev@w3.org <public-n3-dev@w3.org>
>> Subject: Re: Security concerns for N3 + built-ins?
>> 
>> Hi Patrick
>>  
>> Before answering, I’m unsure what you mean by executing N3 code in isolation?
>>  
>>  
>> William
>> 
>> 
>> On Apr 12, 2022, at 2:12 AM, Patrick Hochstenbach <Patrick.Hochstenbach@UGent.be> wrote:
>>  
>> Thank you William, this is very helpful and also gives me some references to study. The EYE reasoner provides an option to execute N3 in isolation. I was wondering if this is just an implementation feature or something that is, by design, a consideration for any implementation of the N3 spec (or whether some metadata could be made available about which built-ins have these non-isolated capabilities)?
>>  
>> Patrick
>> From: William Van Woensel <william.vanwoensel@gmail.com>
>> Sent: 11 April 2022 16:56
>> To: Patrick Hochstenbach <Patrick.Hochstenbach@UGent.be>
>> Cc: public-n3-dev@w3.org <public-n3-dev@w3.org>
>> Subject: Re: Security concerns for N3 + built-ins?
>>  
>> Hi Patrick,
>>  
>> Re arbitrary programming abilities, both the jen3 and Eye reasoners allow plugging in custom built-ins - really these can do whatever the underlying programming language allows (Java or Prolog, respectively). Of course, these custom builtins would not be part of the spec - a third-party reasoner implements them at its own risk. Standard built-ins (i.e., part of the spec) have a very specific and well-delineated purpose (e.g., math:sum, log:includes, ...) and should not allow executing arbitrary code.
>>  
>> Re talking to the outside world, the only standard builtins that come to mind are log:semantics and log:content (see here for a description) as they are able to load (and parse) content from a URL. If privacy-sensitive data was somehow encoded in this URL (e.g., personal ID) then that would be leaked to the outside. However, there is no risk of (for instance) leaking variable values within a rule that uses these builtins, since no distributed reasoning takes place: the content is simply downloaded and then used in the rule. 
>>  
>> (That said, log:conclusion allows reasoning over this retrieved content; if the content itself contains a log:semantics directive, then one can imagine situations where a bunch of requests are sent and data is downloaded.)
>>  
>>  
>> HTH,
>>  
>> William
>> 
>> 
>> On Apr 11, 2022, at 4:09 AM, Patrick Hochstenbach <Patrick.Hochstenbach@UGent.be> wrote:
>>  
>> Hello all,
>>  
>> I could use some help getting more insight into the Notation3 specs + built-ins, regarding arbitrary programming capabilities and possible side effects that Notation3 implementations should allow for when implementing the specs.
>>  
>> Are N3 implementations that follow the specs isolated systems, or do they need to talk to the external world (e.g., accessing external resources)?
>>  
>> I am wondering about this because N3 is quite attractive to execute near protected resources (e.g., in the security context of a Solid pod). If I allowed executing arbitrary third-party-authored Notation3 documents, would there be a risk of information leaking out to the world (Notation3 accessing more information than I intend it to)?
>>  
>> It is not clear to me which side effects (including dereferencing URIs) must be supported by implementations.
>>  
>> BR
>> Patrick
>
Received on Tuesday, 12 April 2022 18:11:11 UTC