Re: "Safe Mode" processing for XSLT

On 2 June 2015 at 19:14, Florent Georges <fgeorges@fgeorges.org> wrote:

>
>   I guess the general good practice is: "never evaluate code sent to
> you".  There are other ways to provide more flexibility to your
> system, depending on what you want to achieve exactly.



The primary use case I have for user-supplied XSLT is to perform metadata
crosswalks in "Digital Library" software applications (collection
management systems, repositories, etc.). Although some digital library
applications use other mapping languages, XSLT is very widely used in this
role, and for good reasons. A stylesheet can be as simple as you like, but
the full expressive power of the language is there if you need it. Because
the I/O is entirely implicit, a metadata expert can focus exclusively on
the XML representations. This is why I think XSLT is such a popular
extension mechanism for digital library applications.

Typically, digital library applications allow only trusted administrators
to supply XSLT, and this makes a sandbox less crucial than it would
otherwise be. But in shared, hosted environments it would be too risky to
run XSLT outside of some kind of sandbox. I'm reminded of the way that, in
a browser, XSLT and JavaScript (and Java applets, in their day) are
sandboxed by the browser's "same-origin" security policy, and how CORS can
relax that strict policy to allow access to other URIs. I think a similar
approach can work on the server side, if one can restrict I/O and also
provide a way to automatically terminate long-running XSLT.

I'd like to suggest that a future version of the p:xslt step should have
attributes to support sandboxing, such as a timeout (in milliseconds) and a
regex to filter URIs that may be resolved.

<p:xslt timeout="1000" allowed-uri-filter="^(http:|https:).*$">
<!-- no access to file: URIs -->
...
</p:xslt>

<p:xslt timeout="1000"
allowed-uri-filter="^file:///var/cache/xslt-sandbox/.*$">
...
</p:xslt>
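
To make the intended semantics a bit more concrete, here is a rough sketch
in Python (not XProc) of what I imagine the two attributes doing. The names
make_uri_filter and run_with_timeout are hypothetical, invented only for
this illustration, and the file: pattern is written with plain slashes. It
is a sketch under those assumptions, not a description of any existing
processor.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def make_uri_filter(allowed_uri_filter):
    """Return a predicate that accepts only URIs matching the regex.

    Hypothetical helper: mirrors the proposed allowed-uri-filter attribute.
    """
    pattern = re.compile(allowed_uri_filter)
    return lambda uri: pattern.match(uri) is not None


def run_with_timeout(task, timeout_ms):
    """Run task() and raise FutureTimeout if it takes longer than timeout_ms.

    Hypothetical helper: mirrors the proposed timeout attribute. A Python
    thread cannot be forcibly stopped, so this only gives up waiting; a real
    sandbox would have to terminate the transformation itself.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        return pool.submit(task).result(timeout=timeout_ms / 1000.0)
    finally:
        # Do not block on a task that may still be running.
        pool.shutdown(wait=False)


# The two filters from the p:xslt examples above.
web_only = make_uri_filter(r"^(http:|https:).*$")
cache_only = make_uri_filter(r"^file:///var/cache/xslt-sandbox/.*$")
```

A processor would call the filter from whatever hook it uses to resolve
document(), xsl:import and xsl:include, and refuse any URI the predicate
rejects. Note that a plain regex match does not catch "../" segments, so
URIs would need to be normalised before they are tested.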


Received on Wednesday, 3 June 2015 05:36:09 UTC