- From: Bijan Parsia <bparsia@cs.man.ac.uk>
- Date: Wed, 28 Jan 2009 11:37:17 +0000
- To: Bijan Parsia <bparsia@cs.manchester.ac.uk>
- Cc: Alan Ruttenberg <alanruttenberg@gmail.com>, Jonathan Rees <jar@creativecommons.org>, W3C OWL Working Group <public-owl-wg@w3.org>
I was thinking about the problems of sending code to a server and how people might not want to send their data outside, and I realized that this might be considered a feature. If it's clear that random GRDDL is not trustworthy, there will be less blind trusting.

I just noticed that most of the exploits I was thinking of are warned about in:
    http://www.w3.org/TR/grddl/#sec

Note that I don't think there are any reasonably secure GRDDL agents out there. Consider the Jena description:
    http://jena.sourceforge.net/grddl/security-conformance.html

I see no security description of glean.py:
    http://www.w3.org/2003/g/glean.py
but the key lines:

    def doXSLT(xform, inf, outf, params = {}):
        args = ["xsltproc", "--novalid", "-o", outf]
        for k in params.keys():
            args.extend(("--stringparam", k, params[k]))
        spawn(XSLTPROC, args + [xform, inf])

do no feature disabling except validation, and explicitly permit writing to the filesystem ("-o").

GRDDL.py has no security description (in the source code):
    http://www.w3.org/2001/sw/grddl-wg/td/GRDDL.py
It's not obvious to me that the processor is secured in any way:

    result = processor.runNode(self.dom, self.url, ignorePis=1)

I see a "zone" argument, so there might be something configurable, but not *inside* the XSLT processor, AFAICT. It seems to write to a file in the normal, fairly obvious way.

Raptor:
    http://librdf.org/raptor/api/parser-grddl.html
allows setting a timeout for URI retrieval but not, AFAICT, for the XSLT processing. I don't know if there is anything inside. I didn't see any security discussion.

BTW, I do not mean this *AT ALL* as criticism of these libraries or their authors. AFAICT, the software is perfectly fine and does not claim to be more than it is. The Jena security discussion bends over backwards to be cautious and to give appropriate warnings. However, these facts make it unclear how heavily we should weight concerns about exposing sensitive data here.
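For comparison with the glean.py call quoted above, here is a minimal sketch of what a more restrictive xsltproc invocation might look like. It is not taken from any of the agents discussed; the function names (build_xsltproc_args, run_xslt), the depth limit of 200, and the 10 second timeout are all assumptions chosen for illustration. It relies on the xsltproc options --nonet, --nowrite, --nomkdir and --maxdepth, reads the result from standard output instead of passing "-o", and uses the subprocess timeout to bound the XSLT processing itself.

```python
import subprocess


def build_xsltproc_args(xform, inf, params=None):
    """Build an xsltproc command line with the optional features turned off.

    Hypothetical helper, not part of glean.py or any GRDDL agent.
    """
    args = [
        "xsltproc",
        "--novalid",          # skip DTD loading, as in the quoted glean.py call
        "--nonet",            # refuse to fetch DTDs or entities over the network
        "--nowrite",          # refuse to write to any file or resource
        "--nomkdir",          # refuse to create directories
        "--maxdepth", "200",  # cap template recursion (value is an assumption)
    ]
    for key, value in sorted((params or {}).items()):
        args.extend(("--stringparam", key, value))
    # No "-o": the result goes to stdout, so the transform never names an output file.
    return args + [xform, inf]


def run_xslt(xform, inf, params=None, timeout_s=10):
    """Run the transform and return its output as bytes.

    Raises subprocess.TimeoutExpired if the transform runs longer than timeout_s,
    and subprocess.CalledProcessError if xsltproc exits with an error.
    """
    done = subprocess.run(
        build_xsltproc_args(xform, inf, params),
        capture_output=True,
        timeout=timeout_s,
        check=True,
    )
    return done.stdout
```

This only narrows what the transform can do through xsltproc itself; it does not sanitize the file names passed in, and it says nothing about whether the transform's output should be trusted.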
If we *are* concerned, the simplest thing to do is not to provide an auto-downloadable transform at all. If the W3C would like to host a version of the OWL API based converter, that is, I think, fine. If software wants to use that at its own, explicit, risk, that's up to them. (A la HTML editors using the W3C HTML validator.)

There's another class of exploits, of course, based on spoofing the W3C site. (I'm unclear whether including an explicit transform attribute overrides the namespace, e.g., grddl:transformation="glean_title.xsl http://www.w3.org/2001/sw/grddl-wg/td/getAuthor.xsl".) Are W3C hosted GRDDL transforms cryptographically signed?

(All this is additional to intended or accidental DDoS attacks against the W3C, either for downloading or for processing.)

As far as I know, there's no requirement on GRDDL agents to notify a user when they download or use unaudited code.

Cheers,
Bijan.
Received on Wednesday, 28 January 2009 11:33:52 UTC