- From: Jeremy Carroll <jjc@hpl.hp.com>
- Date: Thu, 01 Feb 2007 18:54:15 +0000
- To: public-grddl-comments@w3.org
- CC: "McBride, Brian" <brian.mcbride@hp.com>
It is plausible that the Working Group should regard it as in-scope to
advise implementers as to which URLs should not be dereferenced, even
though such dereferencing is required to implement a GRDDL transform.
Currently my code distinguishes between the following, based on the
purpose and timing of the dereferencing:
- permitted:
URL of initial document
URL of schema or profile document
URL of any required transforms explicitly mentioned in the document
URL of schema or profile of schema or profile document
URL of any required transforms explicitly mentioned in the schema
or profile
URL of any required transforms in the GRDDL result document
(Note: this last URL may be constructed programmatically by untrusted
code that may have maliciously gained access to private data on the
user's machine. It hence poses a greater risk than the previous ones.)
- prohibited:
URL of any document() or doc() instruction in the XSLT
URL of any unparsed-text() instruction (the implementation of the
prohibition is different)
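As a rough illustration of the purpose-based distinction above, here is a
minimal Python sketch. It is not the code referred to in this message; the
names Purpose, PERMITTED, HIGHER_RISK and permitted_by_purpose are
hypothetical, and the sketch only records which purposes fall on which side
of the list.

```python
from enum import Enum, auto


class Purpose(Enum):
    """Why a URL is about to be dereferenced (hypothetical labels)."""
    INITIAL_DOCUMENT = auto()
    SCHEMA_OR_PROFILE = auto()
    DOCUMENT_TRANSFORM = auto()         # transform named in the document
    NESTED_SCHEMA_OR_PROFILE = auto()   # schema/profile of a schema/profile
    PROFILE_TRANSFORM = auto()          # transform named in a schema/profile
    RESULT_TRANSFORM = auto()           # transform named in a GRDDL result
    XSLT_DOCUMENT_CALL = auto()         # document() or doc() inside the XSLT
    XSLT_UNPARSED_TEXT = auto()         # unparsed-text() inside the XSLT


# Purposes listed as permitted in the message above.
PERMITTED = frozenset({
    Purpose.INITIAL_DOCUMENT,
    Purpose.SCHEMA_OR_PROFILE,
    Purpose.DOCUMENT_TRANSFORM,
    Purpose.NESTED_SCHEMA_OR_PROFILE,
    Purpose.PROFILE_TRANSFORM,
    Purpose.RESULT_TRANSFORM,
})

# Permitted, but flagged in the message as a greater risk, because the
# URL may have been constructed by untrusted code.
HIGHER_RISK = frozenset({Purpose.RESULT_TRANSFORM})


def permitted_by_purpose(purpose: Purpose) -> bool:
    """Return True if a dereference for this purpose is allowed."""
    return purpose in PERMITTED
```

A caller would tag each fetch with its purpose before making the request,
and could log or further restrict anything in HIGHER_RISK.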
A different approach would be to permit certain URLs and prohibit
others, based on internal characteristics of the URL: e.g. prohibit at
all points (except perhaps the first) any URL without a hostname (e.g. a
file: URL), and only permit URLs whose hostname comes from some
restricted set (e.g. the hostname from which the initial document was
retrieved, and any hosts explicitly mentioned in that document).
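The URL-based approach could be sketched as follows, again in Python and
again only as an illustration under assumptions. The function names
allowed_hosts and permitted_by_host and the is_initial flag are
hypothetical; the sketch assumes every URL has already been resolved to
absolute form, and it uses only urlsplit from the standard library.

```python
from urllib.parse import urlsplit


def allowed_hosts(initial_url, mentioned_urls):
    """Collect the hostnames of the initial document and of any URLs
    explicitly mentioned in it (the restricted set)."""
    hosts = set()
    for url in (initial_url, *mentioned_urls):
        host = urlsplit(url).hostname  # None when the URL has no host
        if host:
            hosts.add(host.lower())
    return hosts


def permitted_by_host(url, hosts, is_initial=False):
    """Apply the hostname policy to one absolute URL.

    A URL without a hostname (e.g. file:///etc/passwd) is refused,
    except perhaps when it is the initial document itself.
    """
    host = urlsplit(url).hostname
    if host is None:
        return is_initial
    return host.lower() in hosts
```

For example, with an initial document at http://example.org/doc.html, a
transform at http://example.org/t.xsl would pass, while file:///etc/passwd
or a URL on an unmentioned host would be refused. Note that a relative
reference has no hostname either, which is why resolution against the base
URL has to happen before this check.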
Potential attacks could be based on using the ability to access some
URLs from the client, in order to gain access to information that is not
normally available to the attacker, but to which the end-user has
privileged access, and then to pass this information back to the
attacker within another URL.
Another potential attack is to have a document on the web whose GRDDL
transform is set to some transform shipping with some well-known
Operating System or software package. That transform may have access or
rights to local resources that can be exploited with malicious intent.
Jeremy
Received on Thursday, 1 February 2007 18:54:44 UTC