
Permitted URLs for transformations/schemas/profiles

From: Jeremy Carroll <jjc@hpl.hp.com>
Date: Thu, 01 Feb 2007 18:54:15 +0000
Message-ID: <45C23757.3050804@hpl.hp.com>
To: public-grddl-comments@w3.org
CC: "McBride, Brian" <brian.mcbride@hp.com>


It is plausible that the Working Group should regard it as in-scope to 
advise implementers as to which URLs should not be dereferenced, even 
though such dereferencing is required to implement a GRDDL transform.

Currently my code distinguishes between the following, based on the 
purpose and timing of the dereferencing:

- permitted:
      URL of initial document
      URL of schema or profile document
      URL of any required transforms explicitly mentioned in the document
      URL of schema or profile of schema or profile document
      URL of any required transforms explicitly mentioned in the
         schema or profile
      URL of any required transforms in the GRDDL result document
         (Note: this last URL may be constructed programmatically by
         untrusted code that may have maliciously gained access to
         private data on the user's machine. It hence poses a greater
         risk than the previous ones.)

- prohibited:
      URL of any document() or doc() instruction in the XSLT
      URL of any unparsed-text() instruction (the implementation of
         the prohibition is different)
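
The purpose-based distinction above could be sketched roughly as follows. This is only an illustration in Python: the names Purpose, PERMITTED, HIGHER_RISK and permitted_by_purpose are hypothetical and are not taken from the implementation described in this message.

```python
from enum import Enum, auto


class Purpose(Enum):
    """Why a URL is about to be dereferenced (hypothetical labels)."""
    INITIAL_DOCUMENT = auto()
    SCHEMA_OR_PROFILE = auto()
    DOCUMENT_TRANSFORM = auto()          # transform named in the document
    SCHEMA_OF_SCHEMA_OR_PROFILE = auto()
    SCHEMA_TRANSFORM = auto()            # transform named in schema/profile
    RESULT_TRANSFORM = auto()            # transform named in a GRDDL result
    XSLT_DOCUMENT_CALL = auto()          # document() or doc() in the XSLT
    XSLT_UNPARSED_TEXT = auto()          # unparsed-text() in the XSLT


# Purposes for which dereferencing is allowed, per the list above.
PERMITTED = frozenset({
    Purpose.INITIAL_DOCUMENT,
    Purpose.SCHEMA_OR_PROFILE,
    Purpose.DOCUMENT_TRANSFORM,
    Purpose.SCHEMA_OF_SCHEMA_OR_PROFILE,
    Purpose.SCHEMA_TRANSFORM,
    Purpose.RESULT_TRANSFORM,
})

# Permitted, but flagged: the URL may have been built by untrusted code.
HIGHER_RISK = frozenset({Purpose.RESULT_TRANSFORM})


def permitted_by_purpose(purpose: Purpose) -> bool:
    """Return True if a URL fetched for this purpose may be dereferenced."""
    return purpose in PERMITTED
```

A caller might additionally log or restrict any fetch whose purpose is in HIGHER_RISK, since that is the case singled out in the note above.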

A different approach would be to permit certain URLs and prohibit 
others, based on internal characteristics of the URL. For example, 
prohibit at all points (except perhaps the first) any URL without a 
hostname (e.g. a file: URL), and permit only URLs whose hostname comes 
from some restricted set (e.g. the hostname from which the initial 
document was retrieved, and any hosts explicitly mentioned in that 
document).
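
A minimal sketch of such a URL-based check, assuming Python's standard urllib.parse module is used for parsing. The function names allowed_hosts and permitted_by_host, and the example URLs, are hypothetical; a real policy would also need to consider redirects, ports and schemes.

```python
from urllib.parse import urlsplit


def allowed_hosts(initial_url, mentioned_urls):
    """Collect the hostnames of the initial document and of any URLs
    explicitly mentioned in it."""
    hosts = set()
    for url in [initial_url, *mentioned_urls]:
        host = urlsplit(url).hostname  # None when the URL has no host
        if host:
            hosts.add(host)
    return hosts


def permitted_by_host(url, hosts, is_initial=False):
    """Apply the hostname rule: a URL without a hostname is refused
    everywhere except (perhaps) as the initial document; any other URL
    must have a hostname in the restricted set."""
    host = urlsplit(url).hostname
    if host is None:
        return is_initial
    return host in hosts


# Usage with made-up URLs:
hosts = allowed_hosts(
    "http://example.org/doc.html",
    ["http://transforms.example.net/t.xsl"],
)
ok = permitted_by_host("http://transforms.example.net/other.xsl", hosts)
blocked = permitted_by_host("file:///etc/passwd", hosts)
```

Here ok is True because the host was explicitly mentioned, while blocked is False because a file: URL of this form carries no hostname.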

Potential attacks could be based on using the ability to access some 
URLs from the client, in order to gain access to information that is not 
normally available to the attacker, but to which the end-user has 
privileged access, and then to pass this information back to the 
attacker within another URL.

Another potential attack is to have a document on the web whose GRDDL 
transform is set to some transform shipping with some well-known 
Operating System or software package. That transform may have access or 
rights to local resources that could be put to malicious use.

Jeremy
Received on Thursday, 1 February 2007 18:54:44 UTC
