
Re: <iframe doc="">

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 17 Jan 2010 14:09:48 +0100
Message-ID: <4B530C1C.400@gmx.de>
To: Philip Taylor <pjt47@cam.ac.uk>
CC: Lachlan Hunt <lachlan.hunt@lachy.id.au>, Ian Hickson <ian@hixie.ch>, public-html@w3.org
Philip Taylor wrote:
> ...
> Why "especially for the doc proposal"? The ampersand problem seems the 
> same for any markup-in-attribute proposal, and doc has far fewer 
> escaping problems than the data: alternative.
> ...

Hmmm... Really?

> Presumably almost nobody is ever going to write the markup by hand, 
> since the point is to embed untrusted content in a sandbox, and if 
> you're embedding it by hand you can verify the content visually and 
> don't need to sandbox it. So the important thing is how server-side code 
> will do the escaping.

+1

To get these things right (as in "working" and "valid"), it's almost 
certain you won't be hand-authoring this.

> If you have a (Perl) script which does something like
> 
>   print "<iframe sandbox doc=\"$doc\">";
> 
> you'll have to escape with something like
> 
>   s/"/&quot;/g;
> 
> in order to avoid security vulnerabilities, and also with
> 
>   s/&/&amp;/g;
> 
> in order to get correct processing. If you instead had

And also "<" (as "&lt;") when using XHTML, since XML does not allow a literal "<" in attribute values.

>   print "<iframe sandbox src=\"data:text/html;charset=utf-8,$doc\">";
> 
> you'd still just have to escape " for safety; but for correct processing 
> in current browsers you'd have to at least escape & and do
> 
>   s/%/%25/g;
>   s/#/%23/g;
> 
> (are there any others you need?) and for validity I think you'd have to 
> instead do
> 
>   s/([^;\/?:@&=+$,a-zA-Z0-9-_.!~*'()])/join "", map { sprintf "%%%02x", $_ } unpack "C*", encode("utf-8", $1)/eg;
> 
> (if I interpret RFC2397's reference to RFC2396's "urlchar" as actually 
> meaning "uric", and if I haven't made stupid mistakes).

Or you might already have a library that encodes URIs/IRIs.
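
For instance, here is a minimal sketch in Python (rather than the Perl quoted above), assuming the standard library's urllib.parse.quote is an acceptable stand-in for such a library. The helper name data_uri_for_html is made up for illustration:

```python
from urllib.parse import quote


def data_uri_for_html(doc: str) -> str:
    """Build a data: URI carrying `doc` as UTF-8 HTML (hypothetical helper)."""
    # quote() percent-encodes the UTF-8 bytes of every character that is not
    # an ASCII letter, a digit, or one of "_.-~"; safe="" also encodes "/".
    return "data:text/html;charset=utf-8," + quote(doc, safe="")


print(data_uri_for_html('<p>50% & "more" #1</p>'))
```

Since the encoded payload contains no '"', "&" or "#", it should be usable inside a double-quoted src attribute without a further escaping pass.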

> Your server-side script probably already has access to an HTML escape 
> function that will do what's needed for <iframe doc>, and if you have a 
> decent templating system it'll do it automatically. It's no different to 
> any other form of embedding content from the user, so it doesn't seem an 
> unreasonable burden. (Escaping data: correctly is a lot more complex and 
> a lot less likely to be provided as a function in your server environment.)

Even if this is true, that should be relatively simple to fix.
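
As a rough sketch of how small both code paths can be once library functions are available, assuming Python's html.escape and urllib.parse.quote. The names iframe_doc and iframe_data are hypothetical, and the doc attribute is only the proposal under discussion:

```python
from html import escape
from urllib.parse import quote


def iframe_doc(doc: str) -> str:
    # Proposed doc attribute: HTML-escape &, <, > and both quote characters.
    return '<iframe sandbox doc="' + escape(doc, quote=True) + '"></iframe>'


def iframe_data(doc: str) -> str:
    # data: alternative: percent-encode first, then HTML-escape the URI.
    # The second step changes nothing here, because the encoded URI
    # contains no &, <, > or quote characters.
    uri = "data:text/html;charset=utf-8," + quote(doc, safe="")
    return '<iframe sandbox src="' + escape(uri, quote=True) + '"></iframe>'


sample = '<b>a & "b"</b>'
print(iframe_doc(sample))
print(iframe_data(sample))
```

Either way the caller makes a single function call, so the difference lies in which escaping function the server environment happens to provide.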

Given that introducing new attributes to HTML is very expensive 
([1]), making "data:" work for this should really be considered.

BR, Julian

[1] <http://krijnhoetmer.nl/irc-logs/whatwg/20081120#l-184>
Received on Sunday, 17 January 2010 13:10:32 GMT
