- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Sun, 17 Jan 2010 14:09:48 +0100
- To: Philip Taylor <pjt47@cam.ac.uk>
- CC: Lachlan Hunt <lachlan.hunt@lachy.id.au>, Ian Hickson <ian@hixie.ch>, public-html@w3.org
Philip Taylor wrote:
> ...
> Why "especially for the doc proposal"? The ampersand problem seems the
> same for any markup-in-attribute proposal, and doc has far fewer
> escaping problems than the data: alternative.
> ...

Hmmm... Really?

> Presumably almost nobody is ever going to write the markup by hand,
> since the point is to embed untrusted content in a sandbox, and if
> you're embedding it by hand you can verify the content visually and
> don't need to sandbox it. So the important thing is how server-side code
> will do the escaping.

+1

To get these things right (as in "working" and "valid"), it's almost
certain you won't be hand-authoring this.

> If you have a (Perl) script which does something like
>
>   print "<iframe sandbox doc=\"$doc\">";
>
> you'll have to escape with something like
>
>   s/"/&quot;/g;
>
> in order to avoid security vulnerabilities, and also with
>
>   s/&/&amp;/g;
>
> in order to get correct processing. If you instead had

And also "<" when using XHTML.

>   print "<iframe sandbox src=\"data:text/html;charset=utf-8,$doc\">";
>
> you'd still just have to escape " for safety; but for correct processing
> in current browsers you'd have to at least escape & and do
>
>   s/%/%25/g;
>   s/#/%23/g;
>
> (are there any others you need?) and for validity I think you'd have to
> instead do
>
>   s/([^;\/?:@&=+$,a-zA-Z0-9-_.!~*'()])/join "", map { sprintf "%%%02x", $_ } unpack "C*", encode("utf-8", $1)/eg;
>
> (if I interpret RFC2397's reference to RFC2396's "urlchar" as actually
> meaning "uric", and if I haven't made stupid mistakes).

Or you might already have a library that encodes URIs/IRIs.

> Your server-side script probably already has access to an HTML escape
> function that will do what's needed for <iframe doc>, and if you have a
> decent templating system it'll do it automatically. It's no different to
> any other form of embedding content from the user, so it doesn't seem an
> unreasonable burden. (Escaping data: correctly is a lot more complex and
> a lot less likely to be provided as a function in your server environment.)

Even if this is true, that should be relatively simple to fix.

Given that introducing new attributes to HTML is very expensive ([1]),
making "data:" work for this should really be considered.

BR, Julian

[1] <http://krijnhoetmer.nl/irc-logs/whatwg/20081120#l-184>
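
For comparison, the two escaping paths discussed above might look roughly like this when both are done with library functions. This is a minimal sketch in Python rather than the Perl used in the thread. The function names `iframe_with_doc` and `iframe_with_data_uri` are hypothetical, `doc` is the proposed attribute rather than a shipped one, and the data: variant assumes it is acceptable to percent-encode everything outside the unreserved set, which is stricter than the "uric" set mentioned above.

```python
from html import escape
from urllib.parse import quote


def iframe_with_doc(doc: str) -> str:
    """Embed markup in the proposed doc attribute (hypothetical helper)."""
    # html.escape with quote=True escapes &, <, > and both quote characters,
    # which covers the " and & cases above plus "<" for XHTML.
    return '<iframe sandbox doc="{}"></iframe>'.format(escape(doc, quote=True))


def iframe_with_data_uri(doc: str) -> str:
    """Embed markup in a data: URI in the src attribute (hypothetical helper)."""
    # quote() encodes the string as UTF-8 and percent-encodes every byte
    # except ASCII letters, digits and "_.-~" when safe is empty. The result
    # contains no ", &, <, % or # in literal form, so no further HTML
    # escaping is needed inside the attribute value.
    encoded = quote(doc, safe="")
    return '<iframe sandbox src="data:text/html;charset=utf-8,{}"></iframe>'.format(encoded)


if __name__ == "__main__":
    untrusted = '<p title="a&b">50% #1</p>'
    print(iframe_with_doc(untrusted))
    print(iframe_with_data_uri(untrusted))
```

Under these assumptions either variant is a single library call on the server side; what differs is which function the environment has to provide.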
Received on Sunday, 17 January 2010 13:10:32 UTC