- From: Thomas Broyer <t.broyer@ltgt.net>
- Date: Sun, 24 Jan 2010 23:55:19 +0100
- To: "Tab Atkins Jr." <jackalmage@gmail.com>
- Cc: "public-html@w3.org WG" <public-html@w3.org>
On Sun, Jan 24, 2010 at 11:41 PM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
> On Sun, Jan 24, 2010 at 2:16 PM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
>> Or do the standard url-escaping functions built into basically
>> all programming languages cover it completely?
>
> The answer, by the way, is no. I can't speak for other languages, but
> PHP's standard url escaping function, urlencode(), will escape spaces
> as +. data: urls require spaces to be encoded as %20.
>
> Test case provided by Philip`: "data:text/html,".urlencode("a b")
> produces "data:text/html,a+b", which produces a page containing the
> text "a+b".
>
> So, for PHP, the most common web-programming language on the internet,
> authors would have to write their own url escaping function for data:
> urls. This is a non-trivial matter, especially when unicode is
> involved, opening them to the possibility of attack. Compare to the
> srcdocEscape function I wrote earlier:
>
> function srcdocEscape($html) {
>   return strtr($html,array("&"=>"&amp;", '"'=>"&quot;"));
> }
>
> Trivial and correct.

Correct me if I'm wrong, but PHP's rawurlencode() should do what's
needed here. Otherwise, "fixing" urlencode() should be as easy as
str_replace('+', '%20', urlencode($html)), because urlencode() should
have encoded pluses into %2B already, so any "+" left in its output can
only have come from a space.

-- 
Thomas Broyer
/tɔ.ma.bʁwa.je/
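
For reference, a minimal sketch of the comparison described above. The input string 'a b+c' and the variable names are illustrative assumptions, not taken from the thread; the input was chosen so that it contains both a space and a literal plus.

```php
<?php
// Hypothetical sample content for a data: URL (space and a literal "+").
$html = 'a b+c';

// urlencode() uses form-style encoding: space becomes "+", "+" becomes "%2B".
$form = urlencode($html);                            // "a+b%2Bc"

// rawurlencode() percent-encodes the space as "%20" instead.
$raw = rawurlencode($html);                          // "a%20b%2Bc"

// The str_replace() workaround: every "+" remaining in urlencode()'s output
// was originally a space, because real pluses are already "%2B".
$fixed = str_replace('+', '%20', urlencode($html));  // "a%20b%2Bc"

echo 'data:text/html,' . $form, "\n";   // would render the text "a+b+c"
echo 'data:text/html,' . $raw, "\n";    // would render the text "a b+c"
echo 'data:text/html,' . $fixed, "\n";  // same as the rawurlencode() line here
```

For this input the two approaches give the same string. They are not guaranteed to be byte-identical for every input (the two functions may treat a few other characters, such as "~", differently), so rawurlencode() looks like the simpler choice.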
Received on Sunday, 24 January 2010 22:56:15 UTC