- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Sun, 24 Jan 2010 17:19:15 -0600
- To: Thomas Broyer <t.broyer@ltgt.net>
- Cc: "public-html@w3.org WG" <public-html@w3.org>
On Sun, Jan 24, 2010 at 4:55 PM, Thomas Broyer <t.broyer@ltgt.net> wrote:
> On Sun, Jan 24, 2010 at 11:41 PM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
>> On Sun, Jan 24, 2010 at 2:16 PM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
>>> Or do the standard url-escaping functions built into basically
>>> all programming languages cover it completely?
>>
>> The answer, by the way, is no. I can't speak for other languages, but
>> PHP's standard url escaping function, urlencode(), will escape spaces
>> as +. data: urls require spaces to be encoded as %20.
>>
>> Test case provided by Philip`: "data:text/html,".urlencode("a b")
>> produces "data:text/html,a+b", which produces a page containing the
>> text "a+b".
>>
>> So, for PHP, the most common web-programming language on the internet,
>> authors would have to write their own url escaping function for data:
>> urls. This is a non-trivial matter, especially when unicode is
>> involved, opening them to the possibility of attack. Compare to the
>> srcdocEscape function I wrote earlier:
>>
>> function srcdocEscape($html) {
>>   return strtr($html,array("&"=>"&amp;", '"'=>"&quot;"));
>> }
>>
>> Trivial and correct.
>
> Correct me if I'm wrong, but PHP's rawurlencode() should do what's
> needed here (otherwise, "fixing" urlencode() should be as easy as
> str_replace('+', '%20', urlencode($html)); because urlencode() should
> have encoded pluses into %2B already)

Nah, you're right. It's just yet another function to remember to use,
among the plethora of similar-but-different functions in PHP. I'd
forgotten about it.

~TJ
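
For what it's worth, the same split between form-style escaping and plain percent-encoding shows up outside PHP too. Below is a rough sketch in Python, offered only as an analogue and not as anything proposed in the thread: quote_plus() stands in for urlencode(), quote() for rawurlencode(), and the helper names data_url and srcdoc_escape are made up for illustration.

```python
from urllib.parse import quote, quote_plus


def data_url(html: str) -> str:
    """Build a text/html data: URL (hypothetical helper, not from the thread)."""
    # quote() writes a space as %20; safe="" makes it escape "/" as well.
    return "data:text/html," + quote(html, safe="")


def srcdoc_escape(html: str) -> str:
    """Python analogue of the srcdocEscape function quoted in the thread."""
    # "&" is replaced first so the "&" inside "&quot;" is not escaped twice.
    return html.replace("&", "&amp;").replace('"', "&quot;")


if __name__ == "__main__":
    # Form-style escaping turns the space into "+", which a data: URL
    # would render literally.
    print(quote_plus("a b"))
    # Percent-encoding keeps the space as %20.
    print(data_url("a b"))
    # The "replace + with %20 afterwards" fix gives the same result as
    # percent-encoding directly, because literal pluses are already %2B.
    print(quote_plus("a+b c").replace("+", "%20") == quote("a+b c", safe=""))
    print(srcdoc_escape('<p title="x">a & b</p>'))
```

Either route yields a usable data: URL. As in PHP, the difficulty is remembering which of the two near-identical functions to reach for.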
Received on Sunday, 24 January 2010 23:20:07 UTC