- From: Eduard Pascual <herenvardo@gmail.com>
- Date: Wed, 7 Apr 2010 20:06:10 +0200
- To: "T.J. Crowder" <tj@crowdersoftware.com>
- Cc: gesteehr@googlemail.com, public-html-comments@w3.org
On Wed, Apr 7, 2010 at 5:49 PM, T.J. Crowder <tj@crowdersoftware.com> wrote:
>>> What makes ]]> easier to defend against than </code>?
>>
>> As I said, with <![CDATA[ ... ]]> you only need to care about the
>> exact sequence "]]>": if it's found within an input, get rid of it or
>> somehow fix it (string replacement "]]>" => "]]>]]&gt;<![CDATA[" gets
>> the job done safely). With </code> (or even with Arthur's <cdata>
>> suggestion, to some degree), things are quite more complex:
>
> I don't understand. Any sanitizer has to escape < and &.

Not any: for content that will go inside <![CDATA[ ... ]]> there is no
need at all to care about < and &. That's the whole point of CDATA. In
other words, a < inside a CDATA block is exactly equivalent to a &lt;
outside of it: it has no special meaning and just renders as "<". The
same holds for & and &amp;, and also for > and &gt;.

For content generated programmatically, it makes little difference
whether you use CDATA or escape everything. For manually authored
content, CDATA saves a lot of authoring pain (I'm assuming this is the
case Georg had in mind when starting this thread).

If used with user-provided content, Georg's proposal would open up a
potential for injection attacks, which would require the spec,
implementations, and server-side scripts to do a good deal of
non-trivial fool-proofing. CDATA addresses the use case without so many
nasty side effects.

Regards,
Eduard Pascual
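
A minimal sketch of the "]]>" string replacement mentioned in the message, in Python. It reads the replacement as: close the CDATA section, emit the terminator as escaped text ("]]&gt;"), then reopen the section. The names wrap_cdata and round_trip are hypothetical, and the check assumes the content is consumed by an XML parser (here Python's standard xml.etree.ElementTree), not by an HTML parser.

```python
import xml.etree.ElementTree as ET


def wrap_cdata(text: str) -> str:
    """Wrap text in a CDATA section, neutralising any embedded "]]>"."""
    # "]]>" is the only sequence that can end a CDATA section, so it is
    # the only one that needs handling: close the section, emit the
    # terminator as escaped character data, then open a new section.
    safe = text.replace("]]>", "]]>]]&gt;<![CDATA[")
    return "<![CDATA[" + safe + "]]>"


def round_trip(text: str) -> str:
    """Parse the wrapped text inside a <code> element and return its content."""
    # Hypothetical helper for checking the result with an XML parser.
    element = ET.fromstring("<code>" + wrap_cdata(text) + "</code>")
    return element.text or ""
```

Calling round_trip on a string containing <, &, or "]]>" should give back the same string, since neither < nor & needs any escaping inside the section.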
Received on Wednesday, 7 April 2010 18:07:02 UTC