Re: HTML 5

From: Eduard Pascual <herenvardo@gmail.com>
Date: Wed, 7 Apr 2010 20:06:10 +0200
Message-ID: <x2u6ea53251004071106g1362ff92peb1a7b663201d1b2@mail.gmail.com>
To: "T.J. Crowder" <tj@crowdersoftware.com>
Cc: gesteehr@googlemail.com, public-html-comments@w3.org

On Wed, Apr 7, 2010 at 5:49 PM, T.J. Crowder <tj@crowdersoftware.com> wrote:
>> > What makes ]]> easier to defend against than </code>?
>>
>> As I said, with <![CDATA[ ... ]]> you only need to care about the
>> exact sequence "]]>": if it's found within an input, get rid of it or
>> somehow fix it (string replacement "]]>" => "]]>]]&gt;<![CDATA[" gets
>> the job done safely). With </code> (or even with Arthur's <cdata>
>> suggestion, to some degree), things are quite more complex:
>
> I don't understand. Any sanitizer has to escape < and &.

Not every one: content that will go inside <![CDATA[ ... ]]> needs no
special handling of < and & at all. That's the whole point of CDATA. In
other words, a < inside a CDATA block is exactly equivalent to a &lt;
outside of it: it has no special meaning and just renders as "<".
The same holds for & and &amp;, and for > and &gt;.
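
To make the equivalence concrete, here is a minimal Python sketch. It
assumes an XML parser (the standard library's xml.etree.ElementTree) and
a helper name, to_cdata, that I made up for illustration; the only thing
it guards against is the "]]>" sequence, using the replacement quoted
above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_cdata(text: str) -> str:
    """Wrap text in a CDATA section (hypothetical helper, for illustration).

    The only sequence that needs attention is "]]>": close the section
    just before it, emit it with the ">" escaped, then reopen the section.
    """
    return "<![CDATA[" + text.replace("]]>", "]]>]]&gt;<![CDATA[") + "]]>"


# Sample input containing <, &, a fake closing tag, and the "]]>" sequence.
sample = 'if (a < b && c[d[0]]> 1) { print("</code>"); }'

via_cdata = "<code>" + to_cdata(sample) + "</code>"
via_escape = "<code>" + escape(sample) + "</code>"

# Both documents parse back to exactly the same character data.
assert ET.fromstring(via_cdata).text == sample
assert ET.fromstring(via_escape).text == sample
```

Note that to_cdata never looks at < or & at all, whereas the escaping
route has to rewrite every one of them.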

For content generated programmatically, it makes little difference
whether you use CDATA or escape the special characters. For manually
authored content, CDATA saves a lot of authoring pain (I'm assuming this
is the case Georg had in mind when starting this thread). If used with
user-provided content, Georg's proposal would open the door to injection
attacks, and closing it would require a good deal of non-trivial
fool-proofing in the spec, in implementations, and in server-side
scripts. CDATA addresses the use case without so many nasty side effects.

Regards,
Eduard Pascual
Received on Wednesday, 7 April 2010 18:07:02 GMT
