
Re: <iframe doc="">

From: Shelley Powers <shelley.just@gmail.com>
Date: Sun, 24 Jan 2010 12:04:33 -0600
Message-ID: <643cc0271001241004i11047608u7ac0010202f983d9@mail.gmail.com>
To: "Tab Atkins Jr." <jackalmage@gmail.com>
Cc: Ian Hickson <ian@hixie.ch>, "public-html@w3.org WG" <public-html@w3.org>, matt@mullenweg.com
On Sun, Jan 24, 2010 at 11:14 AM, Tab Atkins Jr. <jackalmage@gmail.com> wrote:
> On Sun, Jan 24, 2010 at 10:55 AM, Shelley Powers <shelley.just@gmail.com> wrote:
>> This is an old issue. We have had software to sanitize comments for a
>> long time. It's built into most CMS tools. And for those who disregard
>> the use of such tools, they're not going to use this, either.
>
> Indeed, there are nearly as many html-sanitizers as there are CMSes.
> And they're pretty uniformly bad.  Most of them are built on fragile
> regexps, if you're lucky.  They might just be a handful of string
> replaces that address whatever problems the CMS author could think of
> at the time.  The best of them address *currently known attack
> vectors* decently enough, but are usually weak to *new* attacks.
>

Most are not bad, many are good, a few are exceptional. I don't
believe either Drupal or Wordpress is vulnerable to script attacks in
comments. Do you have a demonstration of how script attacks would
circumvent the protections in place in these CMSes when they're using,
oh, something like htmLawed?

I've also cc'd Wordpress's Matt Mullenweg, since we're talking about
how vulnerable a CMS such as Wordpress is when it comes to sanitizing
comment content. Perhaps he could give his view on this vulnerability,
if he has time. Matt, would you mind giving us your view on the
vulnerability of comments in CMSes today?


> To do it properly you need a full HTML parser/tokenizer combined with
> a whitelist-based sanitizer.  These are very rarely used in the wild,
> and when they are used we can only hope they're not buggy relative to
> browsers.  Only now, with HTML5, can we have a hope that independently
> built HTML parsers will actually produce the same structure from a
> particular piece of HTML, including in edge cases, so that attack
> vectors can be spotted and headed off by anyone and the knowledge
> spread to everyone.
>
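
For readers unfamiliar with the approach Tab describes, here is a
minimal sketch of a whitelist-based sanitizer driven by a parser
rather than regexps. It is illustrative only: the tag list, attribute
list, and the sanitize() name are assumptions made for this example,
and Python's html.parser is not an HTML5-conformant parser, so it can
disagree with browsers on edge cases (which is the risk raised above).

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical policy for comment markup; not taken from any real CMS.
ALLOWED_TAGS = {"p", "em", "strong", "a", "blockquote", "code"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []
        self.skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # drop the tag itself; its text content is still kept
        kept = []
        for name, value in attrs:
            if name in ALLOWED_ATTRS.get(tag, ()) and value is not None:
                # Keep URL attributes only when they use a known-safe scheme.
                if value.strip().lower().startswith(SAFE_SCHEMES):
                    kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip_depth = max(0, self.skip_depth - 1)
            return
        if self.skip_depth or tag not in self.open_tags:
            return
        # Close everything opened since the matching start tag.
        while self.open_tags:
            closed = self.open_tags.pop()
            self.out.append(f"</{closed}>")
            if closed == tag:
                break

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    # Balance any tags the input left open.
    while parser.open_tags:
        parser.out.append(f"</{parser.open_tags.pop()}>")
    return "".join(parser.out)
```

Anything not on the whitelist is dropped by default, so a new attack
vector has to get through the allowed tags and attributes rather than
around a list of known-bad patterns.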

Are you saying that this is the rationale for this change?

If so, do you have specific examples of these commonly occurring
vulnerabilities in existing sanitizer technologies? Do you have
specific ways to circumvent the sanitizers?

> On the other hand, @srcdoc makes this whole thing trivial, and allows
> us to leverage the behavioral restraints of @sandbox as well.  It's a
> win for everyone.  The only loss is if you were somehow silly enough
> to write code with @srcdoc by hand, and I've already explained why
> that's a silly thing to do.
>
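
As a rough sketch of the machine-generated case Tab has in mind, the
snippet below shows how template code might embed an untrusted comment
in a sandboxed iframe. It assumes the attribute is spelled srcdoc, as
in the quoted text, and comment_iframe() is a hypothetical helper name
for this example, not anything proposed in the thread.

```python
from html import escape


def comment_iframe(untrusted_html: str) -> str:
    """Wrap untrusted markup in a sandboxed iframe via the srcdoc attribute."""
    # Escaping &, <, >, and both quote characters keeps the markup from
    # breaking out of the attribute value; the browser un-escapes it when
    # it parses the attribute and renders the result inside the sandbox.
    return f'<iframe sandbox srcdoc="{escape(untrusted_html, quote=True)}"></iframe>'


print(comment_iframe('<b>"hi"</b>'))
```

The escaping is done once, by code, which is why the doubly-encoded
attribute value is not something a template author would type by hand.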

But people have to write the templates by hand. At some point in time,
humans are involved in web pages. Whether they write the code to
generate the content, design the templates to use the code, or yes,
even create the web page by hand--humans are involved.

Ultimately, this stuff has to be meaningful for humans in order to
work. This change is not meaningful.


>> Rationales should always be provided, consensus should be sought with
>> major changes. This is a major change. The editor should not be making
>> unilateral decisions -- and neither should the chairs.
>
> Rationales were provided, both in the previous discussions around
> iframe sandboxing, and the more recent one where Ian announced that he
> wanted to add @doc.
>
>> Decisions are made, true, but they should be made according to the
>> strength of arguments provided, not the fact that Ian has edit control
>> over the document.
>
> Making decisions on the strength of arguments is not consensus.
> That's technical merit, which is something entirely different, and is
> in fact precisely how the working group works.  Ian decides what to
> put in the spec by technical merit, and when the Decision Policy is
> invoked, the Chairs decide what to do based on technical merit.
>

I do not think this decision has technical merit. Speaking of
which...what exactly was the technical merit for this decision?  I
missed that in all of the emails. Can someone point me to the
rationale for this change?


>> I want to ask: which implementing company asked for this change?
>> That's all it took for this to be incorporated, one implementor asked
>> for it. I want to know which company/person specifically asked for
>> this change?
>
> How is that possibly relevant?  I thought you just said that decisions
> should be made based on the strength of arguments.  I don't see how
> the identity of the people proposing features affects the strength of
> an argument.
>

I think it's very relevant. I would expect whoever asked for this to
provide use cases and a good rationale for this change. We can't ask
them for this, though, because we don't know who they are.

But I'm willing to forgo this, if someone can point me to the
rationale for this change. To judge the merit of a technical decision,
we need the rationale for the change, the purpose behind asking for
the change, the use cases to justify the change, as well as a good
understanding of the alternative approaches--to see if there isn't
another approach that has superior technical merit.

Tab, can you provide links to these? Ian, you made this change -- can
you provide this information, or provide links where these exist?

Thanks

Shelley
Received on Sunday, 24 January 2010 18:05:01 GMT
