ISSUE-100 Change Proposal

Summary

    Remove the srcdoc attribute.

Rationale

The original bug report for removing srcdoc provided the following
change request[1]:

    "This recent entry does not have universal acceptance, and the group was still
    discussing it when the editor added it to the specification.

    The supposed use case for this attribute is weblog comments, but concerns about
    HTML security have been resolved with weblog and other application comments
    years ago. In addition, support for this attribute could give the impression
    that online sites don't need any other security, which is false. Script
    injection is only one aspect of security related to weblog comments, and
    considered a fairly trivial one at that.

    This needs to be removed from the specification."

The rationale given by the HTML5 Editor for keeping this attribute:

    "Rationale: I'm happy to remove this attribute from the W3C HTML5 specification
    if that's what the working group wants. The last time I removed a feature based
    on a bug report such as this, I started a minor war, however, so I suggest that
    you raise this via the change proposal process if you really feel this way."

In other words, the HTML5 Editor has offered no rationale for keeping
this attribute. That made this change proposal more difficult to
write, because I had to base my arguments on guesses and scraped email
messages.

There was a great deal of contention about this attribute before it
was added. It spawned another issue (Issue 103) because of concerns
about escaping the markup in the attribute, especially for XHTML. That
this caused some difficulty for members of this group, who are
defining the next version of HTML/XHTML, should give us pause, because
knowing what must be escaped is going to be that much more difficult
for the average web developer[2][3].
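
As a rough illustration of the escaping concern, the sketch below
assumes that in the HTML syntax only the ampersand and the double quote
need escaping inside a double-quoted srcdoc value, while the XHTML
syntax also needs the less-than sign escaped. The function names are
hypothetical and are not taken from the specification or from any
library:

```python
def escape_srcdoc_html(markup: str) -> str:
    """Escape markup for a double-quoted srcdoc value (HTML syntax, assumed rules)."""
    # Ampersands go first so the entities added next are not escaped twice.
    return markup.replace("&", "&amp;").replace('"', "&quot;")


def escape_srcdoc_xhtml(markup: str) -> str:
    """Escape markup for a srcdoc value in XHTML, where '<' must also be escaped."""
    return escape_srcdoc_html(markup).replace("<", "&lt;")


comment = '<p>Tom & "Jerry"</p>'
html_attr = escape_srcdoc_html(comment)    # <p>Tom &amp; &quot;Jerry&quot;</p>
xhtml_attr = escape_srcdoc_xhtml(comment)  # &lt;p>Tom &amp; &quot;Jerry&quot;&lt;/p>
```

The same comment produces two different attribute values depending on
the syntax, which is the kind of difference a developer would have to
keep track of.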

When asked about the purpose of srcdoc, the HTML5 Editor replied that
the use case for the attribute is weblog comments[4]. Because the
srcdoc attribute works within a sandboxed context, the use of the
attribute would prevent script injection in comments. Since this
change was targeted at a specific use related to weblog software, I
asked Matt Mullenweg[5], the creator of WordPress, one of the more
popular weblogging tools in use today, about the usefulness of this
attribute. He responded with[6]:

    "We haven't had any HTML-level problems in comments in a while.

    We use and maintain a library called KSES that we use for all
    sanitation, and it has served us well."

I brought Matt into the discussion for two reasons. The first is that
I wanted to bring in an "implementor", and demonstrate that an
implementor, in the case of weblog comments, is the group or
individual responsible for the weblogging software. Too often this
group is focused purely on browser developers as implementors,
forgetting that browsers are not the only application group impacted
by HTML5 changes.

The second reason was to demonstrate that no one from the weblogging
community has asked for this, and that most of the weblogging
community is very unlikely to use this uncomfortable, awkward
attribute. The weblogging community has long had to deal with
security problems, and has devised sophisticated tools and techniques
to protect not only against script injection, but also against SQL
injection, the greater hazard for weblog comments, and even against
the accidental insertion of a non-printing character in XHTML.

In point of fact, relying on something such as srcdoc can make a site
less secure rather than more, because it only touches on one
vulnerability, when we're faced daily with a host of new and ever more
sophisticated threats[7].

So the use case is heavily flawed. What are the other issues
associated with srcdoc? I've already mentioned the concerns about
escaped characters, and how this will differ between HTML and XHTML,
which in itself will discourage its use with most applications like
Content Management Systems. Are there other issues?

Another issue is when something like srcdoc can actually be used, and
whether the restrictions on its use are such as to defeat its purpose.
This attribute can't be used effectively for potentially years to
come, because web browsers don't print out what's contained in an
attribute unless specifically directed to do so[8]. Until then, the
fallback is used, which is the iframe's src attribute. Until browsers
support sandboxing, though, the src attribute is insecure.
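
To make the fallback concrete, here is a minimal sketch of how
weblogging software might emit such an iframe. The helper name and the
fallback URL are hypothetical. It uses Python's html.escape, which
escapes more characters than the HTML syntax strictly needs:

```python
from html import escape


def comment_iframe(markup: str, fallback_url: str) -> str:
    """Build a sandboxed iframe with the comment in srcdoc and a src fallback."""
    # html.escape with quote=True escapes &, <, > and both quote characters,
    # so the result is safe inside a double-quoted attribute value.
    return (
        "<iframe sandbox "
        f'srcdoc="{escape(markup, quote=True)}" '
        f'src="{escape(fallback_url, quote=True)}"></iframe>'
    )


# Hypothetical comment and fallback location, for illustration only.
tag = comment_iframe("<b>Nice post!</b>", "/comments/42.html")
```

A browser that understands neither srcdoc nor sandbox would simply load
the src URL, with none of the protections the markup appears to
promise.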

In the meantime, our existing applications that do provide security
will become more sophisticated, more capable, and more tightly
integrated, until, by the time we could use srcdoc effectively, few of
us will even remember what it is, and fewer still will be interested.

An alternative to srcdoc was suggested in the discussion surrounding
this attribute. Instead of embedding markup in the attribute,
something that has been actively discouraged for some time, we can use
a data URI with the src attribute. This provides the same
functionality, is usable sooner, and doesn't require us to embed
markup in an attribute. However, the data URI has its own challenges,
specifically the fact that the data would be printed out without the
security controls in legacy browsers[9]. Again, though, a data URI in
an iframe src attribute would most likely never be used for weblog
comments. I find it unlikely that any approach related to the iframe
and sandboxing will ever be used with weblog comments, so it might be
best if another use case is used to attempt to defend this attribute.
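
For reference, the data URI alternative could be sketched as follows.
This is only an illustration: the function name is hypothetical, and
the choice of text/html with a UTF-8 charset is an assumption rather
than something settled in the discussion:

```python
from urllib.parse import quote


def data_uri_for(markup: str) -> str:
    """Wrap markup in a data: URI suitable for an iframe's src attribute."""
    # safe="" percent-encodes every character outside the unreserved set,
    # including '/', so the markup cannot terminate the attribute value.
    return "data:text/html;charset=utf-8," + quote(markup, safe="")


uri = data_uri_for("<p>Hi & bye</p>")
# uri == "data:text/html;charset=utf-8,%3Cp%3EHi%20%26%20bye%3C%2Fp%3E"
```

The resulting value contains no raw markup, but a legacy browser would
still render the decoded content without any sandbox restrictions.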

One use case that does come to mind is the plug-ins we drop into our
web pages. The source of the plug-in comes from an external site,
which could be cause for alarm. However, plug-in security is not
related to the srcdoc attribute, so I have a hard time determining
what use case would apply. Perhaps there is none, in which case
there's even more of a reason to remove this potentially harmful, most
definitely problematic attribute.

Details

Remove all references to the srcdoc attribute from the HTML5
specification. If such a removal results in a gap in coverage,
consider following one of two paths: remove whatever other material is
necessary to eliminate the gap or work with the W3C HTML WG to come up
with an alternative approach, if one can be found.

I would also strongly suggest finding another use case, if you want to
pursue this type of functionality.

Impact

Positive

Removes a confusing, potentially harmful, and not really usable
attribute, either forcing us to re-address the issue, or to consider
dropping this particular subset of web page security from the HTML5
specification. Perhaps there are some aspects of the web that cannot
be managed by browsers.

Negative

Requires some of the Editor's time to make the change. Could
potentially leave a gap in coverage, if this subset of security is
still of interest. Would require more work in the HTML WG. However,
counter proposals to this proposal might be able to provide effective
alternatives. Or not, if none really exists.

References


[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=8818

[2] http://www.w3.org/html/wg/tracker/issues/103

[3] http://lists.w3.org/Archives/Public/public-html/2010Mar/0431.html

[4] http://lists.w3.org/Archives/Public/public-html/2010Jan/1193.html

[5] http://lists.w3.org/Archives/Public/public-html/2010Jan/1223.html

[6] http://lists.w3.org/Archives/Public/public-html/2010Jan/1337.html

[7] http://lists.w3.org/Archives/Public/public-html/2010Jan/1318.html

[8] http://lists.w3.org/Archives/Public/public-html/2010Jan/1325.html

[9] http://lists.w3.org/Archives/Public/public-html/2010Jan/1346.html

-------------

Shelley Powers
http://realtech.burningbird.net

Received on Wednesday, 31 March 2010 13:51:25 UTC