[whatwg] base64 entities

From: Ryosuke Niwa <ryosuke.niwa@gmail.com>
Date: Wed, 25 Aug 2010 14:32:05 -0700
Message-ID: <AANLkTi=Cgs0f0ihQUtJWRw91vMoN9TBywU-rYh4_wO-e@mail.gmail.com>
Does ECMAScript currently have a built-in function for encoding and decoding
base64?  We might want a built-in base64 encoder/decoder if we are
implementing these base64-encoded entities.
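
For what it is worth, browsers expose btoa and atob on window, but those are DOM functions rather than part of ECMAScript, and they only accept characters in the range 0-255. Below is a minimal sketch of an encoder/decoder for the proposed syntax, assuming btoa, atob, TextEncoder and TextDecoder are available. The helper names encodeBase64Entity and decodeBase64Entity are made up for illustration and do not come from any spec.

```javascript
// Hypothetical helpers; the names are illustrative only.

// Encode a string as a proposed "&%...;" entity.
// btoa rejects code points above 255, so convert to UTF-8 bytes first.
function encodeBase64Entity(text) {
  const bytes = new TextEncoder().encode(text);
  let binary = "";
  for (const b of bytes) {
    binary += String.fromCharCode(b);
  }
  return "&%" + btoa(binary) + ";";
}

// Decode a "&%...;" entity back to text, or return null if it is malformed.
function decodeBase64Entity(entity) {
  const match = /^&%([A-Za-z0-9+\/]*={0,2});$/.exec(entity);
  if (match === null) {
    return null;
  }
  let binary;
  try {
    binary = atob(match[1]);
  } catch (e) {
    return null; // invalid base64 length or padding
  }
  // Each character of the atob result holds one byte of the UTF-8 data.
  const bytes = Uint8Array.from(binary, (c) => c.charCodeAt(0));
  return new TextDecoder().decode(bytes);
}
```

Applied to the example entity in the proposal, decodeBase64Entity returns the sentence followed by a trailing newline, because the final Cg== group encodes a line feed.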

- Ryosuke

On Wed, Aug 25, 2010 at 1:50 PM, Adam Barth <w3c at adambarth.com> wrote:

> == Summary ==
>
> HTML should support Base64-encoded entities to make it easier for
> authors to include untrusted content in their documents without
> risking XSS.  For example,
>
> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>
> would decode to "HTML5's <canvas> element is awesome."  Notice that
> the < and > characters get emitted by the parser as character tokens.
> That means they can't be used by an attacker for XSS.  These entities
> can be used safely both in intertag content as well as in attribute
> values.
>
> == Use Case ==
>
> Authors often combine trusted and untrusted text into HTML documents.
> If done naively, an attacker can supply HTML markup, including script,
> in the untrusted text, resulting in a cross-site scripting attack.
> Authors want a way to include untrusted content safely in HTML
> documents without risking XSS.
>
> == Workarounds ==
>
> Currently, authors must carefully escape all untrusted content to
> prevent an attacker from injecting HTML.  Unfortunately, authors often
> apply the incorrect escaping or forget to escape entirely, resulting
> in security vulnerabilities.  Escaping content in HTML is tricky
> because authors need to use different escaping rules for different
> contexts.  For example, PHP's htmlspecialchars isn't sufficient in the
> following contexts:
>
> <img alt=<?php echo htmlspecialchars($name) ?> src="...">
>
> <script>
> elmt.innerHTML = 'Hi there <?php echo htmlspecialchars($name) ?>.';
> </script>
>
> Some frameworks convert untrusted content to a series of hex entities,
> but that greatly increases the length of the content.
>
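
To make the context problem concrete, here is a small JavaScript sketch. escapeHtml is a hypothetical stand-in that mimics the replacements PHP's htmlspecialchars makes with ENT_COMPAT (&, <, > and double quotes); it is not a real library function, and the payloads are invented for illustration. Neither payload contains an escaped character, so both pass through unchanged.

```javascript
// Hypothetical stand-in for htmlspecialchars with ENT_COMPAT:
// escapes &, <, > and double quotes, but leaves single quotes alone.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Unquoted attribute context: the space ends the alt value, so the
// remainder is parsed as a second attribute.
const attributePayload = "x onerror=alert(1)";
const imgMarkup = "<img alt=" + escapeHtml(attributePayload) + ' src="...">';

// JavaScript string context: the \x3c and \x3e escapes become < and >
// only when the script runs, after HTML escaping has already happened.
const scriptPayload = "\\x3cimg src=x onerror=alert(1)\\x3e";
const scriptSource =
  "elmt.innerHTML = 'Hi there " + escapeHtml(scriptPayload) + ".';";

console.log(imgMarkup);
console.log(scriptSource);
```

In the first case the page ends up with an onerror attribute the author never wrote. In the second, the markup only appears once the string literal is evaluated and assigned to innerHTML.
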
> == Proposal ==
>
> We should add a new kind of HTML entity that authors can use to
> include untrusted content.  In particular, authors should be able to
> supply untrusted content in base64, which nicely avoids any scary
> characters.  We can avoid clashes with existing or future entities by
> using a new character after the & escape character.  In particular, we
> could use the % character:
>
> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>
> Authors could then supply untrusted content as follows:
>
> <img alt=<?php echo htmlescape($name) ?> src="...">
>
> where htmlescape is defined as follows:
>
> // Wrap untrusted text in a base64 entity of the form &%...;
> function htmlescape($text) {
>     return "&%" . base64_encode($text) . ";";
> }
>
> Adam
>
Received on Wednesday, 25 August 2010 14:32:05 UTC