
[whatwg] base64 entities

From: Adam Barth <w3c@adambarth.com>
Date: Wed, 25 Aug 2010 14:32:53 -0700
Message-ID: <AANLkTin0S-=RRdu5w-x656v0a1X7=k_S8LuY+qqg9WgV@mail.gmail.com>
btoa and atob should do the trick.
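
As a minimal sketch (not part of any spec), the round trip looks roughly like this. It assumes the text contains only code points 0-255, since btoa and atob work on "binary strings". The helper names toBase64 and fromBase64 are made up for illustration, and the Buffer fallback is only an assumption for environments that lack the two globals.

```javascript
// Sketch only: btoa/atob treat each character as one byte, so the input
// must contain only code points 0-255.
// toBase64/fromBase64 are hypothetical helper names; the Buffer branch is
// an assumed fallback for runtimes without the btoa/atob globals.
const toBase64 = (s) =>
  typeof btoa === "function" ? btoa(s) : Buffer.from(s, "latin1").toString("base64");
const fromBase64 = (s) =>
  typeof atob === "function" ? atob(s) : Buffer.from(s, "base64").toString("latin1");

// Payload of the example entity from the quoted proposal.
const payload = "SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==";

const decoded = fromBase64(payload);  // decode the entity payload
const reencoded = toBase64(decoded);  // encode it again

console.log(JSON.stringify(decoded)); // shows the text, including its trailing newline
console.log(reencoded === payload);
```

Note that the example payload carries a trailing newline after "awesome.".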

Adam


On Wed, Aug 25, 2010 at 2:32 PM, Ryosuke Niwa <ryosuke.niwa at gmail.com> wrote:
> Does ECMAScript currently have a built-in function for encoding & decoding
> base-64? We might want a built-in base-64 encoder / decoder if we are
> implementing these base64-encoded entities.
> - Ryosuke
> On Wed, Aug 25, 2010 at 1:50 PM, Adam Barth <w3c at adambarth.com> wrote:
>>
>> == Summary ==
>>
>> HTML should support Base64-encoded entities to make it easier for
>> authors to include untrusted content in their documents without
>> risking XSS. For example,
>>
>> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>>
>> would decode to "HTML5's <canvas> element is awesome." Notice that
>> the < and > characters get emitted by the parser as character tokens.
>> That means they can't be used by an attacker for XSS. These entities
>> can be used safely both in intertag content as well as in attribute
>> values.
>>
>> == Use Case ==
>>
>> Authors often combine trusted and untrusted text into HTML documents.
>> If done naively, an attacker can supply HTML markup, including script,
>> in the untrusted text, resulting in a cross-site scripting attack.
>> Authors want a way to include untrusted content safely in HTML
>> documents without risking XSS.
>>
>> == Workarounds ==
>>
>> Currently, authors must carefully escape all untrusted content to
>> prevent an attacker from injecting HTML. Unfortunately, authors often
>> apply the incorrect escaping or forget to escape entirely, resulting
>> in security vulnerabilities. Escaping content in HTML is tricky
>> because authors need to use different escaping rules for different
>> contexts. For example, PHP's htmlspecialchars isn't sufficient in the
>> following contexts:
>>
>> <img alt=<?php echo htmlspecialchars($name) ?> src="...">
>>
>> <script>
>> elmt.innerHTML = 'Hi there <?php echo htmlspecialchars($name) ?>.';
>> </script>
>>
>> Some frameworks convert untrusted content to a series of hex entities,
>> but that greatly increases the length of the content.
>>
>> == Proposal ==
>>
>> We should add a new kind of HTML entity that authors can use to
>> include untrusted content. In particular, authors should be able to
>> supply untrusted content in base64, which nicely avoids any scary
>> characters. We can avoid clashes with existing or future entities by
>> using a new character after the & escape character. In particular, we
>> could use the % character:
>>
>> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>>
>> Authors could then supply untrusted content as follows:
>>
>> <img alt=<?php echo htmlescape($name) ?> src="...">
>>
>> where htmlescape is defined as follows:
>>
>> function htmlescape($text) {
>>   return "&%".base64_encode($text).";";
>> }
>>
>> Adam
>
>
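
For illustration, the htmlescape helper from the quoted proposal and the matching decode step can be sketched in JavaScript. This is a sketch under stated assumptions, not the proposed parser behavior: decodeBase64Entities is a hypothetical name, the regular expression is only one guess at what a valid payload looks like, and the input is assumed to contain only code points 0-255. In the actual proposal the HTML parser would emit the decoded characters as character tokens; here the decoded text is simply returned as a string.

```javascript
// Hypothetical helpers; the Buffer branch is an assumed fallback for
// runtimes without the btoa/atob globals.
const toBase64 = (s) =>
  typeof btoa === "function" ? btoa(s) : Buffer.from(s, "latin1").toString("base64");
const fromBase64 = (s) =>
  typeof atob === "function" ? atob(s) : Buffer.from(s, "base64").toString("latin1");

// JavaScript counterpart of the PHP htmlescape() in the quoted proposal:
// wrap the base64 form of the untrusted text in "&%" ... ";".
function htmlescape(text) {
  return "&%" + toBase64(text) + ";";
}

// Hypothetical decoder: replace every &%...; entity with its decoded text.
// A real parser would emit these characters as character tokens, never as markup.
function decodeBase64Entities(source) {
  return source.replace(/&%([A-Za-z0-9+\/]*={0,2});/g, (match, payload) =>
    fromBase64(payload)
  );
}

// Untrusted input that would break out of an attribute if inserted raw.
const name = '"><script>alert(1)</script>';
const entity = htmlescape(name);

console.log(entity);                                // only &, %, base64 characters and ;
console.log(decodeBase64Entities(entity) === name); // the round trip preserves the text
```

The encoded form contains no quotes, angle brackets, or whitespace, which is why the same entity would be usable both between tags and in attribute values.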
Received on Wednesday, 25 August 2010 14:32:53 UTC
