- From: Ryosuke Niwa <ryosuke.niwa@gmail.com>
- Date: Wed, 25 Aug 2010 14:32:05 -0700
Does ECMAScript currently have a built-in function for encoding & decoding
base-64? We might want a built-in base-64 encoder / decoder if we are
implementing these base64-encoded entities.

- Ryosuke

On Wed, Aug 25, 2010 at 1:50 PM, Adam Barth <w3c at adambarth.com> wrote:

> == Summary ==
>
> HTML should support Base64-encoded entities to make it easier for
> authors to include untrusted content in their documents without
> risking XSS. For example,
>
> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>
> would decode to "HTML5's <canvas> element is awesome." Notice that
> the < and > characters get emitted by the parser as character tokens.
> That means they can't be used by an attacker for XSS. These entities
> can be used safely both in intertag content as well as in attribute
> values.
>
> == Use Case ==
>
> Authors often combine trusted and untrusted text into HTML documents.
> If done naively, an attacker can supply HTML markup, including script,
> in the untrusted script, resulting in a cross-site script attack.
> Authors want a way to include untrusted content safely in HTML
> documents without risking XSS.
>
> == Workarounds ==
>
> Currently, authors must carefully escape all untrusted content to
> prevent an attacker from injecting HTML. Unfortunately, authors often
> apply the incorrect escaping or forget to escape entirely, resulting
> in security vulnerabilities. Escaping content in HTML is tricky
> because authors need to use different escaping rules for different
> contexts. For example, PHP's htmlspecialchars isn't sufficient in the
> following contexts:
>
>   <img alt=<?php echo htmlspecialchars($name) ?> src="...">
>
>   <script>
>   elmt.innerHTML = 'Hi there <?php echo htmlspecialchars($name) ?>.';
>   </script>
>
> Some frameworks convert untrusted content to a series of hex entities,
> but that greatly increases the length of the content.
>
> == Proposal ==
>
> We should add a new kind of HTML entity that authors can use to
> include untrusted content. In particular, authors should be able to
> supply untrusted content in base64, which nicely avoids any scary
> characters. We can avoid clashes with existing or future entities by
> using a new character after the & escape character. In particular, we
> could use the % character:
>
> &%SFRNTDUncyA8Y2FudmFzPiBlbGVtZW50IGlzIGF3ZXNvbWUuCg==;
>
> Authors could then supply untrusted content as follows:
>
>   <img alt=<?php echo htmlescape($name) ?> src="...">
>
> where htmlescape is defined as follows:
>
>   function htmlescape($text) {
>     return "&%".base64_encode($text).";";
>   }
>
> Adam
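
For illustration only, here is a rough JavaScript sketch of what a script-side encoder and decoder for the proposed &%...; syntax might look like. The function names htmlEscape and decodeBase64Entities are hypothetical and do not come from the proposal, and the sketch assumes a runtime that provides the global Buffer class (Node.js) rather than any base-64 function built into ECMAScript itself. It also only does string replacement; in the proposal the decoding would happen inside the HTML parser, which emits the result as character tokens.

```javascript
// Hypothetical helpers; the names are illustrative and not part of the proposal.
// Assumption: the runtime provides the global Buffer class (Node.js).

// Encode untrusted text as a "&%...;" entity, mirroring the PHP htmlescape above.
function htmlEscape(text) {
  return "&%" + Buffer.from(text, "utf8").toString("base64") + ";";
}

// Replace every "&%...;" entity in a string with its decoded text.
// A real parser would emit the decoded characters as character tokens
// instead of splicing them back into markup.
function decodeBase64Entities(input) {
  return input.replace(/&%([A-Za-z0-9+\/]*={0,2});/g, (_match, b64) =>
    Buffer.from(b64, "base64").toString("utf8")
  );
}

const entity = htmlEscape("HTML5's <canvas> element is awesome.\n");
console.log(entity);
console.log(decodeBase64Entities(entity));
```

The encoded form contains only base-64 characters plus the &, % and ; delimiters, so it carries no < or > that could open a tag.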
Received on Wednesday, 25 August 2010 14:32:05 UTC