[whatwg] XSS safe templating

From: Mike Samuel <mikesamuel@gmail.com>
Date: Wed, 10 Mar 2010 09:45:13 -0800
Message-ID: <178b8d441003100945o1f4fb0c0qc61655cc7d1de8e8@mail.gmail.com>
2010/3/10 Henri Sivonen <hsivonen at iki.fi>:
> "Mike Samuel" <mikesamuel at gmail.com> wrote:
>
>> I'm working with EcmaScript TC39 trying to allow for experimentation
>> with new content generation techniques in JavaScript.
>> There's one missing piece which would let template language authors
>> experiment with varying degrees of XSS-safety, and I was hoping that a
>> change like the below might make it into HTML5.
>
> Shouldn't XSS-safe templating use the DOM APIs to generate a tree (fragment) instead of trusting the built-in HTML parser of the browser to behave in a certain way?

That's one way to do it.
But most programmers, by default, write code like this:
   html += '<b>' + foo + '</b>';
Many JS programmers come from a Perl/Python/Ruby/PHP background, and
it looks like there will be changes in ES Harmony to allow a familiar
syntax like
   html += s{{<b>$foo</b>}};
where the result ends up being a structured interpolation
   (('literal "<b>") ('substitution foo) ('literal "</b>"))
then with the appropriate context provided to toString, this can be
converted to a "safe" chunk of HTML that preserves the property that
literal portions are interpreted the same regardless of substitution
values.
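
To make that concrete, here is a minimal sketch in plain JavaScript of how
such a structured interpolation might behave. Interpolation, escapeHtml, and
the {type, value} part shape are hypothetical names chosen for illustration,
and the context strings only assume the 'html ' / 'attr ' prefix convention
discussed in this thread. A real library would likely need per-context
escapers (URLs, script, style) rather than the single one used here.

```javascript
// Hypothetical sketch: none of these names come from a spec or a library.

// Escape text so it is safe as element content or a quoted attribute value.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// A structured interpolation: an ordered list of literal and substitution parts.
class Interpolation {
  constructor(parts) {
    this.parts = parts;
  }

  // context is assumed to look like 'html <insertion mode>' or 'attr <name>'.
  // Without a recognised context, fall back to plain concatenation, which
  // mirrors what ordinary string coercion does today.
  toString(context) {
    const known = typeof context === 'string' && /^(html|attr) /.test(context);
    const convert = known ? escapeHtml : String;
    return this.parts
      .map(p => (p.type === 'literal' ? p.value : convert(p.value)))
      .join('');
  }
}

// Usage: the literal portions survive untouched, the substitution is escaped.
const foo = '<script>alert(1)</script>';
const chunk = new Interpolation([
  { type: 'literal', value: '<b>' },
  { type: 'substitution', value: foo },
  { type: 'literal', value: '</b>' },
]);
const safe = chunk.toString('html in body');
// safe === '<b>&lt;script&gt;alert(1)&lt;/script&gt;</b>'
```

The point of the sketch is only that the literal '<b>' and '</b>' parse the
same way whatever foo contains, because escaping is applied to substitution
parts alone.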

So we want to move from inherently unsafe idioms to ones that are
safer, letting library authors experiment with the right mix of
strictness and strategies for resolving ambiguity, without imposing an
extra syntactic burden.


>> When user-code does
>>    document.write(value), myElement.innerHTML = value, etc.
>> and the value is an object, currently it is coerced to a string by
>> indirectly calling the toString method.  I would like the toString
>> method to be called with 'html ' + the current HTML 5 insertion mode
>> to give structured template return values a chance to apply
>> appropriate escaping schemes.  For attribute sets, it would be nice to
>> call toString with the argument 'attr ' + attribute name.  This would
>> be backwards compatible as toString implementations ignore parameters
>> (modulo Number).
>
> What would the object do with this information? Without knowing how you are planning on using this information and filling in the lack of information with my own guesses, my knee jerk reaction is very negative.

I think having a knee-jerk reaction to vague, poorly specified
proposals is probably a good thing :)
I didn't want to dump a whole lot of detail on the list on my first
post, but I can put together some runnable demos if that would help.
I've already got the JS side of things specced out.
Would that help?

Or I can try to do a bit of draft-spec writing, but I don't
understand all the implications of changing operations that take
DOMStrings to accept objects and so would probably make a hash of it.

> FWIW, in Gecko currently, the stringification happens a few abstraction layers away from the parser, so implementing your suggestion would involve punching holes in those abstractions.

Ah, so there's a layer that sits between the XPCOM object and the JS
Host object that knows a DOMString is expected, and does the JS foo
necessary to convert to a string?

> --
> Henri Sivonen
> hsivonen at iki.fi
> http://hsivonen.iki.fi/
>
Received on Wednesday, 10 March 2010 09:45:13 UTC
