Re: E4H and constructing DOMs

[resending because I dropped public..., sorry]


2013/3/12 Tab Atkins Jr. <jackalmage@gmail.com>:
> On Tue, Mar 12, 2013 at 5:07 PM, Mike Samuel <mikesamuel@gmail.com> wrote:
>> 2013/3/12 Ian Hickson <ian@hixie.ch>:
>>> On Tue, 12 Mar 2013, Mike Samuel wrote:
>>>> I am merely proposing string templates which enable, among other things,
>>>> easy integration of contextually auto-escaped applications.
>>>
>>> Not compile-time checked ones, right?
>>
>> No.  EcmaScript isn't compiled, so I don't know where compile-time
>> checks would go.
>
> ES is, in fact, compiled in every major engine.  But the more correct
> term is "parse-time" anyway, I suppose - an early error that occurs as
> the code is being read, rather than waiting for it to execute before
> failing (or just silently corrupting).

Are you talking about a Chapter 16 early error, i.e. "An implementation
must treat any instance of the following kinds of errors as an early
error:"?
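
To make sure we mean the same thing by "early", here is a minimal sketch
of the distinction.  The classify helper is a hypothetical name of mine,
not anything from the spec or from E4H.  It relies only on the fact that
the Function constructor parses its body when it is called, so a syntax
error surfaces before any of the body runs, whereas a reference to an
undefined name fails only on execution.

```javascript
// Hypothetical helper: reports whether a piece of source fails when it
// is parsed (an early error) or only once it is executed.
function classify(source) {
  let fn;
  try {
    // Parsing happens here; nothing in `source` has run yet.
    fn = new Function(source);
  } catch (e) {
    return 'early ' + e.name;
  }
  try {
    // Execution happens here.
    fn();
  } catch (e) {
    return 'runtime ' + e.name;
  }
  return 'ok';
}

console.log(classify('var x = ;'));           // rejected while parsing
console.log(classify('undefinedName.foo;'));  // parses, fails when run
console.log(classify('var x = 1;'));          // parses and runs
```

The first call is the kind of failure Tab describes as happening "as the
code is being read"; the second is the kind that waits for execution.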

Lots of libraries would benefit from the ability to run code at module
startup to do consistency checks, but that feature alone does not seem
to justify standardizing early.  We don't yet have any data on how E4H
would be used, and we have not yet understood the guarantees it seeks
to provide or decided that they are both provided and worth providing.


>> HTML attribute quoting is a source of subtle XSS vulnerabilities, so
>> unquoted and backtick-quoted attributes are a corner case.
>
> Luckily, it need not have any such security concerns in E4H, as the
> parser can still tell that it's in an attribute-value context and
> escape appropriately.  Since it's creating a DOM rather than text,
> what quoting style you use is immediately lost anyway.

True, except that E4H can't do the kind of fixup that keeps innerHTML
bugs from causing XSS when a DOM tree is serialized and reparsed, e.g.
ensuring that any attribute value that contains a backtick also
contains a space character.  Setting that aside, nothing in this case
distinguishes E4H from a well-written string-producing library.
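
As a rough illustration of the fixup I mean, here is a sketch of what a
string-producing library could do when it emits an attribute value.  The
name escapeAttrValue is hypothetical, and the sketch assumes that a
trailing space is harmless for the attribute in question and that the
buggy serializer quotes values that contain a space; neither assumption
holds everywhere.

```javascript
// Hypothetical helper, not part of E4H or of any particular library:
// escape a value for use inside a double-quoted HTML attribute.
function escapeAttrValue(value) {
  let out = String(value)
    .replace(/&/g, '&amp;')   // must come first so entities are not re-escaped
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
  // Fixup: a value that contains a backtick but no space gets a trailing
  // space appended, so that (by assumption) it stays quoted if the DOM
  // is later serialized and reparsed.
  if (out.indexOf('`') >= 0 && out.indexOf(' ') < 0) {
    out += ' ';
  }
  return out;
}

console.log('<a title="' + escapeAttrValue('a`b') + '">x</a>');
```

The point is that this runs where the markup is produced as a string;
once E4H has built a DOM node, there is no equivalent place to apply it.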



>> If E4H is advertised as embedded HTML, but <><isindex></> compiles and
>> behaves markedly differently than the equivalent HTML fragment, then
>> doesn't it fail your understandability requirement?
>
> Ian's point is that <isindex> itself fails the understandability
> requirement - it's completely insane and not used by any modern pages.
>  Getting some bizarro unexpected result out of it (because nobody
> actually understands what it expands into) just to match the HTML
> parser is potentially worse than doing the obvious thing which happens
> to be different than the HTML parser.

Again, Ian held this up as a case that distinguished E4H from
string-producing libraries.  If, as we both agree, template authors
don't write <isindex>, then who cares -- the corner cases on which we
should focus are those which template authors actually produce.

(This assumes that our templates are written by naive but trustworthy
authors, which is an explicit part of my attack scenario, but Ian may
not assume that -- we just won't know until Ian states his
assumptions.)

>> I'm not just engaging in armchair philosophy about something that I've
>> specced and have yet to deploy.  I have real-world experience with
>> converting large projects to use this that proves you wrong.
>
> Ian provided several examples of code where it seems like it would be
> impossible to auto-escape properly, and an author relying on
> auto-escaping because they've been trained that it works elsewhere
> could be easily misled and inadvertently cause an XSS vulnerability.
> Could you go over those and answer how you think your ideas for
> auto-escaping would address the problems he raised?

Will do in a separate thread since it's off-topic here.

Received on Thursday, 14 March 2013 20:30:35 UTC