Re: Page Security Score proposal from Rachna Dhamija on 2007-06-19 (public-wsc-wg@w3.org from June 2007)

From: Rachna Dhamija <rachna.w3c@gmail.com>
Date: Tue, 19 Jun 2007 15:54:34 -0700
To: "Mary Ellen Zurko" <Mary_Ellen_Zurko@notesdev.ibm.com>
Cc: public-wsc-wg@w3.org
Message-ID: <20abbc510706191554x778794c5k1f744b1ad2e4ccc2@mail.gmail.com>
AFAIK there was no usability testing of SpoofGuard.  There are certainly
studies that explore how people interpret indicators that represent several
collapsed dimensions (e.g. in the information visualization literature).
If I dig up any that are relevant, I'll send them to the list.

Rachna

On 6/19/07, Mary Ellen Zurko <Mary_Ellen_Zurko@notesdev.ibm.com> wrote:
>
>
> Thanks. At a glance I didn't see anything about usability testing. Was
> there any?
>
>           Mez
>
>
>
>
> *"Rachna Dhamija" <rachna.public@gmail.com>*
> Sent by: public-wsc-wg-request@w3.org
>
> 06/18/2007 08:09 PM
> To: michael.mccormick@wellsfargo.com
> cc: public-wsc-wg@w3.org
> Subject: Re: Page Security Score proposal
>
>
>
>
>
>
>
> Related to this topic, we should be aware of SpoofGuard, an IE plugin
> that was developed a few years ago by Boneh and Mitchell's group at
> Stanford.
>
> It analyzes web pages and collapses several heuristics into one
> indicator (a green/yellow/red traffic light).  Users can set the
> weights for each heuristic and some threshold where a warning message
> is displayed.  The heuristics include page visit history, image cache
> history, the number of unencrypted password fields, whether the user
> arrived at the page by clicking on an email link, and some checks on
> the domain name and URLs in the page.
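
A minimal sketch of the kind of weighted aggregation described above. It is not SpoofGuard's actual code: the heuristic names, the weights, the 0-to-1 suspicion scale and both thresholds are assumptions chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Heuristic:
    name: str       # hypothetical label, not a SpoofGuard identifier
    weight: float   # user-adjustable importance (assumed non-negative)
    score: float    # assumed scale: 0.0 = benign, 1.0 = suspicious


def aggregate(heuristics):
    """Collapse several heuristics into one weighted-average risk in [0, 1]."""
    total_weight = sum(h.weight for h in heuristics)
    if total_weight == 0:
        return 0.0
    return sum(h.weight * h.score for h in heuristics) / total_weight


def traffic_light(risk, yellow_at=0.3, red_at=0.6):
    """Map the aggregate risk to a colour; thresholds are assumed, user-set values."""
    if risk >= red_at:
        return "red"
    if risk >= yellow_at:
        return "yellow"
    return "green"


checks = [
    Heuristic("unencrypted_password_fields", weight=3.0, score=1.0),
    Heuristic("arrived_via_email_link", weight=2.0, score=1.0),
    Heuristic("page_missing_from_history", weight=1.0, score=0.0),
]
print(traffic_light(aggregate(checks)))  # red
```

Changing a weight or a threshold changes the colour without any change in the page itself, which is the sensitivity discussed elsewhere in this thread.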
>
> http://crypto.stanford.edu/SpoofGuard/
>
> I'll add it to the shared bookmarks.
>
> Rachna
>
>
> On Jun 18, 2007, at 3:43 PM, <michael.mccormick@wellsfargo.com> wrote:
>
> Johnathan,
>
> There is admittedly some arbitrariness to the weights I used in my
> scoring formula, but I think if you play with it you'll start to see
> the aggregate scores move up and down in a more or less reasonable
> way, especially considering this was only a straw man formula designed
> to enable discussion.  (Without an actual example formula I was
> concerned this proposal would be too abstract for people to fully
> understand.)
>
> Your point about brittleness is well taken.  I agree the scoring
> formula will have to adapt occasionally to changing technologies as
> new security indicators become available, etc.
>
> I sympathize with your preference to keep all the SCIs separate rather
> than aggregating them to a single gauge.  I'm not proposing the
> individual secondary SCIs shouldn't be available to IT security savvy
> users like you & me.  But I don't accept the premise that ordinary
> users can make sense of them (much less sound risk decisions).
>
> A detailed "hi fi" SCI is obviously superior for those who have the
> training & expertise to use it.  But for everyone else a simple "lo
> fi" SCI is better than none at all.
>
> So the fundamental questions seem to be:
> - Are the many security & risk context indicators we've identified
> (PageSecurityInfo) usable in raw form by typical web users?
> - If not, should agents attempt to distill them down to something
> simple and intuitive -- even if it's low fidelity?
>
> My own answers would be No and Yes, respectively.
>
> Thanks for your thoughtful comments.  Mike
>
> From: Johnathan Nightingale [mailto:johnath@mozilla.com]
> Sent: Friday, June 15, 2007 10:40 AM
> To: Mary Ellen Zurko
> Cc: McCormick, Mike; public-wsc-wg@w3.org
> Subject: Re: Page Security Score proposal
>
> On 15-Jun-07, at 10:16 AM, Mary Ellen Zurko wrote:
>
> I believe we're likely to achieve consensus that there should be some
> primary SCI display (there are accessibility and device
> size/characteristics to be accounted for orthogonally, as well as the
> multicultural aspect raised by Bruno/ANEC; I assume those and do not
> explicitly address them here). To the extent there is a primary SCI
> display, it will have to have some sort of levels or gradations
> (on/off, 3 levels as in "what is a secure page", 4 levels as this
> proposal suggests, 99 levels/gradations as this proposal also
> suggests). No one seems to be proposing something with no levels as a
> primary SCI (that is currently relegated to secondary SCI in PageInfo,
> and rightly so in my opinion). We discussed the issue of medium/high
> risk situations that are pure display (no input) during one of the
> lightning discussions I led, and there seemed to be consensus that
> there would be pure display use cases of medium/high risk data, which
> also points towards consensus around a primary SCI display. Now would
> be the time for any participant to indicate that we did not have
> consensus on the need for recommendations around a primary display of
> SCI which reflects some level or gradation of security that is meant
> to be usable for trust decisions.
>
> So, as a meta point, it seems wrong to me to assume silence on the
> wire vis-a-vis email discussion of a proposal constitutes consensus.
> I don't think that's what you were doing here Mez, because as you
> mentioned, some of this has been discussed in lightning discussion
> (though I think not one I was around for - my bad) but I just wanted
> to throw it out there.  I know that when running a workgroup like
> this, and going through periods of frustrating silence, declaring
> consensus can be a good way of stirring people to action, but I would
> think that for Pass/Fail on individual recommendations, it might get
> us into a situation where people withdraw because things made it
> quietly into the recs that they don't believe in.  I think the
> approach you began on the last call, where we dive into a specific
> recommendation for more detail, and presumably where that culminates
> in people discussing whether it should be in the recs Yes or No, is a
> better way to go.  </meta>
>
> This is me indicating that we do not have consensus on the need for
> recommendations around a primary display of SCI which reflects some
> level or gradation of security that is meant to be usable for trust
> decisions.  :)
>
> During lightning discussions, I obviously didn't want to throw up a
> bunch of stop energy, but I think that trying to aggregate a
> multi-dimensional space of indicators into a single
> number/letter/colour is (with apologies to Yngve and Opera's
> multi-level padlock) the wrong way to go.  Or at the very least, I'm not
> yet convinced it's the right way - I'm not shutting the door.
>
> First of all, there's a fundamental arbitrariness to the numbers, as I
> see it.  I know Mike meant his proposal to be a launching off point,
> so I don't want to start a whole debate about the math, but there's
> something intrinsically confusing about the fact that "user has
> visited this page in the past" is worth the same number of points as
> "SSLv1" and that using a local HOSTS file + visiting this page in the
> past is worth the same as a non-AES/3DES cipher suite.  What's odd
> isn't the numbers, or even the equivalencies, it's that these
> comparisons are just category-errors-all-over-the-place.  Even if it
> were meaningful to compare choice-of-cipher-suite to history data
> point for point, I don't think we can have any real confidence that
> those ratios are fixed from user to user.
>
> I also don't know how users are supposed to make decisions based on
> this kind of thing.  Yes, the rec says that they should be able to see
> the equation, but if the idea is to have an at-a-glance indicator, how
> does it help them make better trust decisions?  Should they shop at a
> 65?  Should they only give their SSN to a 90?  The numbers are opaque
> and in some cases totally misleading.  A site you've been to before,
> using proper SSL, is basically fine, yet it will have different numbers
> for different people depending on things like whether they've
> bookmarked it.
>
> Furthermore, this kind of scoring system is brittle in the face of a
> changing security landscape.  Let's imagine we do this, and even that
> browsers standardize on its use.  If a hole is found in DNSSEC next
> year, or AES, we'll have to adjust the scores.  Maybe that's
> containable.  What if some new technology is introduced?  A broadly
> distributed social web-site-trustworthiness service, or for that
> matter Google's anti-phishing/anti-malware lists.  Do we include them?
> If so, we create the possibility of scores greater than 100.  If we
> tweak everything down to accommodate the new data instead, then we
> implicitly downgrade the existing numbers, causing user confusion
> about whether their bank has become less secure.
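
The arithmetic behind that dilemma can be sketched as follows. The 100-point ceiling comes from the discussion above; the 20-point value for the new indicator and the example site scores are hypothetical.

```python
OLD_MAX = 100              # ceiling implied by "scores greater than 100"
NEW_INDICATOR_POINTS = 20  # hypothetical value of a newly added indicator


def score_uncapped(old_score, has_new_indicator):
    """Option 1: just add the new points; the ceiling is no longer 100."""
    return old_score + (NEW_INDICATOR_POINTS if has_new_indicator else 0)


def score_renormalized(old_score, has_new_indicator):
    """Option 2: scale everything back so the ceiling stays at 100."""
    raw = score_uncapped(old_score, has_new_indicator)
    return raw * OLD_MAX / (OLD_MAX + NEW_INDICATOR_POINTS)


# A previously perfect site that also earns the new points exceeds 100.
print(score_uncapped(100, True))                 # 120

# A bank that scored 80 and lacks the new indicator drops after renormalizing,
# although nothing about the bank has changed.
print(round(score_renormalized(80, False), 1))   # 66.7
```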
>
> There are more objections I think I could offer, but this note is
> already getting long, and I don't want to be a critic without anything
> better to offer, so let me try to explain what I'd prefer.  We're
> talking about security *context* here, about a set of cues to help
> users situate themselves better online, and make better decisions as a
> result.  Stores in the real world don't have scores tacked to the
> outside very often, and even when they do (e.g. the health check
> green/yellow/red on restaurants in many cities) they aren't your only
> cue.  You make your decisions based on your own internal weighting of
> the various indicators you have to work with (how full is the
> restaurant, how clean does it look, have you been there before, etc).
> My argument would be that the web equivalent of that is to NOT combine
> multiple indicators, but rather to employ each individually as
> appropriate.  Let people do the thing they do exceptionally well,
> reasoning intuitively based on numerous inputs, and focus on getting
> those inputs to be as meaningful and atomic as possible.  Tell people
> which sites they're visiting with an identity indicator.  Use a
> ritualized chrome supported login process.  Introduce robustness
> countermeasures to prevent chrome spoofing.
>
> Aggregating the various signals into a number/symbol/letter/colour
> isn't creating context, it's lossy, it misses the opportunity to put
> that context out there for the user.  And if there are pieces of
> context information that we argue can't be put in front of users (e.g.
> algorithm selection for SSL) then my argument would be that they
> aren't enabling better trust decisions anyhow.
>
> Cheers,
>
> J
>
> ---
> Johnathan Nightingale
> Human Shield
> johnath@mozilla.com
>
>
>
>
Received on Wednesday, 20 June 2007 05:42:58 UTC