RE: Page Security Score proposal

Johnathan,
 
There is admittedly some arbitrariness to the weights I used in my
scoring formula, but I think if you play with it you'll start to see the
aggregate scores move up and down in a more or less reasonable way,
especially considering this was only a straw man formula designed to
enable discussion.  (Without an actual example formula I was concerned
this proposal would be too abstract for people to fully understand.)
 
Your point about brittleness is well taken.  I agree the scoring formula
will have to adapt occasionally to changing technologies as new security
indicators become available, etc.
 
I sympathize with your preference to keep all the SCIs separate rather
than aggregating them to a single gauge.  I'm not proposing the
individual secondary SCIs shouldn't be available to IT security savvy
users like you & me.  But I don't accept the premise that ordinary users
can make sense of them (much less sound risk decisions).
 
A detailed "hi fi" SCI is obviously superior for those who have the
training & expertise to use it.  But for everyone else a simple "lo fi"
SCI is better than none at all.
 
So the fundamental questions seem to be:
 - Are the many security & risk context indicators we've identified
(PageSecurityInfo) usable in raw form by typical web users?
 - If not, should agents attempt to distill them down to something
simple and intuitive -- even if it's low fidelity?
 
My own answers would be No and Yes, respectively.
 
Thanks for your thoughtful comments.  Mike

  _____  

From: Johnathan Nightingale [mailto:johnath@mozilla.com] 
Sent: Friday, June 15, 2007 10:40 AM
To: Mary Ellen Zurko
Cc: McCormick, Mike; public-wsc-wg@w3.org
Subject: Re: Page Security Score proposal


On 15-Jun-07, at 10:16 AM, Mary Ellen Zurko wrote:


	I believe we're likely to achieve consensus that there should be
some primary SCI display (there are accessibility and device
size/characteristics to be accounted for orthogonally, as well as the
multicultural aspect raised by Bruno/ANEC; I assume those and do not
explicitly address them here). To the extent there is a primary SCI
display, it will have to have some sort of levels or gradations (on/off,
3 levels as in "what is a secure page", 4 levels as this proposal
suggests, 99 levels/gradations as this proposal also suggests). No one
seems to be proposing something with no levels as a primary SCI (that is
currently relegated to secondary SCI in PageInfo, and rightly so in my
opinion). We discussed the issue of medium/high risk situations that are
pure display (no input) during one of the lightning discussions I led,
and there seemed to be consensus that there would be pure display use
cases of medium/high risk data, which also points towards consensus
around a primary SCI display. Now would be the time for any participant
to indicate that we did not have consensus on the need for
recommendations around a primary display of SCI which reflects some
level or gradation of security that is meant to be usable for trust
decisions. 


So, as a meta point, it seems wrong to me to assume silence on the wire
vis-a-vis email discussion of a proposal constitutes consensus.  I don't
think that's what you were doing here Mez, because as you mentioned,
some of this has been discussed in lightning discussion (though I think
not one I was around for - my bad) but I just wanted to throw it out
there.  I know that when running a workgroup like this, and going
through periods of frustrating silence, declaring consensus can be a
good way of stirring people to action, but I would think that for
Pass/Fail on individual recommendations, it might get us into a
situation where people withdraw because things made it quietly into the
recs that they don't believe in.  I think the approach you began on the
last call, where we dive into a specific recommendation for more detail,
and presumably where that culminates in people discussing whether it
should be in the recs Yes or No, is a better way to go.  </meta>

This is me indicating that we do not have consensus on the need for
recommendations around a primary display of SCI which reflects some
level or gradation of security that is meant to be usable for trust
decisions.  :)

During lightning discussions, I obviously didn't want to throw up a
bunch of stop energy, but I think that trying to aggregate a
multi-dimensional space of indicators into a single number/letter/colour
is (with apologies to Yngve and Opera's multi-level padlock) the wrong
way to go.  Or at the very least, I'm not yet convinced it's the right
way - I'm not shutting the door.

First of all, there's a fundamental arbitrariness to the numbers, as I
see it.  I know Mike meant his proposal to be a launching off point, so
I don't want to start a whole debate about the math, but there's
something intrinsically confusing about the fact that "user has visited
this page in the past" is worth the same number of points as "SSLv1" and
that using a local HOSTS file + visiting this page in the past is worth
the same as a non-AES/3DES cipher suite.  What's odd isn't the numbers,
or even the equivalencies, it's that these comparisons are just
category-errors-all-over-the-place.  Even if it were meaningful to
compare choice-of-cipher-suite to history data point for point, I don't
think we can have any real confidence that those ratios are fixed from
user to user.

I also don't know how users are supposed to make decisions based on this
kind of thing.  Yes, the rec says that they should be able to see the
equation, but if the idea is to have an at a glance indicator, how does
it help them make better trust decisions?  Should they shop at a 65?
Should they only give their SSN to a 90?  The numbers are opaque and in
some cases totally misleading.  A site you've been to before, using
proper SSL, is basically fine, yet it will have different numbers for
different people depending on things like whether they've bookmarked it.
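
As a rough illustration of that last point, here is a minimal sketch. The
indicator names and weights are invented for the example; they are not the
ones in Mike's proposal.

```python
# Hypothetical weights, made up for illustration only.
WEIGHTS = {
    "valid_ssl": 40,
    "visited_before": 10,
    "bookmarked": 10,
}

def page_score(indicators):
    """Sum the weights of the indicators present for this user."""
    return sum(WEIGHTS[name] for name in indicators)

# Same site, same server configuration, two different users.
alice = page_score({"valid_ssl", "visited_before", "bookmarked"})
bob = page_score({"valid_ssl", "visited_before"})

print("alice:", alice, "bob:", bob)
```

Nothing about the site differs between the two calls, yet the two users
see different scores because one of them happens to have a bookmark.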

Furthermore, this kind of scoring system is brittle in the face of a
changing security landscape.  Let's imagine we do this, and even that
browsers standardize on its use.  If a hole is found in DNSSEC next
year, or AES, we'll have to adjust the scores.  Maybe that's
containable.  What if some new technology is introduced?  A broadly
distributed social web-site-trustworthiness service, or for that matter
Google's anti-phishing/anti-malware lists.  Do we include them?  If so,
we create the possibility of scores greater than 100.  If we tweak
everything down to accommodate the new data instead, then we implicitly
downgrade the existing numbers, causing user confusion about whether
their bank has become less secure.  
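
To make the arithmetic concrete, here is a small sketch of both options.
All indicator names and weights are hypothetical, chosen only so that the
original maximum is 100; the new indicator stands in for something like a
reputation list.

```python
# Hypothetical indicators and weights summing to 100 (illustration only).
weights = {"ssl": 40, "dnssec": 30, "history": 30}

# Indicators the bank's page satisfies; these never change below.
bank = {"ssl", "history"}

def score(weights, present):
    """Add up the weights of the indicators that are present."""
    return sum(w for name, w in weights.items() if name in present)

before = score(weights, bank)

# Option 1: bolt a new 20-point indicator onto the existing scale.
added = dict(weights, reputation_list=20)
max_added = sum(added.values())  # the scale now tops out above 100

# Option 2: shrink every weight so the maximum is 100 again.
scale = 100 / max_added
rescaled = {name: w * scale for name, w in added.items()}
after = score(rescaled, bank)

print("before:", before)
print("new maximum if we just add:", max_added)
print("bank after rescaling:", round(after, 1))
```

With these made-up numbers the bank goes from 70 to roughly 58 after
rescaling, even though nothing about the bank has changed.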

There are more objections I think I could offer, but this note is
already getting long, and I don't want to be a critic without anything
better to offer, so let me try to explain what I'd prefer.  We're
talking about security *context* here, about a set of cues to help users
situate themselves better online, and make better decisions as a result.
Stores in the real world don't have scores tacked to the outside very
often, and even when they do (e.g. the health check green/yellow/red on
restaurants in many cities) they aren't your only cue.  You make your
decisions based on your own internal weighting of the various indicators
you have to work with (how full is the restaurant, how clean does it
look, have you been there before, etc).  My argument would be that the
web equivalent of that is to NOT combine multiple indicators, but rather
to employ each individually as appropriate.  Let people do the thing
they do exceptionally well, reasoning intuitively based on numerous
inputs, and focus on getting those inputs to be as meaningful and atomic
as possible.  Tell people which sites they're visiting with an identity
indicator.  Use a ritualized, chrome-supported login process.  Introduce
robustness countermeasures to prevent chrome spoofing.  

Aggregating the various signals into a number/symbol/letter/colour isn't
creating context; it's lossy, and it misses the opportunity to put that
context out there for the user.  And if there are pieces of context
information that we argue can't be put in front of users (e.g. algorithm
selection for SSL) then my argument would be that they aren't enabling
better trust decisions anyhow.

Cheers,

J

---
Johnathan Nightingale
Human Shield
johnath@mozilla.com

Received on Monday, 18 June 2007 22:44:25 UTC