Re: belated remarks on PII Bar from Ian Fette on 2007-09-18 (public-wsc-wg@w3.org from September 2007)

From: Ian Fette <ifette@google.com>
Date: Mon, 17 Sep 2007 18:13:09 -0700
To: "Close, Tyler J." <tyler.close@hp.com>
Cc: public-wsc-wg@w3.org
Message-ID: <bbeaa26f0709171813v5d1b144aw93d127364747e1b9@mail.gmail.com>

I don't have comments on all of this, but I did want to comment on a few
parts. My reply is inline.

On 9/17/07, Close, Tyler J. <tyler.close@hp.com> wrote:
>
> Thomas Roessler wrote:
> > you asked toward the end of the last call what my issue was with the
> > current text on the PII Bar proposal.  It took me a while to figure
> > that out, but actually, I think it's the amount of detail with which
> > the implementation of a form filler is described.  That blurs the
> > additions to form-filling very heavily.
>
> Perhaps a more layered description would be better. I think all the
> details do eventually need to be presented, since I think all the detail
> I included has security implications. These details will also be needed
> for lo-fi prototyping.


I think we also need to be careful not to over-dictate here. You consider
Petname and PII bar to be improvements to how form-filling works in
browsers. It may be the case that next year, you or someone else comes up
with something better, and it would be a shame if we dictated things so
precisely that to adopt this new "better" method would cause a browser to
become non-compliant. Personally, I much prefer a recommendation that
specifies what needs to be achieved, but leaves the particular method to the
browser implementers.


> > I *think* the key points where the proposal deviates most from
> > current practice are these:
> >
> > * form filling is partially tied to a particular "origin", so there
> >   is no autocompletion if site A happens to ask for your SSN the
> >   same way site B does that.  There is, however, the possibility to
> >   select a string interactively that hasn't been shown to site A,
> >   but might previously only have been shown to site B.  This
> >   selection would have a user interaction different from the
> >   autocompletion one.
> >


It's not entirely clear to me that this is desirable. For example, imagine I'm on
nwa.com booking an airplane ticket from SFO to DTW on 11/20/07 returning
11/24/07. I now want to check Continental, Delta, and perhaps Orbitz to see
if any of the other airlines have a better price. I really don't care that I
haven't explicitly typed that combination of travel arrangements into all of
the sites - if I had a form filler that was capable of matching up the
fields on these sites and inputting the information, I would be elated. For
me, having to click through a list of information submitted at other sites
is absolutely worthless - it's much faster for me to just tab between fields
and re-type it rather than scrolling through lists, potentially using the
mouse, entering petnames, whatever.
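
To make the two behaviours concrete, here is a minimal sketch of the difference between origin-bound autocompletion with a separate pick list (as quoted above) and the cross-site filling I'm describing. Everything in it is an assumption for illustration: the `FormHistory` class, its method names, and the idea that fields on different sites can be matched by a shared field name are hypothetical, not anything taken from the PII bar text.

```python
class FormHistory:
    """Hypothetical in-memory record of submitted form values."""

    def __init__(self):
        # (origin, field, value) tuples in submission order
        self._log = []

    def record(self, origin, field, value):
        self._log.append((origin, field, value))

    @staticmethod
    def _unique(values):
        # Drop duplicates while keeping first-seen order
        return list(dict.fromkeys(values))

    def autocomplete(self, origin, field):
        """Origin-bound: only values this origin has already been shown."""
        return self._unique(
            v for o, f, v in self._log if o == origin and f == field
        )

    def pick_list(self, origin, field):
        """Values shown only to other origins; needs an explicit user choice."""
        shown = set(self.autocomplete(origin, field))
        return self._unique(
            v for o, f, v in self._log
            if f == field and o != origin and v not in shown
        )

    def latest_anywhere(self, field):
        """Cross-site fill: most recent value for the field from any origin."""
        for _origin, f, v in reversed(self._log):
            if f == field:
                return v
        return None


history = FormHistory()
history.record("nwa.com", "departure_airport", "SFO")
history.record("nwa.com", "arrival_airport", "DTW")
```

With only the nwa.com entries recorded, `history.autocomplete("continental.com", "departure_airport")` offers nothing, `pick_list` makes me choose SFO from a list, and `latest_anywhere` is the one that would simply fill it in.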

> > * The "origin" includes whether a site is secure or not; more
> >   precisely, it is currently suggested to have no form filling at
> >   all if a page is served through plain HTTP.  For HTTPS, the
> >   "origin" is expressed in terms of the X.509 certificate used.
>
> This distinction also provides a different interaction for submitting
> information securely versus not securely. With the proposed division we
> have the simple rule that information submitted via the PII bar was
> submitted securely, but information entered into the web page may not
> have been. We have a number of use cases around making this distinction
> clearer for users.


Is this going to break things? For example, you have JavaScript on a page doing
validation as a user types. Now, if they're typing into the PII bar, they do
their thing there, hit "button", it gets automagically pasted into the
field, and now there's some problem at some part of the input that some JS
validator function catches. Oops. Now it's not just a matter of re-typing it
into the field; the user actually has to somehow re-invoke the PII bar and edit
stored data or something?
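
For reference, the origin rule quoted above (no form filling over plain HTTP; for HTTPS, an origin expressed in terms of the X.509 certificate) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `fill_origin` is hypothetical, and using a certificate fingerprint string as the key is my stand-in, since the quoted text does not say which part of the certificate is used.

```python
from urllib.parse import urlsplit


def fill_origin(url, cert_fingerprint=None):
    """Return the key stored values are bound to, or None if filling is off.

    cert_fingerprint is an assumed stand-in for "the X.509 certificate used".
    """
    scheme = urlsplit(url).scheme.lower()
    if scheme != "https":
        # Plain HTTP: the proposal suggests no form filling at all
        return None
    if not cert_fingerprint:
        # Assumption: without certificate details, treat the page as unfillable
        return None
    return ("x509", cert_fingerprint.lower())
```

Under this sketch, `fill_origin("http://www.nwa.com/")` returns `None`, so nothing stored would ever be offered on that page.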

>
> > * if a user is at a site that they haven't been to before, and if
> >   the user behaves as if they want to use the form-filler, then
> >   that's used as a hook to ask the user to (a) give that site a
> >   nick-name, and (b) give the user a way to navigate using their
> >   browsing history.
>
> I think the way the browser walks the user through this workflow is also
> important, providing reasonable options and explaining how to choose
> between them. Full screen messaging, like Serge has had success with,
> should also be used here.
>
> > * for visual user agents, the data entry widget is next to a
> >   passive, identity signal like indicator, to move the user's locus
> >   of attention to this signal.
>
> Unlike all other identity signals, the petname tool is robust against
> homograph and similar attacks.


I think it's a rather strong claim to say that "all" other identity signals
suffer from homograph attacks... also, we've yet to see how this works in
practice in a large-scale deployment.

> > * there must be explicit user approval of kinds before information
> >   is filled into form fields and transmitted.
>
> I think this explicit user approval, and other aspects of the
> presentation, also help create a distinction of user agent versus web
> site that browsers have thus far failed to create. Current form fillers
> make it look like the web site already knows the filled field values.
> For example, even during our telecon, I noticed cases where people were
> talking about current form filler use cases that in fact, aren't the
> form filler at all, but functionality provided by the web site.
> Currently, even experts can't tell when they're using their browser's
> form filler versus functionality built into the web page.


Does it really matter from the user's perspective? I would say probably not,
but you want it to. You seem to want to make users aware of the fact that
"badsite.com" doesn't already know your social security number. This is
probably a good goal. On the other hand, there are many cases where it
probably doesn't matter, e.g. if I decide that I want to (for some unknown
reason) use finance.yahoo.com, as soon as I type a G into the ticker symbol,
I'd rather like it if "OOG" were appended given that "GOOG" is the ticker
symbol I look at most often. For this data, I really don't care that Yahoo
hasn't previously seen me looking at Google's stock price.

> > * there are "display names" for passwords.  Passwords and user names
> >   seem to be handled independently of each other.
> >
> > * there is an idea in here on handling TLS errors, and generating
> >   e-mail to abuse addresses from WHOIS records. I believe that would
> >   belong elsewhere in the document.
> >
> > * there is a remark about the minimum features to expose to users in
> >   terms of editing form strings.
> >
> > Some comments / observations:
> >
> > * There seems to be a critical assumption in here that people will
> >   be *so* habituated that they will press the "attention key" to
> >   trigger the rest of the interaction, even though they need to deal
> >   with each form field separately.  Is that really warranted?
>
> Testing will settle this question. I'll only note that I'm not just
> banking on habituation, but also laziness. It's easier to click on your
> password than remember it and type it in correctly.


If I'm a typical user, I have a weak password that is easy to type, and I
probably share it amongst all my logins. For me, given that when I go to
many websites they automatically set focus on the login field, it's much
faster for me to just type my username, hit tab, and type my password, hit
enter (without moving my hands from the keyboard, this is a matter of just a
very small number of seconds), vs. having to potentially hit some strange
activation sequence that could involve me having to move my hands (or god
forbid use the mouse). I'm also very habituated to just entering my password
manually.

> >   Also, does the separation of form fields get into the way of
> >   "habituation"?  I'd suspect that a very simple interaction in
> >   which I succeed with, ideally, one or two key presses will be
> >   easier to learn (and lead to a stronger irritant) than an
> >   interaction in which I'm navigating all over the side, and
> >   constantly have my locus of attention changed.
>
> If a web site is asking the user to reenter information that the user
> has already given it, something special might be happening. For example,
> the site might be asking for permission to use the information in a new
> way. I think we need to think hard about subverting this interaction by
> automating the user's response.


Or it might just be asking me what airport I want to fly out of. Usually I
fly out of SJC, but I'm often asked to re-enter this. I'm quite happy to
have my form-filler pre-populate SJC and let me change it if I decide I want
to instead fly out of SFO. There's really nothing "special" happening here,
and I really do want my response to be automated. Same thing with my
Worldperks number. You can argue that it's personally identifiable
information (and it is), but when I go to http://www.nwa.com I really like
the fact that my form-filler fills it in.

> I think the perverse scenario you imagine in the above paragraph could
> turn out to be rather rare.
>
> Also, repetition is what creates habituation.
>
> > * "Explicit user approval before filling" -- filling the forms
> >   includes generating DOM events.
>
> Not sure what you're asking here.
>
> > * Since HTML forms have a mechanism (type="password") to indicate
> >   that some content shouldn't be displayed on the screen, we'd
> >   probably want to mention that as well.
>
> I'm really nervous about doing any introspection on the document. I'm
> worried the phisher will be able to construct a document that makes it
> look like the PII bar is malfunctioning, and so convince the user to act
> against the advice of the PII bar. For example, for the case you
> suggest, the phisher could make a text field that displays "*"
> characters, using Javascript, and so seems to be a password field, but
> is not treated as one by the PII bar.
>
> Another principle carefully adhered to by the PII bar is to be very
> careful about knowing the source of information you're acting on, or
> displaying to the user. Changing how the tool behaves based on
> information provided by the phisher is dangerous. Places where such a
> dependency exists must be carefully examined, and avoided if possible.
>
> > * On the WHOIS-related part, don't do it.  WHOIS is a complete and
> >   utter mess.  (And you don't want me to even start telling you the
> >   long story about that.)
>
> Why does that matter? All we're doing is sending up a flare to say
> something is wrong. No harm done if no one sees the flare, we're no
> worse off.


1. Registrars will start banning your clients if you start throwing up
requests all over the place.

2. Technically, you'd be violating their terms of service. You might not
care about this, but a corporation that is shipping a browser (Mozilla,
Opera, etc.) has a very strong disincentive to ship a product which violates
terms of service. If you do a whois on ianfette.com you will get the
following terms of use from VeriSign:

TERMS OF USE: You are not authorized to access or query our Whois
database through the use of electronic processes that are high-volume and
automated except as reasonably necessary to register domain names or
modify existing registrations; blah blah blah

3. It's a mess. There are millions of registrars for .com (well, a lot) and
each has their own data format. (You can get general information for most
.com and .net from the servers VeriSign operates, but for the detailed
information you get a pointer to the specific registrar, you query that
registrar, and you essentially get back a free-text blob that can be
whatever format the registrar feels like giving you.) International whois
servers are even more fun (e.g. you have to pass magic parameters to DENIC
to get a whois to work, you get back a different set of information, and
some countries even use different language abbreviations in fields like
month of registration).

When doing a whois on a random domain, you basically get back a
random blob of free text. Writing a parser for these is a nightmare. I've
done it. It sucked, and it still didn't give me perfect data in the end.
Totally not worth it. Trust TLR here.
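
As a rough illustration of the parsing problem, here is a small sketch that tries to pull a contact e-mail address out of free-text whois output. The sample responses are invented for this example and are not real registrar output, and the label list and the `find_contact_email` helper are hypothetical. Real responses vary far more than these three do.

```python
import re

# Invented sample responses; the labels and layout are assumptions.
SAMPLE_A = """\
Domain Name: EXAMPLE.COM
Creation Date: 17-sep-2007
Registrant Email: abuse@example.com
"""

SAMPLE_B = """\
domain:      example.de
changed:     2007-09-17T18:13:09+02:00
e-mail:      hostmaster@example.de
"""

SAMPLE_C = """\
Kontakt <hostmaster at example.org>
"""

# Every new registrar format tends to need another entry here.
EMAIL_LABELS = ("registrant email", "e-mail", "email")


def find_contact_email(blob):
    """Return the first value under a known e-mail label, or None."""
    for line in blob.splitlines():
        label, sep, value = line.partition(":")
        if not sep or label.strip().lower() not in EMAIL_LABELS:
            continue
        value = value.strip()
        # Very loose shape check; not a full address validator
        if re.fullmatch(r"[^@\s]+@[^@\s]+", value):
            return value
    return None
```

The first two samples work only because their labels happen to be in `EMAIL_LABELS`; the third has no "label: value" structure at all, so the helper returns `None` and there is nowhere to send the flare.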

> > * On the minimum features to edit the local history, I wonder how
> >   much of this can be safely left to implementers.  I suspect most.
>
> I'm skeptical. To date, browser implementers haven't shown good
> judgment, or we wouldn't be here now. At the very least, I think
> important distinctions and pitfalls must be documented.


Again, raising the over-specification flag.

> > Alternative proposal, for the sake of argument:
> >
> > * We focus on login interactions only.
>
> I'm not interested in pursuing such a proposal. I'll be putting my
> effort into prototyping a solution that also protects credit card
> numbers and other security-sensitive identifiers. Perhaps we'll revisit
> this issue if early testing shows this goal to be less achievable than I
> hope it is.


Well, some people are interested in a form filler that is less obtrusive.
Especially given that you're trying to dictate that yours should be the only
form filler in a browser (your text bans other competing form fillers,
unclear if this affects users of toolbars), I think it's at least worth
considering alternatives and trying to do a cost-benefit analysis.

> Thanks for the feedback.
>
> --Tyler
>
>
Ian
Received on Tuesday, 18 September 2007 01:13:31 UTC