[whatwg] Autocomplete and autofill features and feedback thereon from Ian Hickson on 2012-11-21 (public-whatwg-archive@w3.org from November 2012)

From: Ian Hickson <ian@hixie.ch>
Date: Wed, 21 Nov 2012 01:17:10 +0000 (UTC)
To: whatwg <whatwg@whatwg.org>
Message-ID: <Pine.LNX.4.64.1211202332400.26354@ps20323.dreamhostps.com>
On Tue, 16 Oct 2012, Ilya Sherman wrote:
> On Thu, Aug 2, 2012 at 11:42 AM, Ian Hickson <ian@hixie.ch> wrote:
> > On Mon, 23 Jul 2012, Ian Hickson wrote:
> > >
> > > So we could define the autocomplete="" field's value as follows: 
> > > [...]
> >
> > I've now specced this, with some minor changes.
> 
> My only high-level question is: Why did you choose to drop the proposed 
> aliases like "city" for "locality" and "province" for "region"?

The short answer is that I wanted to avoid redundant terms because aliases 
tend to cause all kinds of issues (people set both because they don't know 
which to use, or they set both thinking they're separate and they end up 
conflicting, or they start long threads on mailing lists asking which one 
is the most appropriate one...).

I did do quite some research to figure out which term to use. In the end, 
mostly on Tantek's encouragement, I ended up picking the terms that are 
the most inclusive and happen to be the terms used by vocabularies such as 
those based on vCard, like hCard/adr.


> While "locality" and "region" are probably the most technically correct 
> terms -- they're certainly the best that I found while researching -- 
> they're not terms that I'd expect most web developers to be familiar 
> with.

That's true, certainly, but I think consistency with things like hCard 
will somewhat mitigate that problem.


> I think including the proposed aliases allows for a more "natural" way 
> to express many sites' forms; and I think that more natural/readable 
> source HTML code is a Good Thing™.

I understand the attraction of having redundant terms, but the cost is 
pretty high (as described in the parenthetical above). I'm not sure it's 
worth it.

For "province" and "region", I think they're a wash. In the US, "state" is 
the familiar term; in the UK, "county"; in Switzerland, "canton"... Each 
place has its own. I don't think it really matters what term we use, 
but supporting all of them would be highly confusing.

As for "city", one problem specific to this value is that it encourages 
authors to forget that many of their users aren't in cities at all. I wish 
there was a better term than "locality", though, I agree.


> Otherwise, a bunch of minor typos and the like, all related to the parsing
> algorithm and subsequent sections:
> * In step 13.3, "hint set" should be "hint tokens".
> * It seems like step 13.6 should precede step 13.5.
> * In step 14.3, "hint set" should be "hint tokens".
> * In step 14.3, "contact" should be "mode".
> * It seems like step 14.6 should precede step 14.5.
> * In the paragraph beginning with "Suppose a user agent knows of two phone
> numbers", there is a typo: "pefilled" -> "prefilled".

Fixed.


> * In step 14.4, I think "either is" is more natural than "either be".

Fixed to "will either be", which was the intent (I deleted too many words 
when copy/pasting the previous block).


> * Step 18 is the last mention of the "scope tokens" data in the parsing
> algorithm, as well as in the subsequent commentary.  What is the intended
> function of the scope tokens -- should they be combined with the hint set,
> or is there a separate notion of scope that should be invoked by the UA
> when parsing this attribute?

Wow, I totally dropped the scope thing on the floor. Oops. Fixed.


> * For terms like "autofill hint set", should the spec use "autocomplete" 
> rather than "autofill", or is there an intentional distinction being 
> made here?

"Autocomplete" is the feature wherein the user types half of some piece of 
data and the user agent provides the rest. "Autofill" is the feature 
wherein the user selects from previously-provided data and the user agent 
provides all the data based on that selection.

The fact that the HTML spec uses the "autocomplete" attribute to control 
the autofill feature is a historical accident. It's similar to how the 
feature for providing alternative style sheets is rel="alternate", even 
though the word "alternate" is a non-sequitur in this context.


> > > So instead of <input type=tel autocomplete="work tel"> you would 
> > > just say <input type=tel autocomplete=work> (and would not be able 
> > > to say <input type=text autocomplete="work tel">, which would be an 
> > > inferior user experience when tel is given special behavior, or 
> > > <input type=email autocomplete="work tel">, which would be 
> > > inconsistent).
> >
> > I'm a little wary about adding more magic here, these attributes are 
> > already pretty complicated. See the autocomplete section's algorithms 
> > and let me know if you still think we should do something along those 
> > lines. If it's something people are willing to implement, I wouldn't 
> > want to stand in the way; I agree that it has some good side-effects 
> > (like making it impossible to have certain combinations).
> >
> > I could also introduce some conformance requirements to make the bogus 
> > combinations non-conforming; currently I haven't made type=tel 
> > autocomplete=email non-conforming for instance.
> 
> Since the autocomplete type hints are just hints, I think it's ok to 
> leave this behavior undefined

Ok. Currently the spec says that autocomplete="work" should be treated the 
same as autocomplete="on", which is basically "do magic" (i.e. undefined, 
as you say).
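
To make the fallback concrete, here is a rough sketch of the idea, not the 
spec's actual parsing algorithm. The function name, the returned object 
shape and the short FIELD_NAMES list are all made up for illustration; the 
only point is that a value with no recognised field name as its last token 
(such as "work" on its own) ends up being handled like "on".

```javascript
// Hypothetical, heavily simplified sketch; NOT the spec's algorithm.
// FIELD_NAMES is an illustrative subset picked for this example.
const FIELD_NAMES = new Set([
  "name", "email", "tel", "street-address",
  "locality", "region", "postal-code", "country",
]);

function interpretAutocomplete(value) {
  // Split on whitespace, ignoring case and empty tokens.
  const tokens = value.trim().toLowerCase().split(/\s+/).filter(Boolean);

  if (tokens.length === 0) return { mode: "on" };
  if (tokens.length === 1 && tokens[0] === "off") return { mode: "off" };

  // The field name is expected to be the last token.
  const field = tokens[tokens.length - 1];
  if (!FIELD_NAMES.has(field)) {
    // No usable field name (e.g. just "work"): behave like "on".
    return { mode: "on" };
  }

  // Everything before the field name is kept as unvalidated hints.
  return { mode: "field", field, hints: tokens.slice(0, -1) };
}

console.log(interpretAutocomplete("work tel")); // field "tel", hints ["work"]
console.log(interpretAutocomplete("work"));     // mode "on"
```

A real implementation would also validate the hint tokens and their order, 
which this sketch deliberately skips.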


> but I also don't see any problem with making such mismatches 
> non-conforming, other than that makes the spec even longer/more verbose.

That's not a completely trivial problem. :-)


> > I haven't added this [government ID numbers].
> >
> > I also haven't added:
> >  - payment instrument type
> >  - payment instrument start date
> >  - payment instrument issue number (for Maestro)
> >
> > I also haven't removed, as some people suggested, the three cc-name 
> > subfields.
> >
> > I'm open to making all these changes, but figured I would get some 
> > more input on them first, in particular from Ilya who did the research 
> > to come up with the original set of fields.
> 
> I have seen a relatively high number of Chrome bug reports requesting 
> better handling of (e.g. government) ID numbers.  One example: [ 
> http://code.google.com/p/chromium/issues/detail?id=64433 ].

The better handling requested there is just for Chrome to not do any 
special handling. So no need to add anything to the spec for that.


> I think it would be helpful to add these to the spec; though as 
> subsequent posters have noted, there's a lot of potential complexity in 
> how these should be represented.  This might fall under the broader 
> class of "identity"-related fields, which I think merit their own 
> carefully thought out set of tokens. There was some work done on the 
> beginnings of such a specification -- see 
> https://wiki.mozilla.org/Identity-inputs -- but my current understanding 
> is that this is an area in need of further development.

I'm happy to add more things like this to the spec, but I don't know what 
to add exactly. If there is a concrete description of what fields I should 
add here, I'd be happy to do so.


> The payment instrument type is almost certainly appropriate to add -- it 
> is included on almost every website that I've encountered that includes 
> payment card fields.  It was an oversight on my part to omit it from the 
> initial proposal.

It's redundant data, the credit card number itself says what type it is.

More importantly, I don't know how to store the information. What values 
should we be expecting here? If a site has radio buttons "v", "m" and "a", 
and another has a <select> with "4", "5", and "3", and yet another has 
three buttons which set a type=hidden with the values "visa", "mastercard" 
and "amex", how is the user agent to figure out what's going on? This 
makes the magic needed around dates look positively easy.
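
As an illustration of the redundancy point, a minimal sketch of deriving 
the type from the number's leading digits might look like the following. 
The function name and the returned labels are made up for this example, 
and the prefix list is deliberately incomplete (only the three networks 
mentioned above); it is not a complete issuer table.

```javascript
// Hypothetical sketch: guess the card network from well-known prefixes.
// Covers only Visa (4), American Express (34, 37) and MasterCard (51-55).
function cardNetworkFromNumber(cardNumber) {
  // Drop the spaces and hyphens people commonly type.
  const digits = String(cardNumber).replace(/[\s-]/g, "");
  if (!/^\d+$/.test(digits)) return "unknown";

  if (/^4/.test(digits)) return "visa";
  if (/^3[47]/.test(digits)) return "amex";
  if (/^5[1-5]/.test(digits)) return "mastercard";
  return "unknown";
}

console.log(cardNetworkFromNumber("4111 1111 1111 1111")); // "visa"
console.log(cardNetworkFromNumber("3782-822463-10005"));   // "amex"
```

Note that this only addresses the first point. It does nothing for the 
second: the user agent would still have to guess whether a given site 
spells that result "v", "4" or "visa".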


> The other two payment instrument field types I haven't encountered on the
> Web, as far as I can recall.  So, based on my data set accumulated while
> working on Chrome Autofill, I'm ok with leaving these out of the spec for
> now.  However, my experience is biased toward US websites; it's possible
> that these fields are more prominent internationally.

I did some more research and it seems start date and issue number are 
specific to the old Switch and Solo cards (for a while branded Maestro, 
though they weren't technically Maestro cards as the rest of the world 
knows them). These seem to be obsolete now.


> The three cc-name subfields are split out surprisingly often on existing 
> websites.  I was initially opposed to including these in the spec; but 
> the data in support of them was overwhelming.

I've left these.


> Finally, I have gotten a request to include a field type for bank 
> account numbers, though I have only seen this info actually requested on 
> a small handful of extremely prominent (and generally trusted) websites: 
> Amazon, PayPal, and I think Google Wallet.

Is there any reason we shouldn't treat bank accounts as just another 
credit card, and use cc-number for these?


On 26 Oct 2012, Elliott Sprehn wrote:
> 
> Several of us on the Chrome team have been thinking about the challenges 
> of filling out long forms full of personal information. We've noticed 
> that site authors split up their forms across multiple pages to avoid 
> overwhelming their users with one single massive form [1]. This is 
> particularly bad on mobile where we've observed some popular retailers 
> splitting their forms into six or more pages in an attempt to optimize 
> their flow for a small screen. This unfortunately defeats many of the 
> advantages of existing browser autocomplete.
> 
> In researching this we've found that with a few changes built on the 
> existing HTML autocomplete spec [2] we can allow authors to recombine 
> their forms and enable browsers to provide more useful autocomplete.
> 
> 1) HTMLFormElement.prototype.requestAutocomplete()
> Asks the user agent to asynchronously present a UI for performing full
> <form> autocomplete using the already spec'ed autocomplete attributes
> [2] used in the form. In concept this is very similar to prompt()
> except the UA could provide a streamlined experience for filling in
> information in large chunks. For example you could imagine choosing a
> shipping address from a drop down instead of presenting multiple
> inputs.
>
> 2) Simple event named "autocomplete"
> This event is dispatched on the form element after the UI presented by
> requestAutocomplete() is closed if the form validates after having
> filled each input and firing all necessary input events like "change".
> 
> 3) Simple event named "autocompletecancel"
> This event is dispatched on the form if the UI is canceled.

This seems reasonable. I recommend implementing it (without a prefix); if, 
based on your implementation experience, other browser vendors want to 
implement it, we would then add it to the spec.

Getting positive indications from a second browser vendor that they want 
to implement this would be the next step towards getting this in the spec.
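
For anyone trying to picture how an author would use the proposal, here is 
a rough sketch under the assumption that it ships exactly as described in 
the quoted message: a requestAutocomplete() method on the form, followed 
by either an "autocomplete" or an "autocompletecancel" event. The wrapper 
function and the handler names are invented for this example, and nothing 
here is in the spec.

```javascript
// Sketch only: requestAutocomplete() and the two events are the proposal
// quoted above, not a specified API. Helper and handler names are made up.
function requestFormAutocomplete(form, handlers) {
  // Feature-detect, since the method is only a proposal.
  if (typeof form.requestAutocomplete !== "function") {
    handlers.onUnsupported();
    return false;
  }

  function cleanup() {
    form.removeEventListener("autocomplete", onDone);
    form.removeEventListener("autocompletecancel", onCancel);
  }
  // Proposed to fire once the UI closes and the filled form validates.
  const onDone = () => { cleanup(); handlers.onFilled(form); };
  // Proposed to fire if the user cancels the UI.
  const onCancel = () => { cleanup(); handlers.onCanceled(form); };

  form.addEventListener("autocomplete", onDone);
  form.addEventListener("autocompletecancel", onCancel);
  form.requestAutocomplete();
  return true;
}
```

In a page this might be called with a checkout form, submitting it from 
onFilled and falling back to showing the ordinary form fields from 
onCanceled or onUnsupported.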


On Fri, 26 Oct 2012, Anne van Kesteren wrote:
> 
> I'm missing the scenario that requires such interference from a web 
> developer. Can't a UA just offer to autocomplete a form for me once it 
> finds one? (Or in other words, unless I'm missing something this seems 
> like a solution without a provided use case.)

One-page apps don't have a relevant onload for the UA to use. In 
high-latency environments you _really_ want to minimise page loads.


On Wed, 31 Oct 2012, Dan Beam wrote:
> 
> The experimental implementation [1] has been updated to dispatch an 
> "autocompleteerror" as per convention/your feedback.

"autocompleteerror" seems like it'd be fired for an error, not user 
cancelation. User cancelation is usually called either "abort" or 
"cancel". I think autocompletecancel is fine. It's consistent with 
oncancel, which we used for <dialog>. (Fullscreen's "error" event is for a 
slightly different case, based on what the spec says.)


On Sat, 10 Nov 2012, Maciej Stachowiak wrote:
> 
> (1) If this API fills in a form completely based on stored data, and not 
> by completing the user's typing, then it is "autofill" rather than 
> "autocomplete".

Indeed, but the terminology in the API should probably remain consistent 
with the existing attributes.


> (2) If this API provides the ability to get user information without 
> even having a visible form, then it's not clear that it is even really 
> autofill. It's just "requestUserInformation()" or something.

It's intended to fill a form; whether the form is visible or not is 
somewhat academic, the author (who sees the API) still sees the form.


> (3) This API has important privacy and even security considerations. You 
> have to tell the user exactly what you are going to fill in to the site 
> before they approve. Unfortunately, most won't read it. If sites are 
> asking for so much info that they have to split pages for usability, 
> then it seems likely the UI that tells the user what the site is asking 
> for will be impractical for most users to meaningfully review. This 
> becomes especially dangerous if the mechanism can fill in credit card 
> info. I would be very nervous if the browser could at any moment pop up 
> a dialog that would submit all my credit card info to a dubious site if 
> I slip and click the wrong button.

Certainly the UI needs to be made clear, but at some point, we have to 
either let the device submit the credit card information, or have the user 
type it manually. I can tell you from personal experience that I'd rather 
never have to do the latter.


> (4) Should this API be limited to initiation during a user interaction? 
> That would reduce the ability of sites to spam the user with such forms.

That seems reasonable.

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Wednesday, 21 November 2012 02:34:44 UTC