- From: Jim Taylor <JHTaylor@videodiscovery.com>
- Date: Wed, 28 Feb 1996 12:13:53 -0800
- To: www-html@w3.org
I'm working on a more comprehensive version of my proposal for automatic form fill-in. In an effort to better understand the vagaries of field names in forms I undertook a simple survey. I somewhat randomly selected 51 forms from three Alta Vista searches (for "form" in the URL, for "register" in the title, and for "fill out" in the body text). I apologize if the results are annoyingly long, but I think they're illuminating and worth posting (and not as annoyingly long as quoting an entire digest ;-).

I grouped the fields by class (name, city, state, etc.), and after each class I included the count and percentage. I also indicated what percentage of the field names were unique. I collapsed duplicate field names to a single line with the frequency in parentheses.

Almost all forms (96%) asked for a name, and 90% asked for email. There's a surprisingly high uniqueness of field names, from 40% to 100%, meaning that approximately every other time you visit a form any given field of one class will have a different name than on all the other forms you've visited (uniqueness would be lower with a larger sample, but the point is still valid). The variety of names for a field as straightforward as email is truly amazing.

This is all quite unscientific but it shows that trying to identify fields without a consistent identification scheme is a daunting task. And it at least provides some ballpark numbers for determining which fields are common enough to be included in a "global" set of field identifiers.
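For anyone who wants to check the figures, here is a minimal sketch of how the per-class numbers appear to be derived. It assumes that the count is the sum of the frequencies in a class, that the percentage is that count over the 51 forms surveyed, and that "unique" means distinct field names divided by the count. That reading is inferred from the listed figures rather than stated outright, and the function name `summarize` is made up for illustration.

```python
FORMS_SURVEYED = 51  # sample size stated in the message


def summarize(tallies, forms_surveyed=FORMS_SURVEYED):
    """Return (count, percent of forms, percent unique) for one field class.

    `tallies` maps each distinct field name to the number of forms using it.
    Assumption: uniqueness = distinct names / total occurrences.
    """
    count = sum(tallies.values())
    percent_of_forms = round(100 * count / forms_surveyed)
    percent_unique = round(100 * len(tallies) / count)
    return count, percent_of_forms, percent_unique


# Tallies for the "First Name" class, copied from the survey listing.
first_name = {
    "a firstname": 1,
    "first name": 1,
    "firstname": 3,
    "fname": 1,
    "FNAME": 1,
    "name": 2,
}

count, pct, unique = summarize(first_name)
print(f"First Name: {count} ({pct}%), {unique}% unique")
```

Run against the First Name tallies this gives 9 occurrences, 18% of forms, and 67% unique, which agrees with the line in the listing.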
Name: 40 (78%), 43% unique
  010.name
  comments_name
  custname
  customer-name
  feed_name (2)
  field
  from_name
  Full Name
  invoice_customer_name
  name (16)
  OName
  realname (5)
  rem_user
  sales-name
  thename
  username
  your_name (3)

First Name: 9 (18%), 67% unique
  a firstname
  first name
  firstname (3)
  fname
  FNAME
  name (2)

Middle Name/Middle Initial: 4 (8%), 100% unique
  initials
  mi
  midname
  MNAME

Last Name: 9 (18%), 67% unique
  b lastname
  last name
  lastname (4)
  LNAME
  lname
  surname

Email: 46 (90%), 54% unique
  170.email_address
  comments_email
  curremail
  customer_email
  E-mail
  e-mail address
  email (16)
  email-address
  emailadd
  emailadd1
  emailAddr (2)
  email_addr
  EMAIL_ADDRESS
  email_address (2)
  feed_email (2)
  field
  From
  from_address
  internet
  l email
  mail
  PipeToReplyTo
  theemail
  username (4)
  your_email

Title: 7 (14%), 100% unique (also used as title of book)
  c title
  current
  job_name
  occupation
  pos
  title
  your_title

Company: 17 (33%), 59% unique
  address
  company (7)
  company_name
  compname
  d company
  institution_name
  name (2)
  ORGANIZATION
  ORGNAME
  thecompany

Address (1 address field): 21 (41%), 52% unique
  090.mailing_address
  Addr (2)
  address (9)
  Address/P.O. Box
  card_address
  comments2
  e address
  geographicLoc
  newnames
  street (2)
  street_address

Address1 (2 address fields): 11 (22%), 82% unique
  addr1
  address (3)
  address1
  Adress (Line 1)
  cust-address
  streetaddress1
  street_1
  street_address
  theaddress1

Address2 (2 address fields): 11 (22%), 91% unique
  addr2
  Address
  address2 (2)
  Adress (Line 2)
  cust-suite
  Street
  streetaddress2
  street_2
  street_address_2
  theaddress2

City: 30 (59%), 43% unique
  100.mailing_city
  Address
  card_city
  city (18)
  city1
  city_name
  city_state_zip
  Cty
  cust-city
  f city
  thecity
  town
  www_city

State: 28 (55%), 50% unique
  110.mailing_state
  Address
  card_county
  city
  city_state_zip
  cust-state
  g state
  region
  St
  state (15)
  State/Prov
  State/Province
  state_name
  thestate

Zip/Postal code: 28 (55%), 54% unique
  130.mailing_zip
  Address
  card_postcode
  city
  city_state_zip
  cust-zip
  h zip
  postal_code
  thezip
  zip (14)
  zip-5
  Zip/Postal Code
  Zip/Postal_Code
  ZIP1
  zip_code

Country: 22 (43%), 41% unique
  120.mailing_country
  card_country
  CNTRY
  country (14)
  country_name
  i country
  nation
  OCont
  thecountry

Phone (if 1 phone field): 24 (47%), 46% unique
  150.current_phone
  comments_phone
  cust-tel
  j phone
  OPhone
  OTPhone
  Ph
  phone (12)
  phone_number (3)
  telephone
  WKPHPRE+WKPHFIRST+WKPHLAST (3 fields)

Phones (if 2 phone fields): 5 (10%), 100% unique
  dayphone/evephone
  homephone/workphone
  Office_Phone/Home_Phone
  phone/phone2
  phone_day/phone_eve

Fax: 15 (29%), 40% unique
  160.fax
  fax (10)
  FAXPRE+FAXFIRST+FAXLAST (3 fields)
  fax_number
  k fax
  thefax

Username/Login: 3 (6%)
  login (2)
  username

Password: 4 (8%)
  password
  password/password2
  password1/password2
  rem_word

Payment/Credit Card: 10 (20%)
  (many complex options, I can do a more detailed summary if there's any interest)

Computer Platform: 4 (8%)
  (many complex options)

Other fields not highly specific to a form:
  Nickname, Department, Courtesy Title (Mr./Ms./Dr./...), Web Page Name, Web Page URL, Social Security Number, Age, Date of Birth, Sex, Religious Preference, Ethnic Background, Country of Citizenship, US Citizen, Year of Graduation, Language

Security Note: I am not suggesting that the above information be automatically submitted to every entity that asks for it. I'm merely reporting my findings.

A few other notes:

- "Atoms" of information are frequently combined (a single field for first name and last name, or for city, state, and zip), which indicates a possible need for a fill-in mechanism that can combine them.
- Context wouldn't help much in identifying fields. I never imagined so many ways to label a field requesting a name! Not to mention foreign languages.
- I saw a surprising number of VALUE="" entries, which, aside from being quite bizarre, might cause problems for browsers that don't fill in fields if a value is specified.
- I discovered that there's a "Comment on ..." form that's been copied to hundreds of Web sites, complete with odd "inPUT" spelling.

I reiterate that this is an ad hoc survey, so please don't bother to complain about my math or my methods.

__________________________________________________________________
Jim Taylor <mailto:jhtaylor@videodiscovery.com>
Director of Information Technology
Videodiscovery, Inc. - Multimedia Education for Science and Math
Seattle, WA, 206-285-5400, <http://www.videodiscovery.com/vdyweb>
Received on Wednesday, 28 February 1996 15:11:45 UTC