- From: Brian Behlendorf <brian@organic.com>
- Date: Wed, 28 Feb 1996 02:11:24 -0800 (PST)
- To: www-talk@w3.org
This is a continuation of a thread which has been raging on www-html over the last few days, and since I believe that the answer belongs at a different level than HTML, I'm continuing it here. Feel free to see its beginnings at <URL:http://www.eit.com/www.lists/www-html.1996q1/index.html> under "Automatic Entry and Forms", at least once the archive gets updated, since updating doesn't appear to be automatic.... If someone has a better pointer please post it.

A particular user posted a complaint about having to constantly re-enter his name and password into HTML forms, and wondered whether some sort of automatic form entry system could be developed for common fields. A lively debate ensued as to whether such a system could be built to respect privacy, what privacy means, etc.

I contend that information of a private nature - your name, your email address, your zip code, etc - has real value to the party you would give it to, and thus should be handled in the payments and authentication layer rather than the application layer. Information is currency; it has value. Content providers should assume no right to detect information about their visitors surreptitiously - but likewise, the expectation that consumers have "eminent domain" over content on the web is unrealistic. If we want to create a world-wide *web* (instead of a world-wide-hash-table, like we have now) we really should facilitate the giving of information (and payments, and credentials) from client to server instead of just the other way around.

What the original poster was complaining about, I believe, was that he is quite happy and willing to give his personal name and email address to most places which ask for it, in exchange for obtaining something of value, but having to constantly type it in by hand is a pain.
Similarly, many content providers have run into usability problems when they try to ask for more information from their users - today CPs can prevent access to a site for people who don't "register", but those who have implemented such a system (like the one I built at HotWired) find that the process of filling out the form and remembering a name and password can be daunting and unnatural to the average user, in addition to being unscalable and encouraging fraudulent information sharing. In most cases, the individual is happy to give the necessary information; it's the process that is daunting.

So, we have a set of users who want to be able to always give some types of information automatically, and some types of CPs who want to make it easy for people to give that information. I'm specifically interested in enabling the following scenario: a certain content provider makes their content available for free, advertiser supported, on the following condition: you give the CP your zip code and country code, and only your zip code and country code. The CP uses this to give their advertiser information about the audience. Combined with databases mapping zip codes to demographic data, the CP can easily determine what age ranges the site appeals to, average income, etc, all of which makes for a happy advertiser (who can get some assurances that their dollars are going towards the right market) and a more informed content provider. The most important thing is that this information is only usable in aggregate form - I'll get into this later.

What this suggests is that users have access to a small set of common bits of information about themselves, and are able to set a small set of different policies regarding the level of "automatic-ness" with which the information is given.
For example, to take the list from Dan Connolly's revision of his Business-Card Authentication proposal <URL:http://www.w3.org/pub/WWW/Demographics/Proposals.html>:

> profile-full-name
> profile-first-name
> profile-last-name
> profile-email-address
> profile-home-url
> profile-affiliation
> profile-affiliation-url
> profile-postal-street
> profile-postal-street-2
> profile-postal-city
> profile-postal-state
> profile-postal-zip
> profile-business-phone

(to which I'd add:)

profile-postal-country
profile-age
profile-age-dec (for those who'd rather say they were in their 50's than say they were 53)

So, let's say the scenario I postulated above is common, and I as a user consider my zip code relatively non-private; it's personal information, since it's about me, but it's not private information, since it's public knowledge. Thus, I have no problem always giving that information out to *anyone* who wants it, if it means that I will be able to obtain, for free, information I wouldn't be able to obtain otherwise. However, I might feel differently about my email address - while I want to give that out on a fairly regular basis (I'm a public person, I enjoy meeting new people, etc), I don't want it to be given out to everyone who asks; I want to be prompted when it's asked for by a particular entity.

Thus, this profile is actually a matrix of variables and policies. For usability reasons it makes sense to keep both dimensions as small as possible. I envision just two policies right now - always give when asked, and give with prompting. "Never give" is the same as not filling in the entry in the profile. This is *purely* a client issue - from a protocol perspective, the server simply tells the client it needs that info in exchange for a particular resource. The specification should also strongly suggest that clients make this tunable by system- or network-wide configuration files, just as Java capabilities and policies will hopefully be.
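
To make the variables-and-policies matrix concrete, here is a minimal client-side sketch in Python. It is only an illustration under stated assumptions: the names Policy, release and ask_user are hypothetical, the email address and country value are placeholders, and nothing here is part of any existing browser or of Dan's proposal. A field that is absent from the profile is simply never given.

```python
from enum import Enum


class Policy(Enum):
    ALWAYS = "always give when asked"
    PROMPT = "give with prompting"


# Hypothetical profile: field name -> (value, policy).
# A field that is not filled in is treated as "never give".
EXAMPLE_PROFILE = {
    "profile-postal-zip": ("94107", Policy.ALWAYS),
    "profile-postal-country": ("US", Policy.ALWAYS),           # placeholder value
    "profile-email-address": ("user@example.org", Policy.PROMPT),  # placeholder value
}


def release(profile, requested, ask_user):
    """Return the requested fields the client is allowed to send.

    ask_user is a callback taking a field name and returning True if the
    user approves releasing that field; it stands in for whatever prompt
    a real client would show.
    """
    granted = {}
    for name in requested:
        entry = profile.get(name)
        if entry is None:
            continue  # not in the profile: never given
        value, policy = entry
        if policy is Policy.ALWAYS or ask_user(name):
            granted[name] = value
    return granted
```

With a user who declines every prompt, a request for the zip code, the email address and the age would release only the zip code; the server never learns whether the other fields were withheld by policy or simply left blank.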
So here's how something like that might work: using the Authorization: header, much like Dan's anonymous auth proposal (also at <URL:http://www.w3.org/pub/WWW/Demographics/Proposals.html>). I.e.

    WWW-Authenticate: profile profile-postal-zip

to which the client responds

    Authorization: profile profile-postal-zip=94107

However, this proposal defeats caching, since caches can not cache the result of an "authenticated" request. It may make sense to use another header for this purpose, one which the proxies may accumulate and transfer to the origin server in bulk later on. Another reason to use a different header is that the server might want to be able to ask for it on a voluntary basis, not a mandatory one.

The privacy implications from a client software perspective are easy to address - no information goes out unapproved, just as I would expect a browser with an integrated "wallet" to not send my cash and CC number to every site that asked for it. In other words, since the UI issues of user authorization of release of information are already being addressed, this can "piggyback" on top of that, and hopefully use the same interface, so users can really feel in control of to whom their information flows.

The privacy implications on the server side are admittedly murkier, but I contend no murkier than what we currently have. Today, content providers can throw up authentication on their sites and restrict access to only those users who give them personal information anyway, through clunky HTML forms and names and passwords. Thus, the use of data in that instance is the same as we would have if submission of that information were automated - it is up to the ethical policies of the content provider to determine what gets done with that information. I will contend that no protocol can enforce policies once a content provider has personal information, and that our best hope in this area is to see adoption of data privacy laws similar to those in Europe.
Technology can not solve all of society's problems, not even cryptography. :)

Should this database of information be available to applications? I.e., should we have a mechanism for auto-inclusion in forms, for Java apps to access that information, etc? I would cede this as a possibility but *only* if the same policies would apply about when to automatically give certain information and when to prompt. For example, a well-designed forms implementation could detect the presence of "profile-email-address" anywhere in the <FORM> tags and handle that as per the policies - same thing with a Java app accessing applet.browser.profile.email-address or whatever namespace Java apps will use for client-side resources like that. Certainly there is danger from badly designed applications, but not necessarily from malicious network objects, which are the real concern, since badly designed applications get CERT warnings and NYTimes stories.

So, for HTML forms, there can be a direct mapping between the variable names and SGML entities, such that "&profile-email-address;" will insert my address. This will be useful elsewhere, too - imagine being able to put a "Good morning, &profile-first-name;!" at the top of a page. In that case security restrictions can be avoided since that content's not going back to a server somewhere.

Note that the specific application I outlined way back is just one thing that could be enabled using this mechanism - if, say, a user were to make their zip code available to all, then a visit to www.bigbook.com could place them square in their neighborhood for their first search, instead of having to hunt for it like now.

I am interested in holding a discussion on this idea if we can keep it civil and make sure that issues of privacy, security, and functionality are discussed at levels respecting the state of things as they stand today, and how we can make certain parts of the system better without making other parts of the system worse.
I will put my heart on my sleeve and state that yes, I am in the "webvertising" business, though we are more interested in creating compelling content, and getting paid for it, than we are in selling jeans or toothpaste. I sincerely hope people agree that pulling this functionality out of the application layer into the payments/authorization layer makes sense.

	Brian

--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com  brian@hyperreal.com  http://www.[hyperreal,organic].com/
Received on Wednesday, 28 February 1996 05:08:13 UTC