WebApp API Minimization - feedback from Nathan on 2011-02-03 (www-tag@w3.org from February 2011)

From: Nathan <nathan@webr3.org>
Date: Thu, 03 Feb 2011 09:24:17 +0000
To: "www-tag@w3.org" <www-tag@w3.org>
CC: "Appelquist, Daniel, VF-Group" <Daniel.Appelquist@vodafone.com>
Message-ID: <4D4A7441.9030201@webr3.org>

Hi Dan,

Just a few editorial notes on the WebApp API Minimization doc, great
work btw - v important and agree fully, something we need to consider
with linked data (in a WebID context especially!).

edits indented:

User privacy is an important feature of the Web. An increasing number
of APIs are being deployed which have a potential impact of user

   s/of/on

the OS) for use in a JavaScript context within a Web Application
executing in the the browser open up the user to potential privacy

   s/the the/the

only request the data needed. This approach minimizes the amount of
privacy-infringing data available to the application at any given time.

   general comment: probably worth taking some measure against
   dictionary attacks (as in what's to stop scripts just requesting
   everything one bit at a time)
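
   one possible measure, as a rough sketch only: have the user agent
   track what each origin has already been given and refuse once the
   cumulative total passes a budget. DisclosureBroker, request and
   maxFieldsPerOrigin are names made up for illustration, they are not
   from the draft or from any existing API.

```typescript
// Hypothetical broker; every name here is an assumption for illustration.
type RequesterOrigin = string;
type FieldName = string;

class DisclosureBroker {
  // Fields already released to each origin, accumulated across requests.
  private released = new Map<RequesterOrigin, Set<FieldName>>();
  private maxFieldsPerOrigin: number;

  constructor(maxFieldsPerOrigin: number) {
    this.maxFieldsPerOrigin = maxFieldsPerOrigin;
  }

  // Grant the request only if the cumulative set of fields stays in budget.
  request(origin: RequesterOrigin, fields: FieldName[]): boolean {
    const seen = this.released.get(origin) ?? new Set<FieldName>();
    const combined = new Set<FieldName>(seen);
    for (const field of fields) {
      combined.add(field);
    }
    if (combined.size > this.maxFieldsPerOrigin) {
      return false; // refuse: the origin is assembling too much, bit by bit
    }
    this.released.set(origin, combined);
    return true;
  }
}

const broker = new DisclosureBroker(2);
const first = broker.request("https://app.example", ["email"]);
const second = broker.request("https://app.example", ["phone"]);
const third = broker.request("https://app.example", ["address"]);
```

   here the first two requests are granted and the third is refused,
   because the budget counts distinct fields per origin rather than
   fields per call.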

This paper introduces the concept of API Minimization, traces the
roots of this approach and discusses some approaches to minimization
applicable to the design of Web application APIs. The intended

   s/Web application APIs/Web application APIs which deal with stored
data ?

2 Introducing API Minimization

In the Internet Draft "erminology for Talking about Privacy by Data

   s/"erminology/Terminology

Pfitzmann, Marit Hansen and Hannes Tschofenig succintly define

   s/succintly/succinctly

Furthermore, data minimization is the only generic strategy to enable
unlinkability, since all correct personal data provide some

   s/provide/provides

For instance, if a developer only needs to access a specific field of
a user address book, it should be possible to explicitly mark that
field in the API call so that the user agent can inform the user that
this single field of data will be shared.

   general comment: to me this implies you need global properties so
   that API developers know what to request (like rels or well defined
   predicates/properties)

   this could also apply to forms in HTML, the same properties could
   be used as form element "names", such that auto completion can be
   done properly / use the same API/data source.
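
   to make the field-marking idea concrete, a minimal sketch, assuming
   a shared set of property names already exists. Contact, pickFields
   and the property names are hypothetical stand-ins, not taken from
   the draft or from any deployed contacts API.

```typescript
// Hypothetical record shape; the property names stand in for whatever
// shared vocabulary API designers would agree on.
interface Contact {
  name: string;
  email: string;
  phone: string;
  address: string;
}

// Return only the fields the caller explicitly named; the rest of the
// record never leaves the user agent.
function pickFields<K extends keyof Contact>(
  contact: Contact,
  fields: K[]
): Pick<Contact, K> {
  const result = {} as Pick<Contact, K>;
  for (const field of fields) {
    result[field] = contact[field];
  }
  return result;
}

const alice: Contact = {
  name: "Alice",
  email: "alice@example.org",
  phone: "+44 20 7946 0000",
  address: "1 Example Street",
};

// The application asks for one field, so one field is all it gets.
const shared = pickFields(alice, ["email"]);
```

   because the requested fields are named in the call, the user agent
   can show the user exactly that list before anything is shared.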

The user agent can then act as a broker for trusted data, and will
only transmit data to the requester that the user has explicitly allowed.

   general comment: a few people are working on cloud storage / data
   wikis for personal data stores mounted on the web (separate the apps
   from the storage as per CloudStorage DI) thus what you're saying
   appears to be applicable to /any/ agent which handles the storage of
   (personal/private/sensitive) data - the work you're doing seems to
   apply to many contexts and could be good if it were targeted like
   that with special considerations for (a) DAP (b) web storage

This definition could be a generally applicable architecture principle
for development of browser-based APIs, especially those that provide
access to personal information.

   as above re scoping, "any agent" and "any api" not just
   browsers / Web IDL APIs ?

Within the context of the Web user agent (browser), what threats to
user privacy does API mimization protect us from?

   s/the Web user agent (browser)/web user agents (browsers)
   s/mimization/minimization


Web applications which are themselves not malicious but which are
never-the-less manipulated into divulging personal information, e.g.
through cross-site scripting attacks

In this context, since the Web application in question will only have
access to the minimum information it needs to perform its duties, it
will be impossible for it to pass on any additional information. For

   general comment: what's to stop the other script simply asking for
   the data itself?

example, a Web application whose intended purpose is to "check you in"
to a specific city, and uses the browser-based geolocation API to
retreive your location information, would only request and therefor

   s/retreive/retrieve
   s/therefor/therefore

would only receive your location at the city level of granularity. The
application would not have access to the more specific information
about your neighborhood or city block.

   s/neighborhood/neighbourhood
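
   a small illustration of what "city level of granularity" might mean
   in practice, assuming the user agent coarsens coordinates before
   handing them over. LatLon and coarsen are made-up names, and rounding
   to one decimal place (roughly 11 km of latitude) is only one possible
   reading of "city level".

```typescript
// Hypothetical types and helper; not part of the Geolocation API.
interface LatLon {
  latitude: number;
  longitude: number;
}

// Round both coordinates to the given number of decimal places, so the
// application never sees the finer-grained position.
function coarsen(position: LatLon, decimals: number): LatLon {
  const factor = Math.pow(10, decimals);
  return {
    latitude: Math.round(position.latitude * factor) / factor,
    longitude: Math.round(position.longitude * factor) / factor,
  };
}

const precise: LatLon = { latitude: 51.50735, longitude: -0.12776 };
const cityLevel = coarsen(precise, 1); // { latitude: 51.5, longitude: -0.1 }
```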

What does API minimization not protect us from?

Keyboard loggers, network sniffers, dumpster divers or anyone or
anything else out of the context of the Web user agent itself.

   general comment: it would be nicer to be able to say "anything
   running outside of the context/sandbox the application is authorized
   to run within." - there may be some scope to enable this approach
   via web widgets specs.


Privacy is an increasingly thorny problem on the Web. API
minimization, as a general architectural principle, does not solve the
problem of Web privacy. However, if this principle is applied
consistently during the design process as new APIs are designed and
implemented within the Web user agent, minimization can help to lower
the overall surface area of attack for actors looking to subvert user
privacy.

   last sentence needs a rewrite, something like:
   However, if this principle is applied consistently to the design
   and implementation of new APIs, then the overall surface area of
   attack for actors looking to subvert user privacy will be lowered.

Best,

Nathan
Received on Thursday, 3 February 2011 09:26:21 UTC