[whatwg] Persistent storage is critically flawed.

I've read the 2006-08-21 draft of Web Applications 1.0 carefully and I'm 
horrified that section 5.9 on persistent storage is being considered as 
a web standard - at least in its current form. My objections can be 
summarised as:

* Authors failure to handle the implications of "global" storage.
* Naive access controls which will result in guaranteed privacy violations.
* Lack of privilege separation.
* Messy API requiring callbacks to handle concurrency.

I'll deal with each in turn:


== 1: Authors failure to handle the implications of "global" storage. ==
First lets talk about the global store (|globalStorage['']) which is 
accessible from ALL domains.

Did anyone stop to really consider the implications of this? I mean, 
sure the standard implies that UA's should deal with the security 
implications of this themselves, but what if they don't? Let's say a UA 
does allow access to this global storage, what would we expect to find 
in this storage space? Does the author really believe that this will be 
only used for sharing preferences between domains for the benefit of the 
user? Hell no! It's going to look like this:

KEY                           VALUE
adsense3wd4ghgtut9jhn         
kjh234kj23u4y2j34234hkj234hkj23h4k234k234   <--  Advertiser user tracking
johnyizcool                   I Kickerz Azz!!!!!!                     
    <--  Attention freak
USconspiracy                  911 was an inside job. Tell 
everybody!      <--  Political activist
UScitID                       
kh546jkh45856456h45iu6y46j45j6h54kj6h45k6   <--  Government spying
GodsLove.com                  Warning! This user supports 
abortion.       <--  Vigilantie user tracking

|What possible use could this storage region ever have to a legitimate 
site? Especially when sensible UA's will just block it anyway? I for one 
do not want my browser becoming some sort of global 'grafitti wall' 
written on by every website I visit. Truthfully I cannot come up with a 
single legitimate use for the 'global' or 'com' regions that cannot be 
handled by per-domain storage or global storage with ACLs (see next point).


== 2: Naive access controls which will result in guaranteed privacy 
violations. ==
The standard advocates the two-way sharing of data between domains and 
subdomains - Namely that host.example.com should share data with the 
servers at 'www.host.example.com', 'example.com', and all servers rooted 
at '.com'. In its own words: "Each domain and each subdomain has its own 
separate storage area. Subdomains can access the storage areas of parent 
domains, and domains can access the storage areas of subdomains."

My objection to this is similar to my objection to the 'global' storage 
space - It's totally naive. The whole scheme is based on the unfounded 
belief that there is a guaranteed trust relationship available between 
the parties controlling each of these domains. Sure, one may be reliant 
on another for DNS redirection but that hardly implies that one wishes 
to share potentially confidential data with the other. As the author 
themselves stated there is no guarantee that users of geocities.com 
sub-domains wish their users data to be shared with GeoCities. The 
author states that geocities could mitigate this risk with a fake 
sub-domain but how does that help the owner of mysite.geocities.com? The 
author implies that UA's should deal with this themselves and fails to 
provide any REALISTIC guidelines for them to do so (sure lets hardcode 
all the TLD's and free hosting providers). What annoys me is that the 
author acknowledges the issue and then passes the buck to browser 
manufacturers as though it's their problem and they should solve it in 
any (incompatible or non-compliant) way they like.

But why bother? This whole problem is easily solved by allowing data to 
be stored with an access control list (ACL). For example the site 
developer should be able to specify that a data object be available to 
'*.example.com' and 'fred.geocities.com' only. How this is done (as a 
string or array) is irrelevant to this post but it must be done rather 
than relying on implicit trust where none exists.


== 3: Lack of privilege separation. ==
The proposal assumes that the shared data should be readable and 
writable by all sub and parent domains. I believe there is no reason why 
this shouldn't be extended to provide 'access control' similar to that 
implemented by standard file systems. For example if I want to publish 
an object called 'myKey' and make it accessable to other sites it does 
not automatically mean I want them to be able to modify or delete it. It 
is important that global storage allows read-only access to variables if 
it is to be widely adopted for information sharing between untrusting 
parties.


== 4: Messy API requiring callbacks to handle concurrency. ==
The author uses a complicated method of handling concurrency by using 
callbacks triggered by setItem() to interrupt processing in other open 
pages (ie, other tabs or frames) which could access the same data. Why 
can I not simply lock the item during updates or long reads and force 
other scripts to wait? While I'm unsure wether ECMAscript can handle 
proper database-style transactions it seems like it would be fairly easy 
for the developer to implement critical sections by using shared storage 
objects or metadata as mutexes and semaphores. I can't see what role the 
callback mechanism would fulfill that could not be handled more easily 
using traditional transactional logic.


== Conclusion ==
In conclusion it appears to me that the proposal is based on several 
fundamentally flawed security assumptions and is overly complex. I see 
this becoming a hiding place for viruses, malware and tracking cookies. 
Any sensible browser manufacturer would turn this feature off or limit 
its scope - thus rendering it inoperable for the many beneficial uses it 
would otherwise have. Those browsers that support this proposal are 
likely to do so in incompatible ways - due largely to the faults and 
omissions in this proposal that it implies UA's will solve. It seems 
like a large amount of browser sniffing will be required to have any 
assurance that persistent storage will work as advertised. Therefore, 
the global storage proposal must be fixed or removed.


Shannon
Web Developer

Received on Sunday, 27 August 2006 05:11:17 UTC