Detailed review of 4.11. Client-side persistent storage

Hello!

I have reviewed section 4.11. "Client-side session and persistent storage  
of name/value pairs" [1]. Here are my comments:



1. In section 4.11.3. "The StorageItem interface" [2], I would suggest:


a) the StorageItem objects should also have two read-only attributes:  
dateCreated and dateModified, as Date objects (or UNIX timestamps).

One use case: a web application might want to disregard or purge values
that are too old. Currently, one would need to store two separate items
for this kind of tracking. I'm not asking for a dateExpires attribute,
like cookies have.

Also, I expect that UAs which implement persistent storage will
internally save the dateCreated and dateModified values anyway, and use
them to automatically purge items which are too old (to keep the UA from
slowing down too much, and for privacy reasons). Basically, I only want
these two values exposed to web applications as well.

It doesn't really make sense to leave this out of the spec. Timestamps
are common metadata elsewhere: files and folders on filesystems have
created and/or modified dates, as do databases and tables in relational
databases (like MySQL), emails, etc.
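
To illustrate the "two separate items" workaround, here is a minimal
sketch. It assumes a simplified storage object whose getItem() returns a
plain string (the draft returns a StorageItem). createFakeStorage(),
setWithTimestamp(), getIfFresh() and the ".dateModified" key suffix are
all names I made up for this example.

```javascript
// Hypothetical in-memory stand-in for a Storage object, so the sketch runs anywhere.
function createFakeStorage() {
  const items = new Map();
  return {
    setItem(key, value) { items.set(key, String(value)); },
    getItem(key) { return items.has(key) ? items.get(key) : null; },
    removeItem(key) { items.delete(key); },
  };
}

// Workaround: keep a second item that holds the modification time.
function setWithTimestamp(storage, key, value, now = Date.now()) {
  storage.setItem(key, value);
  storage.setItem(key + ".dateModified", String(now));
}

// Return the value, or null (purging both items) if it is missing
// a timestamp or is older than maxAgeMs.
function getIfFresh(storage, key, maxAgeMs, now = Date.now()) {
  const stamp = storage.getItem(key + ".dateModified");
  if (stamp === null || now - Number(stamp) > maxAgeMs) {
    storage.removeItem(key);
    storage.removeItem(key + ".dateModified");
    return null;
  }
  return storage.getItem(key);
}
```

With a read-only dateModified attribute on the item itself, the second
item and the key-naming convention would not be needed.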


b) the StorageItem object could also have an attribute defining lastURL:  
the absolute URL of the last page (without any query parameters) which  
modified the value of the object.

This is just an idea - I don't consider this a requirement (as the above  
one). It would be a nice feature.

But then, both of the suggestions above enable even more tracking, which
raises privacy concerns. Maybe enable these attributes only for secure
pages?


c) Also a question: the storage event is defined just as a notification  
which tells the potential listeners that the storage for the domain has  
been modified. Why wasn't the storage event defined as a notification  
which tells exactly what changed? As in, include the StorageItem object  
itself as well. Would that be a security/privacy concern? It shouldn't be:  
the scripts can access the StorageItem, anyway.

For example, say two web applications need to share *several*
StorageItem objects. If application A changes something of interest to
application B, then the listener within page B would have to search  
through the list of StorageItem objects of the domain where application A  
resides. Only the domain is known, given the "domain" attribute defined  
within the storage event. Also, checking what was changed is even harder  
given there's no dateModified attribute defined for StorageItem objects.  
If performance is an issue for both applications, they would have to use  
cross-site messaging to notify each other about the specific changes. That  
shouldn't be needed for simple storage updates - only for complex  
communication between two (or more) applications. Cross-site messaging
would also add a lot more complexity, because the applications involved
must have their "communication protocol" defined.

For now, I'm not sure how useful the storage event is. After all, if two  
applications need to verify changes in some Storage object, they'll have  
to use a defined communication protocol (with cross-site messaging), even  
for very simple stuff.
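
To show what a listener has to do today, here is a rough sketch. It
assumes the page keeps its own snapshot of the storage area as a plain
object (key to string value) and re-reads everything on each storage
event. diffSnapshots() is a name I made up for this example.

```javascript
// Compare two plain-object snapshots of a storage area and list what changed.
function diffSnapshots(previous, current) {
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);
  const changes = [];
  for (const key of Object.keys(current)) {
    if (!has(previous, key)) {
      changes.push({ key, type: "added" });
    } else if (previous[key] !== current[key]) {
      changes.push({ key, type: "modified" });
    }
  }
  for (const key of Object.keys(previous)) {
    if (!has(current, key)) changes.push({ key, type: "removed" });
  }
  return changes;
}
```

Every listener would need this kind of bookkeeping, and its cost grows
with the number of items in the storage area. If the event carried the
changed StorageItem, none of it would be necessary.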




2. In section 4.11.5. "The globalStorage attribute" [3], the definition of  
the namedItem() method [4] has a typo:

"The namedItem(domain) method tries to *returns* a Storage object  
associated with the given domain, according to the rules that follow."

Correction: return.



3. In section 4.11.7.1. "Disk space" [5]:

"If the storage area space limit is reached during a setItem() call, the  
user agent should raise an exception."

This is too ambiguous, and it can cause inconsistencies between
implementations.

I'd recommend defining that as a MUST, and specifying which exception is
to be raised.

How are scripts supposed to work when the "disk quota is full"? That  
should be defined in the spec.

An idea would be to have a new boolean attribute for the Storage object:
isWritable. This would be false when the "disk quota is full", and true
otherwise.
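
Without something like isWritable, a script can only find out by trying
to write and catching whatever is thrown. A minimal sketch, assuming a
stand-in storage object with a tiny quota counted in characters. The
names createLimitedStorage() and trySetItem() and the generic Error are
my own assumptions, since the draft does not say which exception is
raised.

```javascript
// Hypothetical stand-in for a Storage object with a small quota (in characters).
function createLimitedStorage(quota) {
  const items = new Map();
  let used = 0;
  return {
    setItem(key, value) {
      const text = String(value);
      const oldSize = items.has(key) ? items.get(key).length : 0;
      const nextUsed = used - oldSize + text.length;
      // The exception type is an assumption; the draft leaves it undefined.
      if (nextUsed > quota) throw new Error("storage quota exceeded");
      items.set(key, text);
      used = nextUsed;
    },
    getItem(key) { return items.has(key) ? items.get(key) : null; },
  };
}

// Try to store a value; report failure instead of letting the exception escape.
function trySetItem(storage, key, value) {
  try {
    storage.setItem(key, value);
    return true;
  } catch (e) {
    return false;
  }
}
```

Because the exception is unspecified, the catch block cannot tell a full
quota apart from any other failure, which is exactly the ambiguity I am
pointing at.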


4. In section 4.11.8.1. "User tracking" [6], source code HTML comment:

"<!-- XXX should there be an explicit way for sites to state when
     data should expire? as in
     globalStorage['example.com'].expireData(365); ? -->"

I did think of this feature, while reading through the spec. I don't think  
this is a high priority feature. It would be nice, but define it such that  
only scripts running from example.com can use the expireData() method on  
the Storage object. If scripts on any other domain (sub.example.com or  
"com") try to call globalStorage['example.com'].expireData(), raise a  
security exception.


5. In section 4.11.8.4. "Cross-protocol and cross-port attacks" [7]:

"Big Issue: What about if someone is able to get a server up on a port,  
and can then send people to that URI? They could steal all the data with  
no further interaction. How about putting the port number at the end of  
the string being compared? (Implicitly.)"

I strongly recommend putting the port number at the end of the string
being compared. My recommendation is based not only on security-related
concerns, but also on practical ones.

It's very wrong to assume that whatever runs on a different port of the
same domain is the same application. Usually it's a different web
application.

Web developers (including me) commonly host multiple web  
sites/applications on the same server, on varying port numbers. It would  
be very confusing and annoying to have the same persistent storage across  
different ports.

The current definition of persistent storage completely ignores port
numbers, which is very wrong.
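
As a rough sketch of what "port number at the end of the string being
compared" could look like: the function below builds such a string from
a page URL, filling in the default port when none is given. The name
storageScope() and the "host:port" format are my assumptions, and only
the http and https default ports are covered.

```javascript
// Default ports per scheme; only these two are assumed here.
const DEFAULT_PORTS = { "http:": "80", "https:": "443" };

// Build the string to compare: host name plus an always-explicit port.
function storageScope(pageUrl) {
  const url = new URL(pageUrl);
  // url.port is "" when the URL uses the default port of its scheme.
  const port = url.port || DEFAULT_PORTS[url.protocol] || "";
  return url.hostname + ":" + port;
}
```

With this, "http://example.com/" and "http://example.com:8080/" compare
as different strings, so two applications on the same host but different
ports would not share storage.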



6. Personally I find the overall storage idea very good. However, I also  
find it far too "liberal" - regarding security.

Here's what I suggest. It may be simple, yet it is something I would
personally use in many cases:

Define a third argument for the setItem() method of the Storage object.  
Name it "private", of boolean type. If the author sets this optional  
argument to true, then the StorageItem object is flagged as private.

StorageItem objects flagged as private will *only* be available to scripts  
on the *same* domain (same origin), not on any subdomains, not on  
higher-level domains. For example: if a script on "music.example.com"  
creates a private StorageItem object named "myTest", other scripts on the  
same domain will be able to read it. Yet scripts which run on
"beta.music.example.com" or "example.com" will trigger a security
exception if they try to read/write the "myTest" StorageItem object.

Obviously, scripts on domain "music.example.com" cannot create private  
StorageItem objects for other domains (say for "example.com"). The flag  
can only be set in Storage objects belonging to the same domain.

With the addition of this flag, the StorageItem objects also need a new  
boolean attribute: private. Scripts on the same domain can change the  
value of the attribute.

I'd even say make the private flag the default (set to true). Too many
authors will not bother to secure their sites (too many won't bother
understanding how storage works, they'll just copy/paste code).

Currently, I would not "feel comfortable" using persistent storage -  
knowing other domains can read my items. As I said, it's too risky. It's  
not risky only if you are the sole owner of the entire domain *and* server.
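
The access rule I have in mind can be sketched as below. It assumes that
a non-private item is reachable from the owning domain and its
subdomains, and that a private item is reachable from the owning domain
only. The function names are made up for this example, and a UA would
raise a security exception where this sketch returns false.

```javascript
// True if scriptDomain equals storageDomain or is a subdomain of it.
function isSameOrSubdomain(scriptDomain, storageDomain) {
  return scriptDomain === storageDomain ||
    scriptDomain.endsWith("." + storageDomain);
}

// Decide whether a script may read or write an item.
// ownerDomain is the domain whose Storage object holds the item.
function canAccessItem(scriptDomain, ownerDomain, isPrivate) {
  if (isPrivate) return scriptDomain === ownerDomain;
  return isSameOrSubdomain(scriptDomain, ownerDomain);
}
```

For the "myTest" example above, canAccessItem("music.example.com",
"music.example.com", true) is allowed, while the same call from
"beta.music.example.com" or "example.com" is refused.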



If suggestions 5 (port numbers) and 6 (private flag) are added into the  
spec, and implemented into UAs, I would appreciate the whole storage  
feature much more. For now, I have doubts with regards to how really  
*usable* it is.



That's about all, for now. Thank you for reading.



[1]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/section-storage.html#storage
[2]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/section-storage.html#the-storageitem
[3]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/section-storage.html#the-globalstorage
[4]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/section-storage.html#nameditem2
[5]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/section-storage.html#disk-space
[6]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/section-storage.html#user-tracking
[7]  
http://www.whatwg.org/specs/web-apps/current-work/multipage/section-storage.html#cross-protocol



-- 
http://www.robodesign.ro

Received on Monday, 17 September 2007 12:05:48 UTC