Re: DataCache API - editor's draft available

Hi Mark,

I am happy to see your feedback on DataCache. Forgive me for the delay  
in responding.

On Jul 17, 2009, at 4:50 PM, Mark Nottingham wrote:

> I think this work is in an interesting space but, unfortunately,  
> it's doing it without reference to the existing HTTP caching model,  
> resulting in a lot of duplicated work, potential conflicts and  
> ambiguities, as well as opportunity cost.

I don't fully understand this; can you please explain? From what I
know, the Gears implementation can be easily extended to support  
DataCache. Of course, one doesn't need all of Gears - only LocalServer  
and browser integration are required. I don't see that as a lot of
duplicated work.

> Furthermore, it's specifying an API as the primary method of  
> controlling caches. While that's understandable if you look at the  
> world as a collection of APIs, it's also quite limiting; it  
> precludes reuse of information, unintended uses, and caching by  
> anything except the browser.

FWIW, DataCache is not the first attempt at obtaining an API to  
control a browser's HTTP cache. That was already the case with  
ApplicationCache in HTML5.
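In fact, the declarative approach is essentially what ApplicationCache
already does: the page points at a manifest listing the URIs the user
agent should pin. A minimal sketch (file names and paths are
illustrative):

```
CACHE MANIFEST
# Resources the user agent pins for off-line use
CACHE:
/index.html
/app.js
/style.css

NETWORK:
# Requests that must always go to the network
/orders/submit

FALLBACK:
# Serve this page when an on-line response is infeasible
/ /offline.html
```

The page opts in with <html manifest="cache.manifest">.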

I don't quite understand what problems you foresee with DataCache's  
approach. It does not ask the implementor to violate any HTTP caching  
semantics. If anything, it suggests that the implementation can offer  
an off-line response should an on-line response be infeasible.
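Concretely, the behavior I have in mind is just a cache fallback. Here
is a sketch of it; the cache is a plain Map standing in for a local
HTTP cache, and none of these names come from the DataCache draft
itself:

```typescript
// Sketch of "serve an off-line response when an on-line response is
// infeasible". The Map stands in for a hypothetical local HTTP cache;
// this is not the DataCache interface.

type CachedResponse = { status: number; body: string };

function fetchWithOfflineFallback(
  uri: string,
  online: boolean,
  cache: Map<string, CachedResponse>,
  networkFetch: (uri: string) => CachedResponse
): CachedResponse {
  if (online) {
    // On-line: go to the network and refresh the local copy.
    const response = networkFetch(uri);
    cache.set(uri, response);
    return response;
  }
  // Off-line: a stale local copy is better than no response at all --
  // exactly the relaxation of semantic transparency RFC 2616 permits.
  const cached = cache.get(uri);
  if (cached !== undefined) {
    return cached;
  }
  return { status: 504, body: "offline and not cached" };
}
```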

This is based on my reading of the following pieces of text from RFC
2616:

From Section 13:
Requirements for performance, availability, and disconnected operation  
require us to be able to relax the goal of semantic transparency.
Protocol features that allow a cache to attach warnings to responses
that do not preserve the requested approximation of semantic
transparency.

From Section 13.1.6:
A client MAY also specify that it will accept stale responses, up to  
some maximum amount of staleness. This loosens the constraints on the  
caches, and so might violate the origin server's specified constraints  
on semantic transparency, but might be necessary to support  
disconnected operation, or high availability in the face of poor
connectivity.

Can you please correct me if I have misinterpreted or misapplied these
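For instance, a client prepared to work against a stale copy can say
so explicitly in the request (the URI and host here are illustrative):

```
GET /customers/recent-orders HTTP/1.1
Host: crm.example.com
Cache-Control: max-stale=86400
```

Here the client declares that a response up to one day stale is
acceptable, which is precisely the relaxation 13.1.6 describes.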
provisions of HTTP? Alternatively, can you point me to a valid  
interpretation of these portions in the context of an open Web?

> A much better solution would be to declaratively define what URIs  
> delineate an application (e.g., in response headers and/or a  
> separate representation), and then allow clients to request an  
> entire application to be fetched before they go offline (for  
> example). I'm aware that there are other use cases and capabilities  
> here, this is just one example.

Am I correct in understanding that you find pre-fetching the entire
application better than pre-fetching parts of it? In any case,
are you also suggesting a data format for specifying a collection of  
such URIs that the user agent should pin down in cache? How does a  
data format form a better solution as opposed to an API?

Additionally, it is not always possible to statically define the  
collection of URIs that are of interest to an application. Let me take  
an example -

*Sales force automation*
My sales reps work in parts of the world where a reliable network
connection is not a safe assumption. Still, I would
like to deploy order entry applications that work reliably in the face  
of poor network connection on a small mobile computer with a Web  
browser. Today I am going on a round of my customers in Fallujah and I  
need to have information about customers in that area, including their  
names, addresses, and order history (and status). This information  
changes regularly and my sales reps benefit from up-to-the-minute  
order history information if I can connect to the server at the time I  
am at the customer's office. If I don't have network access, I at  
least have up-to-date information. Finally, I want to enable the
sales rep to take orders when they are out in the field and provided  
they don't lose the device, I want to assure them that their orders  
will make it to the company's servers. If connectivity is available at  
that instant, then the order will be confirmed immediately and  
processing would begin. If not, it would be kept pending.

Until now, developers have built and deployed such off-line
applications outside the context of the Web architecture - i.e., no
URIs, no uniform methods, etc. They will continue to do the same with  
SQL databases inside Web browsers - still no URIs, a single method -  
POST - and an off-line only solution (meaning it cannot take  
opportunistic advantage of available networks). Is this a more  
desirable approach than to provide an API to a subset of the browser's  
HTTP cache?

> Doubtless there's still a need for some new APIs here, but I think  
> they should be minimal (e.g., about querying the state of the cache,  
> in terms of offline/online, etc.), not re-defining the cache itself.

Can you elaborate a little more? What do you mean by re-defining the  
cache? Can you provide specific reasons why the DataCache API seems  
like redefining the cache?

> FWIW, I'd be very interested in helping develop protocols and APIs  
> along what's outlined above.

Sorry, but I didn't see any outline. Maybe I missed something; I would
appreciate it if you could provide one.

In any case, I welcome you to offer your counsel on better addressing  
the requirements of DataCache that I have previously stated [1]. It  
would be best if these requirements can be addressed through the  
correct use of HTTP as opposed to API magic.

> Cheers,
> P.S. This draft alludes to automatic prefetching without user  
> intervention. However, there is a long history of experimentation  
> with pre-fetching on the Web, and the general consensus is that it's  
> of doubtful utility at best, and dangerous at worst (particularly in  
> bandwidth-limited deployments, where bandwidth is charged for, as  
> well as when servers are taken down because of storms of prefetch  
> requests).

There is now also a fairly large amount of experience with prefetching  
outside of the regular HTTP ambit. Siebel CRM (one of the most popular
enterprise non-productivity off-line applications), MySpace, and GMail
all pre-fetch thousands of pieces of data, if not more, and store them
locally. Have you considered this experience as relevant?

I may not be wrong in saying that the general observation you are
making is not relevant in DataCache's case.

Even granting that pre-fetching is not a good idea in the general
case, why kill the messenger? Let programmers make the right choice
for their applications and learn from their own experience. IMHO, not
doing DataCache-like things turns people away from using (and I mean
not abusing) the Web, toward more brittle, less widely deployable, and
far more laboriously crafted architectures.


Received on Monday, 20 July 2009 16:24:49 UTC