[whatwg] HTML5 Offline Web Applications

I think Michael has some valid concerns here.  Specifically, where he says:

- "Where does appCache deletion happen?"

and

- "I think the appCache update/validation logic is fundamentally flawed
   with regard to resources that are not explicitly listed."
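
To make the second concern concrete, here is a rough sketch of the alternative Michael outlines below for entries that are not explicitly listed in the manifest. Everything in it is hypothetical: the `Outcome` values, the `UpdateResult` shape, and the function name are illustrative and appear in neither the spec nor Gears. Michael leaves open whether entries that hit a network or server error should be removed or retained; this sketch assumes they are retained and reported.

```typescript
// Hypothetical result of revalidating one non-explicit entry at update time.
type Outcome = "fresh" | "updated" | "not-found" | "network-error";

interface UpdateResult {
  kept: string[];           // URLs still cached after the update
  removed: string[];        // URLs dropped because the server answered 404
  failedToUpdate: string[]; // URLs retained despite a network/server error
}

// Apply the proposed rules: a 404 removes the entry, and an error on one
// entry never fails the update as a whole.
function updateNonExplicitEntries(outcomes: Record<string, Outcome>): UpdateResult {
  const result: UpdateResult = { kept: [], removed: [], failedToUpdate: [] };
  for (const url of Object.keys(outcomes)) {
    const outcome = outcomes[url];
    if (outcome === "not-found") {
      result.removed.push(url);
      continue;
    }
    result.kept.push(url);
    if (outcome === "network-error") {
      // Assumption: keep the stale copy and expose it to script.
      result.failedToUpdate.push(url);
    }
  }
  return result;
}
```

With these rules, a resource that disappears from the app's URL space is dropped at the next update instead of pinning the whole cache to the old version.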

Is anybody else working on the Offline Web Applications feature?

--Chris


On Fri, Aug 29, 2008 at 4:36 PM, Michael Nordman <michaeln at google.com> wrote:
> Hello again all,
>
> A couple more comments.
>
> When is anything ever deleted?
>
> Maybe I missed it, but where does appCache deletion happen?
>
> Something that Gears users have done is to serve an empty manifest file.
> The results are a close approximation to having deleted the resource store.
> I would vote to have some syntax for expressing 'delete me'  in the manifest
> file for an appCache. A new type of event may be warranted for completion of
> such an update, and when swapCache() is called there would no longer be an
> appCache associated with the context.
>
> Should we revisit the caching semantics for any resource not explicitly
> listed in the manifest?
>
> Unless I missed something, I think the appCache update/validation logic is
> fundamentally flawed with regard to resources that are not explicitly
> listed. As presently spec'd, a failure to update/validate any of these
> resources causes the entire update to fail, and the old version will
> remain pinned in the cache. Now suppose the app changes its url space such
> that some of the resources that got picked up by one of the mechanisms to
> add new resources (autocaching namespace or manually .add()ed or <html
> manifest=x>) no longer make sense... I think this means the appCache is
> stuck in time.
>
> One idea is to rephrase this feature in terms closer to std http caching for
> all entries that do not explicitly appear in the manifest file. In
> effect, closer to telling the http cache to not purge the resource.
>
> * at initial cache time
>   - cache the resource
>
> * at appCache update time
>   - validate all non-explicit entries per usual http caching semantics
>      (so 404s  will remove these entries at update time)
>   - network/server errors do not fail the larger update
>   - beyond that, not sure what to do on network/server errors... remove or
> retain the resources?
>   - perhaps maintain a list of 'failed to update' items that the webapp can
> access via script
>
> * at resource load time
>   - validate per usual http caching rules going forward
>     (so 404s will remove these entries)
>   - with the following exceptions
>      - use the cached resource as a fallback for network or server(5xx)
> errors
>      - do not purge the resource upon expiration
>
> Comments?
>
>
> On Mon, Aug 25, 2008 at 11:54 AM, Michael Nordman <michaeln at google.com>
> wrote:
>>
>> Hello all,
>>
>> I have many comments on the Offline Web Applications corner of the HTML5
>> spec. This is the first round of comments you'll see coming from me. This
>> one is mostly top-level comments.
>>
>> 5.7.2 Application caches
>>
>> I found the terminology used to describe the contents of the
>> cache sometimes contradictory and confusing, and it doesn't correspond
>> directly with the terminology used in the manifest file syntax. FWIW, some
>> wordsmithing and reconciling the differences could add clarity to the spec.
>>
>> cached resource categories
>>
>> * implicit category
>> This categorization applies to html docs which explicitly contain a
>> reference to the manifest file via the 'manifest' attribute of their <html>
>> tag. I understand they are not necessarily explicitly listed in the manifest
>> file, but they may also be explicitly listed. The end result is that a
>> resource can be categorized as both 'implicit' and 'explicit'. This is
>> confusing. I'd vote to have a different name for clarity's sake... some
>> ideas... 'toplevel', 'manifest referencing', 'native' (an awkward play on
>> foreign).
>>
>> * manifest category
>> Perfect.
>>
>> * explicit category
>> Ok provided 'implicit' is renamed.
>>
>> * fallback category
>> The term 'fallback' refers to the prescribed use of these resources for
>> the opportunistic-caching namespace in particular. As part of pulling apart
>> namespaces vs how to handle hits within a namespace, I'd vote to change the
>> name for this category... some ideas... 'namespace-handler'.  I'll have
>> more to say about different types of 'namespaces' below.
>>
>> * opportunistically cached category
>> A mouthful, but ok. Another possibility is 'auto-cached' which would work
>> well with the 'manually-cached' terminology below.
>>
>> * dynamic category
>> I'd like to reserve the term 'dynamic' for a different use of that term
>> (more on that in a moment).  Some name possibilities for this category...
>> 'manually-cached' or 'script-added' or 'programmatically-added'.
>>
>>  flavors of namespaces
>>
>> * online whitelist
>> As mentioned in previous messages, this would need to be some form of
>> namespacing or filtering to be useful. A better term for this might be
>> 'bypass' since with respect to the appcache, hits here bypass the cache. It's
>> not clear if path prefix matching is the best option for filtering out
>> requests that should bypass the cache. In working with app developers using
>> Gears, the idea of specifying a particular query argument to filter on in
>> addition to a path prefix has come up. http://server/pathprefix   +
>> &bypassAppCache
>>
>> * opportunistic caching namespaces
>> A mouthful but ok. Whatever terminology used for the category of resulting
>> entries should be used here... perhaps 'auto-caching namespace'.
>>
>> * fallback namespace [factored out of opportunistic-caching]
>> This form of namespace is addressed by the spec at present, but is
>> co-mingled with the auto-caching feature. This is a proposal to detangle
>> them from one another. The basic idea is to load the resource as usual, and
>> only upon failure fallback to a cached 'namespace-handler'... no
>> auto-caching involved.
>>
>> * intercept namespaces [new]
>> This form of namespace is not in the spec at present. This is a proposal
>> to add it. It is a heavily used feature of the Gears LocalServer. The basic
>> idea is to intercept requests into this namespace and satisfy them with a
>> cached 'namespace-handler'  without consulting the server.
>>
>> summary of the above change requests
>>
>> Cached resource categories (just name changes):
>> * toplevel - pages which <html manifest='manifesturlforthisappcache'>
>> * manifest - the manifest file
>> * explicit - explicitly listed in the manifest file
>> * namespace-handler - resource which is utilized by a name-space
>> * auto-cached - resources that have been cached via the auto-cache
>> namespace
>> * manually-cached - resources that have been cached via a javascript call
>> to appCache.add()
>>
>> Namespaces (name changes, refactored things a bit, and introduced the
>> 'intercept' namespace)
>> * bypass - bypasses further lookup within the appcache and resorts to the
>> usual resource loading
>> * intercept - doesn't hit server, serves a cached namespace-handler
>> resource
>> * autocache - hits server, caches successful response for future use, on
>> server errors serves a cached namespace-handler resource
>> * fallback - hits server, does NOT cache successful responses, on server
>> errors serves a cached namespace-handler resource
>>
>> Manifest file section headers:
>> * BYPASS: list of url [namespaces/filters]
>> * CACHE: list of exact [urls]
>> * INTERCEPT: list of [urlnamespaces, namespace-handler url]
>> * AUTOCACHE: list of [urlnamespaces, namespace-handler url]
>> * FALLBACK: list of [urlnamespaces, namespace-handler url]
>>
>> Scriptlets - or dynamic namespace-handlers [new idea]
>>
>> Something we wrestled with in the process of putting together the Gears
>> LocalServer was the distinction between intercepting requests for urls and
>> identifying the appropriate cached resource for that request. We ended up
>> with a declarative manifest file, similar to but different from what is
>> contained in this spec. This wasn't an altogether satisfying answer. The
>> expressiveness of the language to match/filter requested urls is limited in
>> Gears and this spec shares that same characterization.
>>
>> Something else we've wrestled with in Gears was having to do awkward
>> redesigns in corners of a web application in order to 'take it offline',
>> single-sign-on for example. In general, anywhere an application relies on
>> HTTP features more than HTML to influence navigation or conditional resource
>> loading, it's difficult to address with a static cache.
>>
>> So I'd like to propose extending this spec to incorporate 'dynamically
>> generated responses'. I think this capability fits into this corner of the
>> HTML5 spec because this is most directly useful in the "Offline Web
>> Application" scenario. The basic idea is to execute application code
>> (script) to produce responses to intercepted resource loads. The application
>> code is executed in the background and can formulate a response
>> asynchronously.
>>
>> Some handwaving where this could hang off of this spec
>> * Modify namespace-handler entries to have an additional attribute to
>> indicate that they are to be executed rather than returned
>>
>> And some handwaving at what a scriptlet can do...
>> * Can read the request headers and POST body
>> * Can set response status code and headers (redirects)
>> * Can generate a textual response body
>> * Can designate a non-executable cached resource to be returned in
>> response
>> * Can decide to 'bypass' handling of a request and defer to the usual
>> resource loading
>> * Can decide to perform the usual resource loading, but to have the
>> response added to the appCache
>> * Can access HTML5Database APIs
>> * Can utilize XMLHttpRequest to communicate with a server
>>
>> This would obviously be a significant addition to the spec, but I do think
>> this is worth consideration in the context of 'offline applications'. Based
>> on observations of app developers wrestling with Gears, there have been
>> several pain points. The HTML5ApplicationCache addresses one of them
>> with per-application caches. This addition would address the second of
>> them.  (Another pain point has been application deployment).
>>
>> Am interested in seeing what others think of an addition along these
>> lines.
>
>
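
For readers trying to picture the four namespace flavors from Michael's first message (bypass, intercept, autocache, fallback), the following is a minimal sketch of how a request might be routed under that proposal. It is not from the spec or from Gears: the `Namespace` shape, the `serve` function, and the synchronous `fetchFromServer` callback are hypothetical, and first-match prefix lookup is an assumption. One further assumption is marked in the code: when the server fails inside an autocache namespace, a previously auto-cached copy of the URL is preferred over the namespace-handler.

```typescript
type Kind = "bypass" | "intercept" | "autocache" | "fallback";

interface Namespace {
  kind: Kind;
  prefix: string;   // URL prefix that selects this namespace
  handler?: string; // URL of the cached namespace-handler (unused for bypass)
}

interface ServerResponse { ok: boolean; body: string; }

// Route one request; returns the body to serve, or undefined if nothing applies.
function serve(
  url: string,
  namespaces: Namespace[],
  cache: Record<string, string>,
  fetchFromServer: (url: string) => ServerResponse,
): string | undefined {
  let ns: Namespace | undefined;
  for (const candidate of namespaces) {
    if (url.indexOf(candidate.prefix) === 0) { ns = candidate; break; }
  }
  // bypass (or no namespace at all): the usual resource loading.
  if (ns === undefined || ns.kind === "bypass") return fetchFromServer(url).body;

  const handlerBody = ns.handler !== undefined ? cache[ns.handler] : undefined;
  // intercept: never hits the server.
  if (ns.kind === "intercept") return handlerBody;

  // autocache and fallback both hit the server first.
  const response = fetchFromServer(url);
  if (response.ok) {
    if (ns.kind === "autocache") cache[url] = response.body; // fallback does NOT cache
    return response.body;
  }
  // Assumption: an earlier auto-cached copy wins over the handler.
  if (ns.kind === "autocache" && cache[url] !== undefined) return cache[url];
  return handlerBody;
}
```

The only difference between the autocache and fallback branches is the single write to `cache`, which is the detangling the proposal asks for.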

Received on Friday, 5 September 2008 13:46:46 UTC