Re: [Web Workers API] Data synchronization from Jonas Sicking on 2009-01-20 (public-webapps@w3.org from January to March 2009)

From: Jonas Sicking <jonas@sicking.cc>
Date: Tue, 20 Jan 2009 10:36:34 -0800
To: "Nikunj Mehta" <nikunj.mehta@oracle.com>
Cc: public-webapps@w3.org
Message-ID: <63df84f0901201036mddebd4fk3f9d6b015b3f10d8@mail.gmail.com>
On Tue, Jan 20, 2009 at 9:57 AM, Nikunj Mehta <nikunj.mehta@oracle.com> wrote:
>
> On Jan 16, 2009, at 6:10 PM, Jonas Sicking wrote:
>
>> On Fri, Jan 16, 2009 at 5:17 PM, Nikunj Mehta <nikunj.mehta@oracle.com>
>> wrote:
>>>
>>> I have reviewed the draft specification dated 1/14 [1]. I am not sure
>>> about
>>> the status of this spec vis-a-vis this WG. Still, and without having
>>> reviewed any mailing list archives about prior discussion on this draft,
>>> here are some questions around the scope of this spec:
>>>
>>> 1. Are background workers executing outside the current browsing context
>>> completely out of consideration? As an implementor of sync engines and
>>> developer of applications that use them, Oracle's experience shows that
>>> trickle sync is the most usable approach and that in trickle sync an
>>> application doesn't need to be active for data to be moved back and
>>> forth.
>>
>> All workers execute outside the context of a current browsing context.
>> However, the lifetime of a dedicated worker is tied to the lifetime of
>> a browsing context, whereas shared workers can persist across
>> contexts.
>>
>> Extending the lifetime too far beyond the lifetime of a browsing
>> context has usability, and possibly security, issues though.
>
> Let's be specific here. What are the kinds of threats introduced that do
> not already exist with these background workers? If a foreground script
> takes too long, browsers pop up a dialog asking if the user wants to
> terminate the script. Why can the same not be said about workers whose
> lifetime is not tied to any browsing context, per se?

Any time we need to open a dialog, it is a problem if we can't also
switch to the page that that dialog is related to. For example when
fetching a network resource, we may need to open a dialog asking for a
password. When that happens we switch to show the window and tab of
the page making the request. This is important to prevent users from
thinking that they are entering the password for a bank page, while
they are in fact sending it to a background tab.

While we do also show the url that the password will be sent to, it
has been shown that users often don't read the details of a dialog
box, and that showing the appropriate tab is more effective.

>> As a
>> browser developer I'm not really comfortable with allowing a site to
>> use up too much resources after a user has navigated away from a site.
>
> How does the current design of workers protect available resources against
> malfeasance or unfair use? In fact, if anything, using the current
> WebWorkers draft, naïve design can easily rob users of the ability to
> control network usage and remove the ability of a user to terminate a worker
> when so required, if an application does not provide suitable means for
> doing so.

In Firefox, only in the sense that if a user notices that a lot of
resources are being used, the user can attempt to close the page that
he/she suspects is using them.

>>> 2. Long running scripts pose a problem especially when script containers
>>> leak memory over time.  Is it giving too much freedom to workers to run
>>> as
>>> long as they wish and use as many network/memory resources as they wish?
>>
>> By "script containers", do you mean script engines?
>>
>
> Correct
>
>> If so, long running scripts are no different from scripts that run
>> short but often as is the alternative in browsing contexts. We can run
>> garbage collection in the middle of a running script.
>
> Then why does my browser's memory usage keep increasing when I keep pages
> open for a number of days, especially if those pages have a fair amount of
> JavaScript? Or maybe Firefox has resolved such issues in recent releases.

Because of bugs in the browser, and bugs in the page. However you'll
run into the exact same bugs whether you run one script for a long
time (which is what you can do on a worker) or have to return from the
script and resume on a timer (which is what people do in the main
browsing context).
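
To illustrate the two styles being compared, here is a minimal sketch,
not taken from the Web Workers draft. The workload (summing integers)
and the names sumInOneRun, sumInChunks, schedule and done are all
hypothetical. The first function runs to completion in one go, as a
worker may; the second does the same work in slices and asks a
scheduler to resume it, as main-context scripts do with a timer.

```typescript
// Style 1: one long-running script, as is possible on a worker.
// Hypothetical workload: sum the integers 0..n-1.
function sumInOneRun(n: number): number {
  let total = 0;
  for (let i = 0; i < n; i++) {
    total += i;
  }
  return total;
}

// Style 2: do a slice of work, return, and resume later.
// `schedule` is an assumed hook; in a page it could be
// (resume) => { setTimeout(resume, 0); }.
function sumInChunks(
  n: number,
  chunkSize: number,
  schedule: (resume: () => void) => void,
  done: (total: number) => void
): void {
  let total = 0;
  let i = 0;
  function step(): void {
    const end = Math.min(i + chunkSize, n);
    for (; i < end; i++) {
      total += i;
    }
    if (i < n) {
      schedule(step); // yield control, continue on the next tick
    } else {
      done(total);
    }
  }
  step();
}
```

Both functions execute the same loop body the same number of times;
only the points at which control returns to the browser differ, so a
leak in the engine or in the script would presumably show up in either
style.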

>>> 3. On devices which do not like background processes making continuous
>>> use
>>> of CPU/network resources (such as iPhone and BlackBerry), how can one
>>> take
>>> advantage of native notification services to provide up-to-date
>>> information
>>> at a low enough resource cost?
>>
>> This is actually a pretty interesting question.
>
> I see a fundamental shortcoming in the WebWorkers spec because it seems to
> wish away some of the problems of efficient synchronization simply by
> providing a background execution model. While having multiple distinct "use
> cases" for WebWorkers seems like a good thing, IMHO, the current spec will
> not support industrial strength synchronization for Web applications on
> mobile devices, which should be an explicit goal of this spec.

As I said before, this is a function of the APIs we expose to the
workers, not a function of the worker API itself. Nothing prevents
anyone from coming up with a synchronization API and exposing it to
main browsing contexts or workers. And nothing that I can see that
we've done has made it harder for anyone to do so. If you do see
anything we are doing that would make it harder to in parallel develop
a synchronization API, please do let us know.


>>> 4. Why is the spec biased towards those implementors who would like to
>>> persist synchronization results and application data in the
>>> structured/local
>>> storage only? Why not consider needs of those who would prefer to keep
>>> their
>>> data accessible directly through HTTP/S, even in the disconnected case?
>>
>> Because such APIs already exist. As soon as there is a spec for
>> synchronization with a server I see no reason not to expose that to
>> workers. Indeed, the people coming up with a server sync API could
>> define that the API is available to workers if they so desire.
>
> I am afraid you may have misunderstood me here. My point is that it is being
> assumed that applications that wish to hoard their data for offline use want
> to do so only through the localStorage/database mechanisms being introduced
> in HTML5. I have been a proponent of an architecture wherein applications
> use the same API for accessing data regardless of whether I am connected or
> not. This allows gradual migration of users and applications over time.
>
> For this architecture to work, Worker objects need access to the application
> cache, and the application cache needs to offer the ability to add, remove,
> or change the bits of the representation of a resource.

Is there anything done in the current worker spec to make this harder?
I feel like the current spec is completely orthogonal to the
synchronization architecture you are talking about. The same way that
the CSS or XMLHttpRequest specs are.

/ Jonas
Received on Tuesday, 20 January 2009 18:37:11 UTC