Re: [Web Workers API] Data synchronization

On Jan 16, 2009, at 6:10 PM, Jonas Sicking wrote:

> On Fri, Jan 16, 2009 at 5:17 PM, Nikunj Mehta  
> <nikunj.mehta@oracle.com> wrote:
>>
>> I have reviewed the draft specification dated 1/14 [1]. I am not sure
>> about the status of this spec vis-a-vis this WG. Still, and without
>> having reviewed any mailing list archives about prior discussion on
>> this draft, here are some questions around the scope of this spec:
>>
>> 1. Are background workers executing outside the current browsing
>> context completely out of consideration? As an implementor of sync
>> engines and developer of applications that use them, Oracle's
>> experience shows that trickle sync is the most usable approach and
>> that in trickle sync an application doesn't need to be active for data
>> to be moved back and forth.
>
> All workers execute outside the context of a current browsing context.
> The lifetime of a dedicated worker is tied to the lifetime of a
> browsing context, whereas shared workers can persist across contexts.
>
> Extending the lifetime too far beyond the lifetime of a browsing
> context has usability, and possibly security, issues though.

Let's be specific here. What kinds of threats are introduced that do
not already exist with these background workers? If a foreground
script takes too long, browsers pop up a dialog asking if the user
wants to terminate the script. Why can the same not be said about
workers whose lifetime is not tied to any browsing context, per se?

> As a
> browser developer I'm not really comfortable with allowing a site to
> use up too much resources after a user has navigated away from a site.

How does the current design of workers protect available resources
against malfeasance or unfair use? In fact, if anything, under the
current WebWorkers draft a naïve design can easily rob users of the
ability to control network usage and to terminate a worker when
required, if the application does not provide suitable means for
doing so.

>
>
>> 2. Long running scripts pose a problem especially when script
>> containers leak memory over time. Is it giving too much freedom to
>> workers to run as long as they wish and use as many network/memory
>> resources as they wish?
>
> By "script containers", do you mean script engines?
>

Correct

> If so, long running scripts are no different from scripts that run
> short but often as is the alternative in browsing contexts. We can run
> garbage collection in the middle of a running script.

Then why does my browser's memory usage keep increasing when I keep
pages open for a number of days, especially if those pages have a fair
amount of JavaScript? Or maybe Firefox has resolved such issues in
recent releases.

>> 3. On devices which do not like background processes making continuous
>> use of CPU/network resources (such as iPhone and BlackBerry), how can
>> one take advantage of native notification services to provide
>> up-to-date information at a low enough resource cost?
>
> This is actually a pretty interesting question.

I see a fundamental shortcoming in the WebWorkers spec because it  
seems to wish away some of the problems of efficient synchronization  
simply by providing a background execution model. While having  
multiple distinct "use cases" for WebWorkers seems like a good thing,  
IMHO, the current spec will not support industrial strength  
synchronization for Web applications on mobile devices, which should  
be an explicit goal of this spec.
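
To make the trickle-sync idea concrete, here is a minimal sketch,
assuming the device wakes the application only when a native
notification arrives. Every name in it (TrickleSync, Change, onWake,
batchSize) is hypothetical and does not appear in the WebWorkers draft.
It shows only the core constraint: each wake-up moves a small, bounded
batch and then the code goes idle again, rather than running
continuously in the background.

```typescript
// Hypothetical sketch; none of these names come from the WebWorkers draft.
type Change = { id: number; payload: string };

class TrickleSync {
  private pending: Change[] = [];
  private send: (batch: Change[]) => void;
  private batchSize: number;

  constructor(send: (batch: Change[]) => void, batchSize: number) {
    this.send = send;
    this.batchSize = batchSize;
  }

  // Record a local change; nothing is transmitted yet.
  enqueue(change: Change): void {
    this.pending.push(change);
  }

  // Assumed to be called once per wake-up, e.g. on a native notification.
  // Sends at most batchSize changes and returns how many are still waiting.
  onWake(): number {
    const batch = this.pending.splice(0, this.batchSize);
    if (batch.length > 0) {
      this.send(batch);
    }
    return this.pending.length;
  }
}
```

The batch size is the knob that trades freshness against CPU and
network cost per wake-up; how the wake-up itself is delivered is
exactly the part the current draft leaves open.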

>
>
> It's really more a property of which APIs we expose to workers, rather
> than the worker API itself I'd say. We need someone to define an API
> that allows native notification services to be the transport layer,
> and then we can expose that API to workers.
>
> What's interesting is that the HTML5 spec actually makes an attempt at
> defining such an API. The problem is that it uses an <eventsource>
> element to do it, which means that we can't use the API directly
> inside workers.
>
> Hixie: Should we consider making <eventsource> a pure JS API so that
> it can be reused for workers?
>
>> 4. Why is the spec biased towards those implementors who would like to
>> persist synchronization results and application data in the
>> structured/local storage only? Why not consider needs of those who
>> would prefer to keep their data accessible directly through HTTP/S,
>> even in the disconnected case?
>
> Because such APIs already exist. As soon as there is a spec for
> synchronization with a server I see no reason not to expose that to
> workers. Indeed, the people coming up with a server sync API could
> define that the API is available to workers if they so desire.

I am afraid you may have misunderstood me here. My point is that it is
being assumed that applications that wish to hoard their data for
offline use want to do so only through the localStorage/database
mechanisms being introduced in HTML5. I have been a proponent of an
architecture wherein applications use the same API for accessing data
regardless of whether they are connected or not. This allows gradual
migration of users and applications over time.

For this architecture to work, Worker objects need access to the  
application cache, and the application cache needs to offer the  
ability to add, remove, or change the bits of the representation of a  
resource.
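
A rough sketch of the shape of API this implies, under stated
assumptions: ResourceCache, Representation, and Fetcher are names made
up for illustration, the "network" is a plain function, and nothing
here is the application cache API in the HTML5 draft. The point is that
get() is the same call online and offline, while put() and remove() are
the operations a worker would need in order to change stored
representations.

```typescript
// Hypothetical sketch; not the HTML5 application cache API.
type Representation = { body: string; etag: string };

// Stand-in for an HTTP GET; returns undefined if the server has nothing.
type Fetcher = (url: string) => Representation | undefined;

class ResourceCache {
  online: boolean;
  private entries = new Map<string, Representation>();
  private fetcher: Fetcher;

  constructor(fetcher: Fetcher, online: boolean) {
    this.fetcher = fetcher;
    this.online = online;
  }

  // Same call whether connected or not. When online, a representation
  // returned by the fetcher replaces the stored copy; otherwise (offline,
  // or the fetcher had nothing) the stored copy is served.
  get(url: string): Representation | undefined {
    if (this.online) {
      const fresh = this.fetcher(url);
      if (fresh !== undefined) {
        this.entries.set(url, fresh);
        return fresh;
      }
    }
    return this.entries.get(url);
  }

  // Add or change the stored representation of a resource.
  put(url: string, rep: Representation): void {
    this.entries.set(url, rep);
  }

  // Remove a stored representation; returns true if one was present.
  remove(url: string): boolean {
    return this.entries.delete(url);
  }
}
```

A sync worker would call put() and remove() as changes trickle in, and
the page would keep calling get() without caring about connectivity.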

> Note that the WebSockets API is exposed to workers, which can be used
> to implement server synchronization.

That merely happens to be a backchannel for communication from the  
server and is unrelated to my question above.

Received on Tuesday, 20 January 2009 17:59:11 UTC