[whatwg] Workers feedback from Alexey Proskuryakov on 2008-11-18 (public-whatwg-archive@w3.org from November 2008)

From: Alexey Proskuryakov <ap@webkit.org>
Date: Tue, 18 Nov 2008 20:12:01 +0300
Message-ID: <C548D211.61F07%ap@webkit.org>
on 18.11.2008 06:43, Ian Hickson at ian at hixie.ch wrote:

>> I'd be more than happy with a separate interface if the objects actually
>> behaved differently. One example of a good reason to have separate
>> interfaces was recently proposed here: shared workers should outlive
>> their creators. This is the sort of difference that would make having a
>> separate API reasonable, in my opinion.
> 
> You don't think that the way that a handle to a shared worker can be
> obtained dynamically without contact with the original creator is enough
> of a difference?

I think that this is a difference in the behavior of a single function
(namely, the constructor). One constructor can create named workers, and another can create
unnamed ("null-named") workers, which doesn't mean that they need to create
different kinds of objects.

But I've already said it before, so this is not new feedback.

> The complication here seems to be in the way you are implementing this.
> Port entangling should be atomic across threads -- when you are sending a
> port over another channel, you should block both threads, create the new
> object, 

Sorry if this looks like I'm just trying to be difficult, but you already
have a chance to deadlock here. If the blocked thread was inside malloc(),
then attempting to allocate memory in the main thread will freeze the
application.

This is very much an implementation concern, and in this particular case, it
is easily resolvable (you could allocate memory before locking).
Unfortunately, implementation bugs like this are notoriously hard to find
with testing, as they may be triggered by very specific usage scenarios. So,
even having a working implementation doesn't really mean that a spec written
in this manner is implementable, paradoxically.
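
To illustrate the "allocate before locking" ordering mentioned above, here is
a minimal sketch in Python. The PortTable class, its fixed capacity and its
method names are hypothetical and do not come from the spec or from any
implementation. Python manages memory differently from C, so this only shows
the ordering of the steps, not a real malloc() deadlock.

```python
import threading


class PortTable:
    """Hypothetical fixed-size table of entangled-port records."""

    def __init__(self, capacity):
        self._lock = threading.Lock()
        # Slots are created up front so the table never grows under the lock.
        self._slots = [None] * capacity

    def entangle(self, slot, peer_id):
        # Step 1: build the new record while no lock is held.
        record = {"peer": peer_id, "pending": []}
        # Step 2: the critical section only stores an existing reference.
        with self._lock:
            self._slots[slot] = record
        return record

    def peer_of(self, slot):
        with self._lock:
            return self._slots[slot]["peer"]
```

The point of the ordering is that nothing inside the critical section asks
the allocator for a new object, which is the step that could block forever
if another thread had been stopped while holding the allocator's own lock.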

> update the information,

I'm not sure what you mean here - certainly not hunting down all references
to the old entangled port that may be anywhere, or fixing all results of
calculations that involved its address? Yet, this is necessary if you are
blocking threads at arbitrary moments.

Again, an implementation concern, but the spec as it is talks about
algorithms, and not observable effects, and it is not clear to me what the
observable effects should be in cases where synchronous communication is
specced.

> shunt all pending messages over, and then
> resume the threads. If you implement the actual IPC using, say, a Unix
> socket, then you can just pass the actual socket along and do the same
> thing without blocking.

This is an interesting point. I do not know enough about how Unix domain
sockets are passed around, but since the laws of nature are the same
for them, it's either that:
- my FUD is unfounded, and it is in fact possible to implement the behavior;
- or semantics are very different for sockets. Some guesses are that queues
may be strictly limited in size, message delivery may not be guaranteed, or
that it is possible for client code to irreparably deadlock processes with
them - something that JS developers obviously shouldn't be able to do.

I do not know which of the options is correct, but if the spec talked in
terms of message passing, it would have been more easily verifiable.

>>> For example, any method that entangles two ports blocks until both
>>> threads are synchronised and entangled.
>> 
>> This will cause deadlocks - if portB' is sent to the first thread as
>> portB'' in the above scheme, the lock will not let synchronization ever
>> finish.
> 
> Could you elaborate on this? I'm not sure I follow what you mean. If you
> mean that two ports in two threads are posted to each other's threads at
> the same time, 

Yes, this is what I'm talking about.

> then deadlock would only happen in a naive implementation
> that didn't keep pumping its message queue while waiting for a response
> from the other thread. Instead what you would want to do is to ask for a
> semaphore to communicate with the other thread, and if you don't get it,
> see if anyone wants to communicate with you, and if they do, let them do
> whatever they want, and then try again.

Designs like this are quite prone to all sorts of crazy problems, too. As a
simple example, the port waiting to be entangled may be sent further on, if
you let them "do whatever they want".
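
The hazard can be sketched in a few lines of Python. Everything here is
hypothetical (entangle_with_pumping, the inbox of callables, the "owner"
field), and the other thread is simulated in a single thread to keep the
example deterministic. It follows the quoted scheme: try to take the channel
without blocking, and if that fails, service pending requests and try again.

```python
import threading
from collections import deque


def entangle_with_pumping(channel_lock, inbox, do_entangle, max_attempts=1000):
    """Try to take the channel; while it is busy, run our pending requests."""
    for _ in range(max_attempts):
        if channel_lock.acquire(blocking=False):
            try:
                return do_entangle()
            finally:
                channel_lock.release()
        # Channel is busy: let others "do whatever they want", then retry.
        while inbox:
            request = inbox.popleft()
            request()
    raise TimeoutError("could not acquire the channel")


port = {"owner": "thread A"}
lock = threading.Lock()
lock.acquire()  # simulate the other thread currently holding the channel


def forwarded_elsewhere():
    port["owner"] = "thread C"  # a pumped request sends the port further on
    lock.release()              # and the other side then finishes its work


inbox = deque([forwarded_elsewhere])
result = entangle_with_pumping(lock, inbox, lambda: port["owner"])
```

Here result ends up as "thread C": no deadlock occurred, but by the time the
entangling step finally runs, the port it was supposed to act on has already
moved somewhere else.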

> I'm certainly open to changing the algorithms around if a better solution
> exists in a manner that gets the same behavior. I'm certainly no expert on
> the topic (as I'm sure the above responses have shown).

Since the spec is written in the form of algorithms, and relies on a number of
arguable implicit assumptions about the implementation of their steps, it is
hard to process or verify the algorithms. In my opinion (I'm not claiming
expertise either!), a message passing design would be much clearer.
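
As a rough sketch of what a message passing formulation could look like,
consider the following Python example. The message kinds ("adopt_port",
"post", "stop") and the worker function are invented for illustration and
are not taken from the spec. Each thread owns its ports exclusively, and a
port changes hands only by arriving as a message, together with its pending
messages.

```python
import queue
import threading


def worker(inbox, log):
    """Owns its ports exclusively; reacts only to messages in its inbox."""
    ports = {}
    while True:
        kind, payload = inbox.get()
        if kind == "adopt_port":
            port_id, pending = payload
            # Pending messages travel with the port, in their original order.
            ports[port_id] = list(pending)
        elif kind == "post":
            port_id, message = payload
            ports[port_id].append(message)
        elif kind == "stop":
            log.update(ports)
            return


inbox = queue.Queue()
log = {}
t = threading.Thread(target=worker, args=(inbox, log))
t.start()
inbox.put(("adopt_port", ("p1", ["queued before transfer"])))
inbox.put(("post", ("p1", "sent after transfer")))
inbox.put(("stop", None))
t.join()
```

The observable effect is just the order of messages seen on port "p1", and
that can be checked without any assumptions about locks or about when the
threads are blocked.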

There are lots of discussions about designing multi-threaded algorithms on
the net; one I liked quite a bit recently is
<http://codemines.blogspot.com/2006/09/another-thread-on-threads.html> - it
presents the do's and don'ts very well.

- WBR, Alexey Proskuryakov.
Received on Tuesday, 18 November 2008 09:12:01 UTC