[Twisted-Python] Synchronization techniques
Brian Granger
ellisonbg.net at gmail.com
Wed Apr 4 23:54:04 MDT 2007
>
> > >So anyway, I rewrote my server-side library to do it the twisted way and
> > >return Deferreds instead of trying to rig up some way of waiting for them. I
> > >still think it would be super useful to be able to pseudo-block on a
> > >deferred (i.e. allow the reactor to process other events while waiting for
> > >the deferred). It is very annoying to have to rewrite many layers of code
> > >when twisted is introduced into a program. I did find gthreadless.py, and
> > >maybe that would do it. Unfortunately discussion on that seems to have been
> > >dropped some time ago...
> >
> > I'm afraid that the feature you want doesn't make any sense and is,
> > in a broad sense, impossible.
>
> Maybe it's impossible for you to see things the way I see them
> because you have become drunk on Twisted Kool-Aid. In my specific
> case I am running twisted in a single-threaded environment with a
> single synchronized resource where each request that needs to access
> that resource must gain an exclusive lock before doing anything with
> it (a classic locking scenario). This is not "I'm being lazy and I do
> not want to learn how to use Deferreds." Rather, it is a requirement
> that is dictated by the system with which I am communicating (it does
> not support concurrent access through the API provided by the
> vendor).
We have a very similar situation in IPython. Our twisted server
manages access to a bunch of other processes (talking over PB), none
of which supports concurrent access.
> Thus, my code would be much simpler (both to write and
> maintain) if I had blockOn(), and it would not have any risk of dead
> lock or other such concurrency bugs.
I do disagree with this. In our case, we simply use a FIFO queue
based on Deferreds to manage multiple requests to a single resource
that does not support concurrent access. It is very simple and
explicit. Even if you had blockOn() you would still need a queue to
manage the multiple requests, right? I don't see at all why it would
be simpler if blockOn existed.
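A minimal sketch of the kind of FIFO queue described above, not the IPython code itself. To keep it runnable without Twisted, `concurrent.futures.Future` stands in for a Deferred, and `SerializedResource`, `submit`, and `make_request` are hypothetical names. For real Deferreds, Twisted's `defer.DeferredLock` offers comparable first-come, first-served serialization.

```python
from collections import deque
from concurrent.futures import Future  # stand-in for a Twisted Deferred


class SerializedResource:
    """Run operations on a non-concurrent resource one at a time, in FIFO order."""

    def __init__(self):
        self._waiting = deque()  # (start, outer) pairs that have not started yet
        self._busy = False       # True while one operation is in flight

    def submit(self, start):
        """Queue `start`, a callable that begins an operation and returns a
        Future for its result. Returns a Future for that same result."""
        outer = Future()
        self._waiting.append((start, outer))
        self._start_next()
        return outer

    def _start_next(self):
        if self._busy or not self._waiting:
            return
        self._busy = True
        start, outer = self._waiting.popleft()
        try:
            inner = start()
        except Exception as exc:  # the operation failed before going async
            inner = Future()
            inner.set_exception(exc)
        inner.add_done_callback(lambda f: self._finished(f, outer))

    def _finished(self, inner, outer):
        self._busy = False
        exc = inner.exception()
        if exc is None:
            outer.set_result(inner.result())
        else:
            outer.set_exception(exc)
        self._start_next()  # hand the resource to the next waiter, if any


# Usage: two requests arrive back to back; the second is not started
# until the first one's reply has come in.
resource = SerializedResource()
started, replies = [], {}


def make_request(name):
    def start():
        started.append(name)
        replies[name] = Future()  # completed later, when the reply arrives
        return replies[name]
    return start


first = resource.submit(make_request("first"))
second = resource.submit(make_request("second"))
replies["first"].set_result("r1")   # this also starts "second"
replies["second"].set_result("r2")
print(started, first.result(), second.result())
```

The queue is explicit: callers get a result object back immediately and attach callbacks to it, and nothing ever has to block.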
> Your "Deferreds are hard" comment is an insult. You make it sound
> like I don't want to think. If I didn't want to think I wouldn't
> be a software developer.
Just for the record: I think Deferreds _are_ hard, even damn hard -
at least if you want to do something non-trivial that has robust error
handling. Some of the callback/errback decision trees we have in our
code are insane and took days to get right and test fully. The point
is that doing these complex things would be even more insane without
twisted.
> Everything I've read about this issue suggests that the twisted
> developers just don't want to give people what they want because it
> would allow them to shoot themselves in the foot (for example, by
> using blockOn() in a multi-threaded environment or in inappropriate
> places such as the example above).
Personally, I would love a completely robust blockOn to exist. I
would use it in certain cases. But the bottom line is that many
people have tried to do this, and they have all failed. Their
collective wisdom (with which I agree) is that it can't be done
without completely redesigning twisted's internals, and perhaps not
at all without breaking twisted's overall programming model. Most of us
are not ready to throw the baby out with the bathwater.
> Most people that would use blockOn() would probably use it in an
> entirely synchronous fashion where there would only be one deferred
> being processed at any given time. In these cases blockOn() would
> work just fine (if inefficiently). From your point of view that
> probably totally defeats the purpose of using twisted, but as I have
> pointed out above there are other useful features in twisted beside
> its deferred mechanism (PB).
I thought the same thing when I first wrote the version of blockOn
that we tried in IPython. As time went along, though, I quickly
discovered that these assumptions are simply wrong. It doesn't work
just fine.
> The concept that I am thinking of seems entirely possible, although I
> am sure it would require rewriting existing reactor implementations.
> However, in the long run that seems like a small cost if twisted
> could be more widely adopted because it would play nicer with
> existing non-async code.
Currently my own gut feeling is that there is something intrinsic to
Twisted's asynchronous programming model that makes a construct like
blockOn impossible to implement (even if you rewrote a reactor)
without introducing new types of deadlocks and indeterminate behavior
into the system. Thus it is not simply an issue of us not being smart
enough to figure out how to do it. It seems more fundamental than
that.
Actually, I think I see why (at least in part) it is problematic. If
blockOn exists, the following can happen:
def compute(a, b):
    d = a.computeSomething()
    # Let's say that b.state == 1 as of here
    result = blockOn(d)
    # Because the reactor just ran for an indeterminate amount of
    # time, b.state could have changed - or maybe not.
    # Thus the return value of this function is essentially a random result.
    return b.state + result
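The hazard can be reproduced with a toy event loop. This sketch assumes nothing about twisted's reactor: `ToyReactor`, `Pending`, `Box`, and `block_on` are hypothetical stand-ins, with `block_on` playing the role of blockOn by running whatever events are queued until its result arrives.

```python
from collections import deque


class ToyReactor:
    """A tiny single-threaded event loop: a FIFO of zero-argument callables."""

    def __init__(self):
        self._events = deque()

    def call_soon(self, fn):
        self._events.append(fn)

    def run_one(self):
        if not self._events:
            raise RuntimeError("no events left to run")
        self._events.popleft()()


class Pending:
    """Stand-in for a Deferred: becomes done once fire() is called."""

    def __init__(self):
        self.done = False
        self.value = None

    def fire(self, value):
        self.done = True
        self.value = value


def block_on(reactor, pending):
    """Hypothetical blockOn: run other events until `pending` fires."""
    while not pending.done:
        reactor.run_one()  # arbitrary unrelated code may run here
    return pending.value


class Box:
    def __init__(self):
        self.state = 1


def compute(reactor, pending, b):
    # b.state is 1 here, but nothing guarantees it stays that way.
    result = block_on(reactor, pending)
    return b.state + result


def run(with_interference):
    reactor, b, pending = ToyReactor(), Box(), Pending()
    if with_interference:
        # An unrelated event that happens to be queued ahead of the reply.
        reactor.call_soon(lambda: setattr(b, "state", 100))
    reactor.call_soon(lambda: pending.fire(10))
    return compute(reactor, pending, b)


print(run(False))  # 11
print(run(True))   # 110
```

The same call to `compute` returns 11 or 110 depending only on which events happened to be queued while it was "blocked".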
To eliminate such indeterminacies, new constructs would need to be
created to handle such situations:
def compute(a, b):
    d = a.computeSomething()
    # Let's say that b.state == 1 as of here
    acquire(b.state)  # This gets a lock on b.state
    result = blockOn(d)
    result += b.state
    # b.state == 1 still
    release(b.state)  # Release the lock
    return result
But now you can get deadlocks as blockOn switches to another code
path. Things start to look just like threads at this point and the
Kool-Aid starts to taste bitter.
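A sketch of how such a deadlock could arise, again with a toy loop rather than twisted's reactor. `ToyLoop`, `Flag`, `Lock`, and `block_on` are hypothetical; the toy raises `RuntimeError` where a real reactor would presumably just wait forever.

```python
from collections import deque


class Flag:
    """Stand-in for a Deferred that fires once."""

    def __init__(self, fired=False):
        self.fired = fired


class ToyLoop:
    def __init__(self):
        self.events = deque()

    def block_on(self, flag):
        """Hypothetical blockOn: run queued events until `flag` fires."""
        while not flag.fired:
            if not self.events:
                # Nothing left that could fire the flag.
                raise RuntimeError("deadlock")
            self.events.popleft()()


class Lock:
    def __init__(self, loop):
        self._loop = loop
        self._free = Flag(fired=True)    # starts unlocked

    def acquire(self):
        self._loop.block_on(self._free)  # wait until the holder releases
        self._free = Flag()              # now held by the caller

    def release(self):
        self._free.fired = True


def demo():
    loop, log = ToyLoop(), []
    lock, reply_a = Lock(loop), Flag()

    def request_a():
        lock.acquire()
        log.append("A holds lock")
        loop.block_on(reply_a)  # request_b runs on top of this frame
        lock.release()
        log.append("A done")

    def request_b():
        log.append("B wants lock")
        lock.acquire()          # A cannot release until B returns
        lock.release()

    loop.events.append(request_b)
    loop.events.append(lambda: setattr(reply_a, "fired", True))
    try:
        request_a()
    except RuntimeError as exc:
        log.append(str(exc))
    return log, reply_a.fired


# A's reply did arrive (second element is True), yet A never resumes.
print(demo())  # (['A holds lock', 'B wants lock', 'deadlock'], True)
```

Request B runs inside A's `block_on` call, so A's frame sits below B's on the stack. B waits for the lock, and A cannot release it until B returns.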
Brian