[Twisted-Python] Synchronization techniques
Daniel Miller
daniel at keystonewood.com
Thu Apr 5 07:53:56 MDT 2007
On Apr 5, 2007, at 5:53 AM, glyph at divmod.com wrote:
> >>I'm afraid that the feature you want doesn't make any sense and is,
> >>in a broad sense, impossible.
>
> >Maybe it's impossible for you to see things the way I see them because
> >you have become drunk on Twisted Kool-Aid.
>
> You are making it sound like you are bringing a fresh new idea to
> the discussion here, which I've never heard before and am unwilling
> to consider because I'm inflexible in my thinking. That's not
> what's happening.
I'm sorry I wrote that...it was inflammatory and did not bring any
value to the conversation. Please accept my apology.
> >In my specific case I am running
> >twisted in a single-threaded environment with a single synchronized
> >resource where each request that needs to access that resource must
> >gain an exclusive lock before doing anything with it (a classic locking
> >scenario). This is not "I'm being lazy and I do not want to learn how to
> >use Deferreds." Rather, it is a requirement that is dictated by the
> >system with which I am communicating (it does not support concurrent
> >access through the API provided by the vendor). Thus, my code would be
> >much simpler (both to write and maintain) if I had blockOn(), and it
> >would not have any risk of deadlock or other such concurrency bugs.
>
> You're confusing two things here.
>
> On the one hand, you want mutual exclusion for an external
> resource, which blocks.
>
> On the other, you want semantics for implementing that mutual
> exclusion via blocking in your own process.
>
The external "blocking" resource is just a shell script that takes
some time to run. It does not acquire any shared resources that would
result in deadlock and it will always return (maybe with an error,
but it will return) unless something terrible happens (e.g. plug is
pulled on server, fire, etc.).
>
> The former, as you have already demonstrated, can be implemented
> without the latter. The question is, would your code actually be
> simpler to write and to maintain if you had blockOn? Nothing
> you've said so far indicates that it would actually be more
> maintainable, and I've tried (although perhaps failed) to
> illustrate the high cost of the *apparent* simplicity at the moment
> of implementation.
It would be more maintainable because it would look just like normal
sequential python code:
    lock.acquire() # uses blockOn() to acquire a DeferredLock
    try:
        # uses blockOn(spawnProcess(...)) internally
        process.check_call(['script1.sh'])
        process.check_call(['script2.sh'])
    finally:
        lock.release()
This is very simple and very easy to maintain. It could be written
with inlineCallbacks fairly easily as well:
    yield lock.acquire()
    try:
        yield process.check_call(...)
        yield process.check_call(...)
    finally:
        lock.release()
That's pretty nice (so nice I might just rewrite my code that way).
My complaint is that the code must have knowledge of the twisted
environment (why else would it yield the result of
process.check_call()?). I do not really see the conceptual difference
between these two
code blocks except one yields to and one calls into the reactor event
loop. Is there some other inherent problem with the first example? Of
course you need to make sure that the code inside the try/finally
block does not try to acquire the lock again, but that's a basic
concurrency problem which can even happen in the next example.
Moving on, in a fully deferred world we have this:
    from twisted.internet.defer import Deferred, DeferredLock

    lock = DeferredLock() # the shared lock (was self.lock)

    def deflock(func, *args, **kw):
        # acquire() fires its callback with the lock itself
        def callback(lock, *args, **kw):
            try:
                result = func(*args, **kw)
            except:
                lock.release()
                raise
            if isinstance(result, Deferred):
                def release(arg, lock):
                    lock.release()
                    return arg
                result.addBoth(release, lock)
            else:
                lock.release()
            return result
        dfr = lock.acquire()
        dfr.addCallback(callback, *args, **kw)
        return dfr

    def dostuff():
        def deferproc(result, cmd):
            return process.check_call(cmd) # returns a deferred
        dfr = deferproc(None, ["script1.sh"])
        dfr.addCallback(deferproc, ["script2.sh"])
        return dfr

    dfr = deflock(dostuff)
... you get the picture.
Notice the code to acquire/release the lock--there are three
different calls to lock.release() in there, and they all must be
carefully sorted out to make sure that exactly one of them will be
called in any given scenario--that's hard to maintain.
>
> It strikes me that the process actually making the foreign API call
> could just block "for real" which would solve the mutual exclusion
> issue - callers into the PB layer would appear to be getting
> concurrent access, but responses would be implicitly queued up.
Right, that would work and that's exactly what subprocess.check_call()
(the real python built-in version) would do. Unfortunately twisted
does not work with the subprocess module--spawnProcess() is the only
alternative I found that actually works and that means I have to use
a deferred.
>
> Another solution here would be for Twisted to have a nice
> convenience API for dispatching tasks to a process pool. Right now
> setting up a process pool is conceptually easy but mechanically
> difficult; you have to do a lot of typing and make a lot of
> irrelevant decisions (AMP or PB or pickle? stdio or sockets?).
That sounds nice.
>
> >You might ask why I bother to
> >use Twisted? -- Perspective Broker is the most elegant way I could find
> >to call remote methods in Python. If it were abstracted from Twisted to
> >become a fully synchronous library I would use that instead, but at this
> >point it seems that if I want PB I am stuck with Twisted too.
>
> This is another area where the feature request doesn't quite make
> sense. It would be possible to implement something that looked
> kinda-sorta like PB, which dealt with a single request/response
> pair over a single socket, in an apparently synchronous and
> blocking manner. However, PB itself is a fully symmetrical
> protocol where the server can send messages to the client at any
> time, so a full PB implementation is not really possible when any
> message can be replied to with a "busy, poor implementation doesn't
> allow me to answer that message in this state" error.
I understand that PB is fully symmetrical. In my case I am only using
half (client makes request, server responds). Would it make sense to
relax the constraints when PB is used in this way?
>
> >Everything I've read about this issue suggests that the twisted
> >developers just don't want to give people what they want because it
> >would allow them to shoot themselves in the foot (for example, by using
> >blockOn() in a multi-threaded environment or in inappropriate places
> >such as the example above). But this is Python and we're consenting
> >adults. With the proper warnings a feature like this could make twisted
> >much more palatable for people with large existing projects that do not
> >wish to rewrite entire sections of code just to work with deferreds. It
> >would allow people to get the easiest thing working as quickly as
> >possible, and then go back and write the optimal deferred implementation
> >later when performance/blocking/etc. becomes an issue.
>
> I agree that it would be nice to allow programs to get on the
> Twisted bandwagon slowly, and to integrate more cleanly with
> foreign concurrency mechanisms like microthreads and database
> transactions. This is exactly what Jim Fulton is working on with
> the multi-reactor stuff for ZEO. You can't have one reentrant
> reactor, but you *can*, at least conceptually, have one reactor
> start another reactor and wait for it to complete a particular
> operation. If you'd like to help other projects gradually adapt to
> Twisted, perhaps you would like to contribute something to ticket
> #2545.
This looks very interesting. I'll try to help out with this effort if
I can find some time.
Thanks for taking time to read my ramblings and understand the
problems that I am having (even if we don't quite agree on the
simplest solutions). Your input is valuable, and I am indebted to you
for providing free support in your spare time.
~ Daniel