[Twisted-Python] Synchronization techniques

Daniel Miller daniel at keystonewood.com
Thu Apr 5 07:53:56 MDT 2007


On Apr 5, 2007, at 5:53 AM, glyph at divmod.com wrote:

> >>I'm afraid that the feature you want doesn't make any sense and is,
> >>in a broad sense, impossible.
>
> >Maybe it's impossible for you to see things the way I see them because
> >you have become drunk on Twisted Kool-Aide.
>
> You are making it sound like you are bringing a fresh new idea to  
> the discussion here, which I've never heard before and am unwilling  
> to consider because I'm inflexible in my thinking.  That's not  
> what's happening.

I'm sorry I wrote that...it was inflammatory and did not bring any  
value to the conversation. Please accept my apology.

> >In my specific case I am running twisted in a single-threaded
> >environment with a single synchronized resource where each request
> >that needs to access that resource must gain an exclusive lock before
> >doing anything with it (a classic locking scenario). This is not "I'm
> >being lazy and I do not want to learn how to use Deferreds." Rather, it
> >is a requirement that is dictated by the system with which I am
> >communicating (it does not support concurrent access through the API
> >provided by the vendor). Thus, my code would be much simpler (both to
> >write and maintain) if I had blockOn(), and it would not have any risk
> >of dead lock or other such concurrency bugs.
>
> You're confusing two things here.
>
> On the one hand, you want mutual exclusion for an external  
> resource, which blocks.
>
> On the other, you want semantics for implementing that mutual  
> exclusion via blocking in your own process.
>

The external "blocking" resource is just a shell script that takes
some time to run. It does not acquire any shared resources that would
result in deadlock and it will always return (maybe with an error,
but it will return) unless something terrible happens (e.g. plug is
pulled on server, fire, etc.).

>
> The former, as you have already demonstrated, can be implemented  
> without the latter.  The question is, would your code actually be  
> simpler to write and to maintain if you had blockOn?  Nothing  
> you've said so far indicates that it would actually be more  
> maintainable, and I've tried (although perhaps failed) to  
> illustrate the high cost of the *apparent* simplicity at the moment  
> of implementation.

It would be more maintainable because it would look just like normal
sequential Python code:

lock.acquire() # uses blockOn() to acquire a DeferredLock
try:
     # check_call() uses blockOn(spawnProcess(...)) internally
     process.check_call(['script1.sh'])
     process.check_call(['script2.sh'])
finally:
     lock.release()

This is very simple and very easy to maintain. It could be written  
with inlineCallbacks fairly easily as well:

yield lock.acquire()
try:
     yield process.check_call(...)
     yield process.check_call(...)
finally:
     lock.release()

That's pretty nice (so nice I might just rewrite my code that way).
My complaint is that the code must have knowledge of the Twisted
environment (why else would it yield the result of
process.check_call()?). I do not really see the conceptual difference
between these two code blocks except that one yields to the reactor
event loop and the other calls into it. Is there some other inherent
problem with the first example? Of course you need to make sure that
the code inside the try/finally block does not try to acquire the lock
again, but that's a basic concurrency problem which can even happen in
the next example.

Moving on, in a fully deferred world we have this:

from twisted.internet.defer import Deferred

def deflock(func, *args, **kw):
     def callback(lock, *args, **kw):
         # acquire() fires its deferred with the lock itself
         try:
             result = func(*args, **kw)
         except:
             lock.release() # release 1: func raised
             raise
         if isinstance(result, Deferred):
             def release(arg, lock):
                 lock.release() # release 2: deferred result fired
                 return arg
             result.addBoth(release, lock)
         else:
             lock.release() # release 3: plain (non-deferred) result
         return result
     dfr = lock.acquire() # lock is the shared DeferredLock
     dfr.addCallback(callback, *args, **kw)
     return dfr

def dostuff():
     def deferproc(result, cmd):
         return process.check_call(cmd) # returns a deferred
     dfr = deferproc(None, ["script1.sh"])
     dfr.addCallback(deferproc, ["script2.sh"])
     return dfr

dfr = deflock(dostuff)

... you get the picture.

Notice the code to acquire/release the lock--there are three  
different calls to lock.release() in there, and they all must be  
carefully sorted out to make sure that exactly one of them will be  
called in any given scenario--that's hard to maintain.

>
> It strikes me that the process actually making the foreign API call  
> could just block "for real" which would solve the mutual exclusion  
> issue - callers into the PB layer would appear to be getting  
> concurrent access, but responses would be implicitly queued up.

Right, that would work and that's exactly what subprocess.check_call()
(the real Python standard-library version) would do. Unfortunately
Twisted does not work with the subprocess module; spawnProcess() is
the only alternative I found that actually works, and that means I
have to use a deferred.

>
> Another solution here would be for Twisted to have a nice  
> convenience API for dispatching tasks to a process pool.  Right now  
> setting up a process pool is conceptually easy but mechanically  
> difficult; you have to do a lot of typing and make a lot of  
> irrelevant decisions (AMP or PB or pickle?  stdio or sockets?).

That sounds nice.

>
> >You might ask why I bother to use Twisted? -- Perspective Broker is the
> >most elegant way I could find to call remote methods in Python. If it
> >were abstracted from Twisted to become a fully synchronous library I
> >would use that instead, but at this point it seems that if I want PB I
> >am stuck with Twisted too.
>
> This is another area where the feature request doesn't quite make  
> sense.  It would be possible to implement something that looked  
> kinda-sorta like PB, which dealt with a single request/response  
> pair over a single socket, in an apparently synchronous and  
> blocking manner.  However, PB itself is a fully symmetrical  
> protocol where the server can send messages to the client at any  
> time, so a full PB implementation is not really possible when any  
> message can be replied to with a "busy, poor implementation doesn't  
> allow me to answer that message in this state" error.

I understand that PB is fully symmetrical. In my case I am only using  
half (client makes request, server responds). Would it make sense to  
relax the constraints when PB is used in this way?

>
> >Everything I've read about this issue suggests that the twisted
> >developers just don't want to give people what they want because it
> >would allow them to shoot themselves in the foot (for example, by using
> >blockOn() in a multi-threaded environment or in inappropriate places
> >such as the example above). But this is Python and we're consenting
> >adults. With the proper warnings a feature like this could make twisted
> >much more palatable for people with large existing projects that do not
> >wish to rewrite entire sections of code just to work with deferreds. It
> >would allow people to get the easiest thing working as quickly as
> >possible, and then go back and write the optimal deferred
> >implementation later when performance/blocking/etc. becomes an issue.
>
> I agree that it would be nice to allow programs to get on the  
> Twisted bandwagon slowly, and to integrate more cleanly with  
> foreign concurrency mechanisms like microthreads and database  
> transactions.  This is exactly what Jim Fulton is working on with  
> the multi-reactor stuff for ZEO.  You can't have one reentrant  
> reactor, but you *can*, at least conceptually, have one reactor  
> start another reactor and wait for it to complete a particular  
> operation.  If you'd like to help other projects gradually adapt to  
> Twisted, perhaps you would like to contribute something to ticket  
> #2545.

This looks very interesting. I'll try to help out with this effort if  
I can find some time.

Thanks for taking time to read my ramblings and understand the  
problems that I am having (even if we don't quite agree on the  
simplest solutions). Your input is valuable, and I am indebted to you  
for providing free support in your spare time.

~ Daniel





