[Twisted-Python] Synchronization techniques

Thu Apr 5 12:09:53 MDT 2007

> >The external "blocking" resource is just a shell script that takes some
> >time to run. It does not acquire any shared resources that would result in
> >deadlock and it will always return (maybe with an error, but it will
> >return) unless something terrible happens (e.g. plug is pulled on server,
> >fire, etc.).
>
> I thought I understood what was going on, but now I'm confused  
> again.  Why do you need mutual exclusion at all if it doesn't  
> acquire any shared resources?  Couldn't you just run it concurrently?

I guess I said that wrong. When I said "it does not acquire any  
shared resources" I was referring to the external system being  
manipulated by the shell script. That effectively means that the  
shell script is the shared resource and it can only be called in a  
synchronous manner. The script is essentially posting a transaction,  
which must be done in an atomic fashion with regard to my code. I  
know this is very ugly, and I'd love to fix it. Unfortunately it's  
not my system so I can't.
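
A minimal sketch of that constraint, for illustration only. It uses the
standard library's asyncio.Lock as a stand-in; in Twisted the analogous
primitive is twisted.internet.defer.DeferredLock. The external script is
simulated by a hypothetical run_script() coroutine that just sleeps, and
post_transaction() and main() are likewise invented names.

```python
import asyncio


async def run_script(name):
    # Hypothetical stand-in for the external shell script: it takes some
    # time and then returns. A real version would spawn the process.
    await asyncio.sleep(0.01)
    return "posted " + name


async def post_transaction(lock, log, name):
    # Only one caller at a time may be inside this block, so the script
    # is never run concurrently with itself.
    async with lock:
        log.append(("start", name))
        result = await run_script(name)
        log.append(("end", name))
        return result


async def main():
    lock = asyncio.Lock()
    log = []
    # Three callers arrive at once; the lock forces them to run in turn.
    results = await asyncio.gather(
        *(post_transaction(lock, log, n) for n in ("a", "b", "c")))
    return results, log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Each "start" entry in the log is followed directly by the matching "end"
entry, which is the atomicity described above: no second call begins
while one is still in flight, yet the event loop itself never blocks.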

We can keep going around and around about this but there's no need.  
My immediate problem was solved when I learned that I could return a  
deferred from a PB remote_xxx() method.

What I would like to continue to discuss is whether all code that  
calls something that does deferred logic must be immediately aware of  
that fact.

> >It would be more maintainable because it would look just like normal
> >sequential python code:
>
> Yes, it would *look* like sequential python code.  But it wouldn't  
> be :).  There's a heck of a lot that can happen in acquire(); your  
> whole application could run for ten minutes on that one line of  
> code.  Worst of all, it would only happen in extreme situations, so  
> testing or debugging issues that are caused by it becomes even more  
> difficult.

This could happen with any deferred logic. As long as the code has  
the proper concurrency logic this is not a problem--even if it takes  
10 minutes. In today's operating systems something like that could  
even happen in plain old synchronous single-threaded code if the OS  
decided to give some other process priority for that long (unlikely  
but possible).

>
> >My complaint is that the code must have knowledge of the twisted environment
> >(why else would it yield the result of process.check_call()?). I do not
> >really see the conceptual difference between these two code blocks except
> >one yields to and one calls into the reactor event loop. Is there some
> >other inherent problem with the first example? Of course you need to make
> >sure that the code inside the try/finally block does not try to acquire the
> >lock again, but that's a basic concurrency problem which can even happen in
> >the next example.
>
> This is really the key thing.  If you're running your code in the  
> Twisted environment, and you want it to be correct, it really must  
> know about the Twisted environment.  The simple presence of the  
> 'yield' keyword at every level where a Deferred is being returned  
> forces you to acknowledge, "yes, I know that a context switch may  
> occur here".  Without it, any function could suddenly and radically  
> change the assumptions that all of its callers were allowed to make.
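
To make the point about 'yield' concrete, here is a small sketch using
plain Python generators and a hypothetical round-robin scheduler in place
of the reactor. Nothing here is Twisted API; withdraw(), run_all() and the
account dict are invented for illustration.

```python
from collections import deque


def withdraw(account, amount, log):
    # A cooperative task. The bare `yield` is the only place where other
    # tasks can run, so it marks exactly where assumptions may go stale.
    if account["balance"] < amount:
        log.append(("insufficient", amount))
        return
    yield  # possible context switch: the balance may change here
    if account["balance"] >= amount:  # so the check is repeated
        account["balance"] -= amount
        log.append(("ok", amount))
    else:
        log.append(("stale", amount))


def run_all(tasks):
    # Tiny round-robin scheduler: resume each generator in turn until
    # all of them have finished.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue
        queue.append(task)


if __name__ == "__main__":
    account = {"balance": 100}
    log = []
    run_all([withdraw(account, 80, log), withdraw(account, 80, log)])
    print(account, log)
```

Both tasks pass the first balance check, but only one withdrawal
succeeds, because the other task ran during the yield. If the switch
were hidden inside an ordinary-looking call, the author of withdraw()
would have no visible cue to repeat the check.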

So it's really a matter of being explicit...and it's true that  
"explicit is better than implicit" but then again, "practicality  
beats purity" :-) It would be super nice to be able to provide the  
exact interface of a normal Python module/class/function and have
Twisted logic going on inside. When used properly it would be very
powerful. Of course doing something like this is definitely not  
entirely innocent, and there should be warnings provided with  
implementations that may block (as there should be with any other  
piece of concurrency-related code that may block). But it's not nice  
to force everyone to use an awkward interface just to try to help  
them avoid mistakes.

~ Daniel