[Twisted-Python] Synchronization techniques

Thu Apr 5 03:53:37 MDT 2007

On 04:25 am, daniel at keystonewood.com wrote:
>On Apr 4, 2007, at 1:43 PM, glyph at divmod.com wrote:
>>On 04:35 pm, daniel at keystonewood.com wrote:
>> >On Apr 3, 2007, at 8:42 PM, Itamar Shtull-Trauring wrote:
>> >>On Tue, 2007-04-03 at 17:07 -0400, Daniel Miller wrote:

>Well of course it's no big deal to change IQuoter, but that specific
>case wasn't really my point. My point is that in the real world it's a
>BAD THING to have to rewrite perfectly good/working/tested code just
>because we want to use twisted. But this is exactly what happened to
>me when twisted was introduced into my project.

Hmm.  Well, I don't know about the "real world" - haven't visited in a 
while - but in the magical fairy kingdom where *I* live, it is generally 
considered a good idea to globally consider the implications of a new 
programming model on "good/working/tested code".  Networks and 
concurrency, in particular, have this pesky habit of introducing 
entirely new error conditions into previously "working" code, breaking 
all of its assumptions.  This isn't specific to Twisted, but Twisted 
does deal with networks and concurrency quite a bit.

>>I'm afraid that the feature you want doesn't make any sense and is,
>>in a broad sense, impossible.

>Maybe it's impossible for you to see things the way I see them  because 
>you have become drunk on Twisted Kool-Aide.

You are making it sound like you are bringing a fresh new idea to the 
discussion here, which I've never heard before and am unwilling to 
consider because I'm inflexible in my thinking.  That's not what's 
happening.  This is a frequently-debunked and well-understood issue in 
the Twisted community.  It seems to come up about once per year.  See my 
now apparently prescient blog post from last April:

    http://glyf.livejournal.com/40037.html

>In my specific case I am running twisted in a single-threaded
>environment with a single synchronized resource where each request
>that needs to access that resource must gain an exclusive lock before
>doing anything with it (a classic locking scenario). This is not "I'm
>being lazy and I do not want to learn how to use Deferreds." Rather,
>it is a requirement that is dictated by the system with which I am
>communicating (it does not support concurrent access through the API
>provided by the vendor). Thus, my code would be much simpler (both to
>write and maintain) if I had blockOn(), and it would not have any risk
>of deadlock or other such concurrency bugs.

You're confusing two things here.

On the one hand, you want mutual exclusion for an external resource, 
which blocks.

On the other, you want semantics for implementing that mutual exclusion 
via blocking in your own process.

The former, as you have already demonstrated, can be implemented without 
the latter.  The question is, would your code actually be simpler to 
write and to maintain if you had blockOn?  Nothing you've said so far 
indicates that it would actually be more maintainable, and I've tried 
(although perhaps failed) to illustrate the high cost of the *apparent* 
simplicity at the moment of implementation.

It strikes me that the process actually making the foreign API call
could just block "for real", which would solve the mutual exclusion
issue: callers into the PB layer would appear to be getting concurrent
access, but responses would be implicitly queued up.
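
A minimal sketch of that implicit queueing, in plain Python rather than
Twisted's own API: requests are accepted immediately, but only one job
touches the exclusive resource at a time, and nothing blocks.
`SerialQueue`, `submit`, and the `done` convention are hypothetical names
invented for this illustration; Twisted's
`twisted.internet.defer.DeferredLock` covers the same ground with
Deferreds.

```python
from collections import deque


class SerialQueue:
    """Hypothetical helper: run callback-style jobs one at a time.

    Each job is a function taking one argument, `done`, which it must
    call with its result once it has finished with the resource.
    """

    def __init__(self):
        self._pending = deque()
        self._busy = False

    def submit(self, job, on_result):
        # Accept the request right away; it may start later.
        self._pending.append((job, on_result))
        self._pump()

    def _pump(self):
        # Start the next job only if nothing holds the resource.
        if self._busy or not self._pending:
            return
        job, on_result = self._pending.popleft()
        self._busy = True

        def done(result):
            self._busy = False
            on_result(result)
            self._pump()

        job(done)


def demo_serial_queue():
    queue = SerialQueue()
    log = []
    finishers = []  # stands in for "the vendor call completes later"

    def make_job(name):
        def job(done):
            log.append("start " + name)
            finishers.append(lambda: done(name.upper()))
        return job

    for name in ("a", "b", "c"):
        queue.submit(make_job(name), lambda r: log.append("result " + r))

    # All three requests were accepted, but only "a" has started so far.
    while finishers:
        finishers.pop(0)()
    return log
```

Calling `demo_serial_queue()` shows each job starting only after the
previous result was delivered. A job that calls `done` synchronously
would recurse through `_pump`, which is fine for a sketch but worth
avoiding with long queues.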

Another solution here would be for Twisted to have a nice convenience 
API for dispatching tasks to a process pool.  Right now setting up a 
process pool is conceptually easy but mechanically difficult; you have 
to do a lot of typing and make a lot of irrelevant decisions (AMP or PB 
or pickle?  stdio or sockets?).

>You might ask why I bother to use Twisted? -- Perspective Broker is the
>most elegant way I could find to call remote methods in Python. If it
>were abstracted from Twisted to become a fully synchronous library I
>would use that instead, but at this point it seems that if I want PB I
>am stuck with Twisted too.

This is another area where the feature request doesn't quite make sense. 
It would be possible to implement something that looked kinda-sorta like 
PB, which dealt with a single request/response pair over a single 
socket, in an apparently synchronous and blocking manner.  However, PB 
itself is a fully symmetrical protocol where the server can send 
messages to the client at any time, so a full PB implementation is not 
really possible when any message can be replied to with a "busy, poor 
implementation doesn't allow me to answer that message in this state" 
error.

For a lot of PB applications, including the ones it was designed for,
such as online games, you absolutely need full two-way communication.

>In short, this feature does "make sense" in my environment. Whether
>it's possible or not is another matter entirely.

I am still not convinced.  You can feel free to stop trying to convince 
me though, or you can write a patch which we can then discuss.

>Your "Deferreds are hard" comment is an insult. You make it sound
>like I don't want to think.

You've also insulted me by implication of not living in the "real world" 
and being "drunk" on "Kool-Aide [sic]".  I think that this feature is a 
symptom of muddy thinking, since I've seen it dozens of times before, 
and I'm not going to apologize to you for thinking that.

The difference between the jabs we're trading here is that I'm not using 
any software that *you* wrote, and I'm not insulting you at the same 
time I'm posting to a mailing list for that software while demanding 
impossible features.

>If I didn't want to think I wouldn't be a software developer.

I don't think that you "don't want to think", I think that you're 
mistaken.  However, if indeed you didn't want to think, this is hardly a 
defense, as you'd clearly not be alone in the software development 
profession, such as it is.  Cf. http://worsethanfailure.com/

>This code obviously won't work because the getPage() has to wait and
>another dataReceived() call could come in with a QUIT command while
>the first one is still waiting for getPage(). Instead you'd need to
>accumulate the data in a buffer and then do your command processing
>logic after all data has been received--that is, if you want to use
>blockOn(getPage(...))--it probably wouldn't be the smartest way to do
>this because it would be nice to start getting pages before we receive
>all of the data. But this is just one case that doesn't work with
>blockOn(). I've never said that it would magically make every case
>easier, it just makes some less complicated cases very much simpler.

It makes some cases appear simpler *at the expense* of breaking lots of 
other, correctly-written code, which depends on not having 20 levels of 
naive "blockOn" calls above them on the stack.  It's analogous to how 
there are restrictions on "user code" in UNIX and you're not allowed to 
handle interrupts directly because the point of the kernel is to allow 
multiple processes to run at the same time.  The original point of 
Twisted was a high degree of frustration that so many libraries for 
speaking protocols implemented their own, incompatible event loops.
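
To make the stacking problem concrete, here is a toy sketch of a naive
"blockOn" that re-enters the event loop. Nothing here is Twisted's
reactor: `ToyLoop`, `call_soon`, and `block_on` are hypothetical names,
and the "network answer" is simulated by a queued callable.

```python
from collections import deque


class ToyLoop:
    """Hypothetical single-threaded event loop; not Twisted's reactor."""

    def __init__(self):
        self.events = deque()
        self.depth = 0      # block_on calls currently on the stack
        self.max_depth = 0

    def call_soon(self, fn):
        self.events.append(fn)

    def block_on(self, box):
        # Naive blockOn: keep running other events until `box` is filled.
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)
        while not box and self.events:
            self.events.popleft()()
        self.depth -= 1
        return box[0] if box else None

    def run(self):
        while self.events:
            self.events.popleft()()


def demo_nested_block_on():
    loop = ToyLoop()
    finished = []

    def make_handler(n):
        def handler():
            box = []
            # The answer for request n arrives as a later event.
            loop.call_soon(lambda: box.append(n))
            loop.block_on(box)
            finished.append(n)
        return handler

    for n in (1, 2, 3):
        loop.call_soon(make_handler(n))
    loop.run()
    return finished, loop.max_depth
```

`demo_nested_block_on()` returns `([3, 2, 1], 3)`: the answer for
request 1 arrives first, yet its handler finishes last, because the
handlers for requests 2 and 3 are sitting above it on the stack. Each
handler looks simple in isolation; the cost lands on whatever code
happens to be underneath.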

>Everything I've read about this issue suggests that the twisted
>developers just don't want to give people what they want because it
>would allow them to shoot themselves in the foot (for example, by
>using blockOn() in a multi-threaded environment or in inappropriate
>places such as the example above). But this is Python and we're
>consenting adults. With the proper warnings a feature like this could
>make twisted much more palatable for people with large existing
>projects that do not wish to rewrite entire sections of code just to
>work with deferreds. It would allow people to get the easiest thing
>working as quickly as possible, and then go back and write the optimal
>deferred implementation later when performance/blocking/etc. becomes
>an issue.

I agree that it would be nice to allow programs to get on the Twisted 
bandwagon slowly, and to integrate more cleanly with foreign concurrency 
mechanisms like microthreads and database transactions.  This is exactly 
what Jim Fulton is working on with the multi-reactor stuff for ZEO.  You 
can't have one reentrant reactor, but you *can*, at least conceptually, 
have one reactor start another reactor and wait for it to complete a 
particular operation.  If you'd like to help other projects gradually 
adapt to Twisted, perhaps you would like to contribute something to 
ticket #2545.

To follow my earlier analogy, this is like the hypervisor and
user-mode-kernel facilities in various UNIXes; if you're not allowed to
do something in the kernel, it's OK to start your own kernel.

>Most people that would use blockOn() would probably use it in an
>entirely synchronous fashion where there would only be one deferred
>being processed at any given time. In these cases blockOn() would work
>just fine (if inefficiently). From your point of view that probably
>totally defeats the purpose of using twisted, but as I have pointed
>out above there are other useful features in twisted beside its
>deferred mechanism (PB).

... and as *I've* pointed out above, PB is only possible _because_ of 
Twisted's event loop.  In fact Deferreds were directly extracted from PB 
- originally every PB method had "pbcallback" and "pberrback" keyword 
arguments, and the Deferred class was the encapsulation of that so that 
PB methods could be easily chained and their results passed to other 
systems.
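
A toy sketch of that encapsulation, for illustration only. The keyword
names "pbcallback" and "pberrback" come from the paragraph above;
everything else (`MiniDeferred`, `call_remote_old`, `call_remote`, and
the way errors are represented) is an assumption made for this example
and is not the historical PB code or Twisted's Deferred implementation.

```python
class MiniDeferred:
    """Toy stand-in for a Deferred; not Twisted's implementation."""

    def __init__(self):
        self._callbacks = []  # (callback, errback) pairs, in order
        self._fired = False
        self._result = None

    def addCallbacks(self, callback, errback=None):
        self._callbacks.append((callback, errback))
        if self._fired:
            self._run()
        return self

    def callback(self, result):
        self._fired = True
        self._result = result
        self._run()

    def errback(self, error):
        # In this toy, an error is just a result that is an Exception.
        self.callback(error)

    def _run(self):
        while self._callbacks:
            callback, errback = self._callbacks.pop(0)
            is_error = isinstance(self._result, Exception)
            handler = errback if is_error else callback
            if handler is None:
                continue  # pass the current result through unchanged
            try:
                self._result = handler(self._result)
            except Exception as exc:
                self._result = exc


def call_remote_old(method, pbcallback=None, pberrback=None):
    """Keyword-argument style; `method` stands in for the remote call."""
    try:
        result = method()
    except Exception as exc:
        if pberrback is not None:
            pberrback(exc)
    else:
        if pbcallback is not None:
            pbcallback(result)


def call_remote(method):
    """Same call, but handlers are attached to a returned object."""
    d = MiniDeferred()
    call_remote_old(method, pbcallback=d.callback, pberrback=d.errback)
    return d


def demo_chaining():
    seen = []
    d = call_remote(lambda: 20)
    d.addCallbacks(lambda r: r + 1)
    d.addCallbacks(lambda r: r * 2)
    d.addCallbacks(seen.append)
    return seen
```

`demo_chaining()` returns `[42]`. The point of the returned object is
that the caller, or any other system the object is handed to, can keep
adding steps after the call has been made, which the keyword-argument
style cannot offer.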

>The concept that I am thinking of seems entirely possible, although I
>am sure it would require rewriting existing reactor implementations.
>However, in the long run that seems like a small cost if twisted could
>be more widely adopted because it would play nicer with existing
>non-async code.

If you want to try and go implement this, you can discover just how 
small the cost is :).  If, in the course of implementing such a thing, 
you manage to get clean, coherent semantics for "blockOn", and it passes 
the full test suite (etc etc) I would not reject such a thing out of 
hand.  I am suggesting that it is impossible to get coherent semantics 
for blockOn, and if you submit an implementation I'll point out the 
specific brokenness of a particular approach, but my main point is that 
it's impossible because of specific problems, not that it's 
unacceptable.