[Twisted-Python] Synchronization techniques
glyph at divmod.com
Thu Apr 5 03:53:37 MDT 2007
On 04:25 am, daniel at keystonewood.com wrote:
>On Apr 4, 2007, at 1:43 PM, glyph at divmod.com wrote:
>>On 04:35 pm, daniel at keystonewood.com wrote:
>> >On Apr 3, 2007, at 8:42 PM, Itamar Shtull-Trauring wrote:
>> >>On Tue, 2007-04-03 at 17:07 -0400, Daniel Miller wrote:
>Well of course it's no big deal to change IQuoter, but that specific
>case wasn't really my point. My point is that in the real world it's a
>BAD THING to have to rewrite perfectly good/working/tested code just
>because we want to use twisted. But this is exactly what happened to
>me when twisted was introduced into my project.
Hmm. Well, I don't know about the "real world" - haven't visited in a
while - but in the magical fairy kingdom where *I* live, it is generally
considered a good idea to think through the global implications of a new
programming model on "good/working/tested code". Networks and
concurrency, in particular, have this pesky habit of introducing
entirely new error conditions into previously "working" code, breaking
all of its assumptions. This isn't specific to Twisted, but Twisted
does deal with networks and concurrency quite a bit.
>>I'm afraid that the feature you want doesn't make any sense and is,
>>in a broad sense, impossible.
>Maybe it's impossible for you to see things the way I see them because
>you have become drunk on Twisted Kool-Aide.
You are making it sound like you are bringing a fresh new idea to the
discussion here, which I've never heard before and am unwilling to
consider because I'm inflexible in my thinking. That's not what's
happening. This is a frequently-debunked and well-understood issue in
the Twisted community. It seems to come up about once per year. See my
now apparently prescient blog post from last April:
http://glyf.livejournal.com/40037.html
>In my specific case I am running twisted in a single-threaded
>environment with a single synchronized resource where each request
>that needs to access that resource must gain an exclusive lock before
>doing anything with it (a classic locking scenario). This is not "I'm
>being lazy and I do not want to learn how to use Deferreds." Rather,
>it is a requirement that is dictated by the system with which I am
>communicating (it does not support concurrent access through the API
>provided by the vendor). Thus, my code would be much simpler (both to
>write and maintain) if I had blockOn(), and it would not have any risk
>of deadlock or other such concurrency bugs.
You're confusing two things here.
On the one hand, you want mutual exclusion for an external resource,
which blocks.
On the other, you want semantics for implementing that mutual exclusion
via blocking in your own process.
The former, as you have already demonstrated, can be implemented without
the latter. The question is, would your code actually be simpler to
write and to maintain if you had blockOn? Nothing you've said so far
indicates that it would actually be more maintainable, and I've tried
(although perhaps failed) to illustrate the high cost of the *apparent*
simplicity at the moment of implementation.
It strikes me that the process actually making the foreign API call
could just block "for real" which would solve the mutual exclusion issue
- callers into the PB layer would appear to be getting concurrent
access, but responses would be implicitly queued up.
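As a rough illustration of getting mutual exclusion without blocking the process, here is a minimal sketch in plain Python. `SerialResource`, `make_job` and `in_flight` are hypothetical names made up for this example, not Twisted APIs; Twisted's own `twisted.internet.defer.DeferredLock` covers similar ground. Each caller hands over a job and a completion callback, and jobs that arrive while the resource is busy simply wait in a queue.

```python
from collections import deque


class SerialResource:
    """Hypothetical helper: lets one job at a time touch a resource."""

    def __init__(self):
        self._busy = False
        self._waiting = deque()

    def run(self, job, on_done):
        # Queue the job; it starts immediately only if nothing else is running.
        self._waiting.append((job, on_done))
        self._pump()

    def _pump(self):
        if self._busy or not self._waiting:
            return
        self._busy = True
        job, on_done = self._waiting.popleft()

        def finished(result):
            # Release the resource, report the result, start the next job.
            self._busy = False
            on_done(result)
            self._pump()

        job(finished)


log = []
in_flight = []  # stand-ins for vendor calls that have not completed yet


def make_job(name):
    def job(finished):
        log.append("start " + name)
        in_flight.append(lambda: finished(name.upper()))
    return job


resource = SerialResource()
resource.run(make_job("a"), lambda r: log.append("done " + r))
resource.run(make_job("b"), lambda r: log.append("done " + r))  # queued

while in_flight:
    in_flight.pop(0)()  # pretend the event loop delivered a completion
```

Job "b" never starts until "a" has reported its result, yet no caller ever sits in a blocking call; the second `run()` returns immediately.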
Another solution here would be for Twisted to have a nice convenience
API for dispatching tasks to a process pool. Right now setting up a
process pool is conceptually easy but mechanically difficult; you have
to do a lot of typing and make a lot of irrelevant decisions (AMP or PB
or pickle? stdio or sockets?).
>You might ask why I bother to use Twisted? -- Perspective Broker is the
>most elegant way I could find to call remote methods in Python. If it
>were abstracted from Twisted to become a fully synchronous library I
>would use that instead, but at this point it seems that if I want PB I
>am stuck with Twisted too.
This is another area where the feature request doesn't quite make sense.
It would be possible to implement something that looked kinda-sorta like
PB, which dealt with a single request/response pair over a single
socket, in an apparently synchronous and blocking manner. However, PB
itself is a fully symmetrical protocol where the server can send
messages to the client at any time, so a full PB implementation is not
really possible when any message can be replied to with a "busy, poor
implementation doesn't allow me to answer that message in this state"
error.
For a lot of PB applications, including the ones it was designed for,
such as online games, you absolutely need full two-way communication.
>In short, this feature does "make sense" in my environment. Whether
>it's possible or not is another matter entirely.
I am still not convinced. You can feel free to stop trying to convince
me though, or you can write a patch which we can then discuss.
>Your "Deferreds are hard" comment is an insult. You make it sound
>like I don't want to think.
You've also insulted me by implication of not living in the "real world"
and being "drunk" on "Kool-Aide [sic]". I think that this feature
request is a symptom of muddy thinking, since I've seen it dozens of
times before, and I'm not going to apologize to you for thinking that.
The difference between the jabs we're trading here is that I'm not using
any software that *you* wrote, and I'm not insulting you at the same
time I'm posting to a mailing list for that software while demanding
impossible features.
>If I didn't want to think I wouldn't be a software developer.
I don't think that you "don't want to think", I think that you're
mistaken. However, if indeed you didn't want to think, this is hardly a
defense, as you'd clearly not be alone in the software development
profession, such as it is. Cf. http://worsethanfailure.com/
>This code obviously won't work because the getPage() has to wait and
>another dataReceived() call could come in with a QUIT command while
>the first one is still waiting for getPage(). Instead you'd need to
>accumulate the data in a buffer and then do your command processing
>logic after all data has been received--that is, if you want to use
>blockOn(getPage(...))--it probably wouldn't be the smartest way to do
>this because it would be nice to start getting pages before we receive
>all of the data. But this is just one case that doesn't work with
>blockOn(). I've never said that it would magically make every case
>easier, it just makes some less complicated cases very much simpler.
It makes some cases appear simpler *at the expense* of breaking lots of
other, correctly-written code, which depends on not having 20 levels of
naive "blockOn" calls above them on the stack. It's analogous to how
there are restrictions on "user code" in UNIX and you're not allowed to
handle interrupts directly because the point of the kernel is to allow
multiple processes to run at the same time. The original point of
Twisted was a high degree of frustration that so many libraries for
speaking protocols implemented their own, incompatible event-loops.
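To make the stack problem concrete, here is a toy sketch. `ToyLoop` and its `block_on` are invented for this illustration and are not how any Twisted reactor works; virtual time stands in for real I/O. A naive `block_on` re-enters the loop until its own timer fires, so whichever handler blocked most recently sits on top of the stack, and everything beneath it has to wait for it.

```python
import heapq


class ToyLoop:
    """Hypothetical single-threaded scheduler using virtual time."""

    def __init__(self):
        self.now = 0
        self._timers = []
        self._seq = 0  # tie-breaker so callables are never compared

    def call_later(self, delay, fn):
        self._seq += 1
        heapq.heappush(self._timers, (self.now + delay, self._seq, fn))

    def run_one(self):
        when, _, fn = heapq.heappop(self._timers)
        self.now = max(self.now, when)
        fn()

    def run(self):
        while self._timers:
            self.run_one()

    def block_on(self, delay):
        # Naive blockOn: re-enter the loop until our own timer has fired.
        done = []
        self.call_later(delay, lambda: done.append(True))
        while not done:
            self.run_one()


loop = ToyLoop()
events = []


def handler_a():
    events.append(("A start", loop.now))
    loop.block_on(1)   # A only wants to wait one tick
    events.append(("A end", loop.now))


def handler_b():
    events.append(("B start", loop.now))
    loop.block_on(10)  # B's wait runs on top of A's stack frame
    events.append(("B end", loop.now))


loop.call_later(0, handler_a)
loop.call_later(0, handler_b)
loop.run()
```

Handler A asked to wait one tick, but it resumes at tick 10: its timer fired long ago, inside B's nested loop, and A's frame cannot continue until B's `block_on` returns. Stack a few more of these and correctly written code underneath stalls for reasons it cannot see.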
>Everything I've read about this issue suggests that the twisted
>developers just don't want to give people what they want because it
>would allow them to shoot themselves in the foot (for example, by
>using blockOn() in a multi-threaded environment or in inappropriate
>places such as the example above). But this is Python and we're
>consenting adults. With the proper warnings a feature like this could
>make twisted much more palatable for people with large existing
>projects that do not wish to rewrite entire sections of code just to
>work with deferreds. It would allow people to get the easiest thing
>working as quickly as possible, and then go back and write the optimal
>deferred implementation later when performance/blocking/etc. becomes
>an issue.
I agree that it would be nice to allow programs to get on the Twisted
bandwagon slowly, and to integrate more cleanly with foreign concurrency
mechanisms like microthreads and database transactions. This is exactly
what Jim Fulton is working on with the multi-reactor stuff for ZEO. You
can't have one reentrant reactor, but you *can*, at least conceptually,
have one reactor start another reactor and wait for it to complete a
particular operation. If you'd like to help other projects gradually
adapt to Twisted, perhaps you would like to contribute something to
ticket #2545.
To follow my earlier analogy, this is like the hypervisor and
user-mode-kernel facilities in various UNIXes; if you're not allowed to
do something in the kernel, it's OK to start your own kernel.
>Most people that would use blockOn() would probably use it in an
>entirely synchronous fashion where there would only be one deferred
>being processed at any given time. In these cases blockOn() would work
>just fine (if inefficiently). From your point of view that probably
>totally defeats the purpose of using twisted, but as I have pointed
>out above there are other useful features in twisted beside its
>deferred mechanism (PB).
... and as *I've* pointed out above, PB is only possible _because_ of
Twisted's event loop. In fact Deferreds were directly extracted from PB
- originally every PB method had "pbcallback" and "pberrback" keyword
arguments, and the Deferred class was the encapsulation of that so that
PB methods could be easily chained and their results passed to other
systems.
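A small sketch of that encapsulation, under loud assumptions: `MiniDeferred`, `double_old_style` and `double_deferred_style` are toy names invented here, and none of this reproduces the historical PB signatures or Twisted's actual Deferred class. It only shows the shape of the change, from passing "pbcallback" and "pberrback" into every call to returning one object that handlers can be chained onto afterwards.

```python
class MiniDeferred:
    """Toy stand-in for a Deferred; not Twisted's implementation."""

    def __init__(self):
        self._chain = []  # list of (callback, errback) pairs
        self._fired = False
        self._result = None

    def add_callbacks(self, callback, errback=None):
        self._chain.append((callback, errback))
        if self._fired:
            self._run()
        return self

    def callback(self, result):
        self._fire(result)

    def errback(self, error):
        self._fire(error)

    def _fire(self, result):
        self._fired = True
        self._result = result
        self._run()

    def _run(self):
        # Pass the current result down the chain; exceptions take the
        # errback side, everything else takes the callback side.
        while self._chain:
            callback, errback = self._chain.pop(0)
            failed = isinstance(self._result, Exception)
            handler = errback if failed else callback
            if handler is None:
                continue
            try:
                self._result = handler(self._result)
            except Exception as exc:
                self._result = exc


def double_old_style(value, pbcallback, pberrback):
    # Keyword-callback style: handlers must be supplied with the call.
    if isinstance(value, int):
        pbcallback(value * 2)
    else:
        pberrback(TypeError("need an int"))


def double_deferred_style(value):
    # Same operation, but the handlers are wrapped up in one object.
    d = MiniDeferred()
    double_old_style(value, pbcallback=d.callback, pberrback=d.errback)
    return d


results, errors = [], []
double_deferred_style(21).add_callbacks(lambda r: r + 1).add_callbacks(results.append)
double_deferred_style("x").add_callbacks(lambda r: r + 1).add_callbacks(
    results.append, errors.append)
```

The caller of `double_deferred_style` can hand the returned object to some other system, which adds its own handlers later; with the keyword style every intermediate layer has to thread both callbacks through by hand.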
>The concept that I am thinking of seems entirely possible, although I
>am sure it would require rewriting existing reactor implementations.
>However, in the long run that seems like a small cost if twisted could
>be more widely adopted because it would play nicer with existing
>non-async code.
If you want to try and go implement this, you can discover just how
small the cost is :). If, in the course of implementing such a thing,
you manage to get clean, coherent semantics for "blockOn", and it passes
the full test suite (etc etc) I would not reject such a thing out of
hand. I am suggesting that it is impossible to get coherent semantics
for blockOn, and if you submit an implementation I'll point out the
specific brokenness of a particular approach, but my main point is that
it's impossible because of specific problems, not that it's
unacceptable.