[Twisted-Python] setTimeout in Deferred
Yun Mao
maoy at cis.upenn.edu
Wed Mar 9 21:53:58 MST 2005
On Wed, 9 Mar 2005, Christopher Armstrong wrote:
> It's a bit more complex than that, and its input and behavior makes
> some wrong assumptions and has some bad implications. It is the
> end-user of a Deferred who usually wants to specify the timeout before
> cancelling the operation. However, it is the creator of the Deferred
> who needs to specify _how_ cancelling an operation works -- it's
> different for every protocol / kind of operation. Putting the timeout
> and the canceller next to each other is a bad idea. In practice, using
> setTimeout leads to a bunch of stupid AlreadyCalled errors when the
> end-user of a Deferred specifies a timeout and the framework code
> doesn't know about it.
>
> The best way to offer timeout support in your code is to modify your
> API. If you have a function like getPage which connects to an HTTP
> server and downloads a resource, that getPage function should take a
> timeout parameter. If set, the HTTP-downloader should disconnect from
> the server when the timeout is reached, and simply send an errback to
> the deferred.
>
Thanks for the insight.

The problem is that many of the result-generating APIs do not have a
timeout option, for example XMLRPC calls, database queries, etc. What can
we do? Are you suggesting that we modify those APIs? I believe the
framework should offer some support for this.

I could imagine writing a TimerDeferred class, or something similar, that
wraps the original Deferred, handles the timeout, and prevents
AlreadyCalledError from being raised. The tricky part is that when those
timed-out calls eventually return through callback/errback, there should
be cancellation handlers to clean up the resources, such as lateCallback()
and lateErrback(). Of course, the simplest solution is to discard the
results (just as DeferredList does when fireOnOneCallback/fireOnOneErrback
is specified), but the right cleanup is protocol-specific.
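
A minimal sketch of the wrapper idea, written against the standard
library's asyncio rather than Twisted so that it runs on its own. The names
call_with_timeout and on_late are hypothetical; on_late stands in for the
lateCallback()/lateErrback() handlers mentioned above. This illustrates the
shape of the approach and is not a proposed Twisted API.

```python
import asyncio


async def call_with_timeout(awaitable, seconds, on_late=None):
    """Wait up to `seconds` for `awaitable`, else raise asyncio.TimeoutError.

    The underlying operation is not cancelled on timeout. If it finishes
    later, `on_late(future)` is called so the caller can release resources.
    """
    future = asyncio.ensure_future(awaitable)
    done, _pending = await asyncio.wait({future}, timeout=seconds)
    if future in done:
        # Finished in time: return the value, or re-raise its exception.
        return future.result()
    if on_late is not None:
        # Hypothetical late handler: receives the finished future.
        future.add_done_callback(on_late)
    else:
        # Default policy: silently discard the late result or error.
        future.add_done_callback(lambda f: f.cancelled() or f.exception())
    raise asyncio.TimeoutError()
```

The caller only ever sees one outcome (a result, an error, or the timeout),
which is the property that avoids a double-fire error; what to do with a
late result stays in the hands of whoever supplies on_late.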

Even with APIs that do have a timeout option, problems can arise when you
combine them. E.g., suppose I write a procedure that performs one
DNSLookup and one getPage() sequentially. I could specify a timeout of 2
seconds for each, but what I really want is for the total time to be less
than 4 seconds, in which case the DNS lookup could take longer than 2
seconds as long as getPage() is correspondingly faster. It is not terribly
hard to implement based on what we have now, but it would be nice if the
framework could calculate the correct timers for me.
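
One way to get a shared budget is to track a single deadline and ask it
for the time left before each step. The sketch below is plain Python; the
Deadline class and the clock parameter are hypothetical names, not part of
Twisted.

```python
import time


class Deadline:
    """A total time budget shared by several sequential operations."""

    def __init__(self, total, clock=time.monotonic):
        # `clock` is injectable so the class can be tested without sleeping.
        self._clock = clock
        self._expires = clock() + total

    def remaining(self):
        """Return the seconds left, or raise TimeoutError if none are."""
        left = self._expires - self._clock()
        if left <= 0:
            raise TimeoutError("time budget exhausted")
        return left
```

Each step would then be started with timeout=budget.remaining() at the
moment it begins, so a DNS lookup that takes 3 seconds leaves 1 second for
the page download instead of a fixed 2.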

Another form of "cancellation" is related to the chaining effect.
Suppose I want to start the following two jobs at the same time:
1: A()->B()->C()
2: D()->E()
However, I only need the result from one of them, i.e. as soon as one of
them has finished, the other job is useless and should be cancelled ASAP
to avoid wasting resources. Right now I'm using DeferredList, but it
doesn't cancel the other job. A similar requirement is that if either one
of them gets an errback, the other should be cancelled.
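
For comparison, here is how the "first one wins, cancel the rest" behaviour
can be sketched with the standard library's asyncio, where cancellation is
built into tasks. This is not Twisted code, and first_of is a hypothetical
helper name.

```python
import asyncio


async def first_of(*coros):
    """Run all coroutines concurrently and return the first outcome.

    As soon as one job finishes (successfully or with an exception),
    every job still running is cancelled.
    """
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(
        tasks, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()
    # Let the cancelled jobs run their cleanup before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    # If several finished together, prefer the one listed first.
    winner = next(t for t in tasks if t in done)
    return winner.result()  # re-raises if the winning job failed
```

Because result() re-raises, a job that fails first also causes the other
job to be cancelled, which covers the errback case as well.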

Anyway, being able to schedule flexible timers is, to me, a very big
advantage of async programming over multithreading, where signals, locks,
etc. could just drive me nuts. However, I still believe Twisted has the
potential to offer better framework support for this.

Yun