[Twisted-Python] strange server crash
Alec Matusis
matusis at yahoo.com
Sat Mar 21 21:02:03 MDT 2009
I have been operating a busy twisted TCP server for quite some time (1
year+).
Yesterday, it strangely crashed under peak load for that day (it has been
under more load in the past).
# grep segfault /var/log/messages
Mar 20 19:12:15 serv2 kernel: [17687209.144548] twistd[10701]: segfault at 8
rip 41eccf rsp 7fff33675270 error 6
I cannot understand the reason of the crash, except that its some very rare
race condition.
While the code is quite cumbersome, here are some general details:
8 min before the crash, the server started giving repeated error on clients
disconnect, that I have never seen before in the last year.
It first produced this twice:
2009/03/20 19:04 -0700 [ClientProtocol,318930,70.132.227.150] Unhandled
Error
Traceback (most recent call last):
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/application/a
pp.py", line 113, in runReactorWithLogging
reactor.run()
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/posi
xbase.py", line 220, in run
self.mainLoop()
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/posi
xbase.py", line 231, in mainLoop
self.doIteration(t)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/epol
lreactor.py", line 181, in doPoll
log.callWithLogger(selectable, _drdw, selectable, fd, event)
--- <exception caught here> ---
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/python/log.py
", line 48, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/python/log.py
", line 33, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/python/contex
t.py", line 59, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args,
**kw)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/python/contex
t.py", line 37, in callWithContext
return func(*args,**kw)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/epol
lreactor.py", line 212, in _doReadOrWrite
self._disconnectSelectable(selectable, why, inRead)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/posi
xbase.py", line 252, in _disconnectSelectable
selectable.readConnectionLost(f)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/tcp.
py", line 405, in readConnectionLost
self.connectionLost(reason)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/tcp.
py", line 416, in connectionLost
protocol.connectionLost(reason)
File "/nail/live/im/myserv/protocols.py", line 170, in
connectionLost
self.session.logout()
File "/nail/live/im/myserv/service.py", line 1199, in logout
if self.nick in session.sellers:
exceptions.AttributeError: Deferred instance has no attribute
'sellers'
and then a different error in the same line with the same call stack, that
was repeated about 1000 times for different logging out clients:
2009/03/20 19:04 -0700 [ClientProtocol,326158,99.245.246.187] Unhandled
Error
Traceback (most recent call last):
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/application/a
pp.py", line 113, in runReactorWithLogging
reactor.run()
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/posi
xbase.py", line 220, in run
self.mainLoop()
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/posi
xbase.py", line 231, in mainLoop
self.doIteration(t)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/epol
lreactor.py", line 181, in doPoll
log.callWithLogger(selectable, _drdw, selectable, fd, event)
--- <exception caught here> ---
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/python/log.py
", line 48, in callWithLogger
return callWithContext({"system": lp}, func, *args, **kw)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/python/log.py
", line 33, in callWithContext
return context.call({ILogContext: newCtx}, func, *args, **kw)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/python/contex
t.py", line 59, in callWithContext
return self.currentContext().callWithContext(ctx, func, *args,
**kw)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/python/contex
t.py", line 37, in callWithContext
return func(*args,**kw)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/epol
lreactor.py", line 212, in _doReadOrWrite
self._disconnectSelectable(selectable, why, inRead)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/posi
xbase.py", line 252, in _disconnectSelectable
selectable.readConnectionLost(f)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/tcp.
py", line 405, in readConnectionLost
self.connectionLost(reason)
File
"/nail/encap/Twisted-2.5.0/lib/python2.4/site-packages/twisted/internet/tcp.
py", line 416, in connectionLost
protocol.connectionLost(reason)
File "/nail/live/im/myserv/protocols.py", line 170, in
connectionLost
self.session.logout()
File "/nail/live/im/myserv/service.py", line 1199, in logout
if self.nick in session.sellers:
exceptions.AttributeError: WLResultSet instance has no attribute
'sellers'
This last error repeated about 1000 times for different clients, after which
the segfault occurred.
The problem is, that session is an instance of a Session class that is
defined in the code, it is NEITHER a Deferred NOR an instance of
WLResultSet class- these are completely different classes, and theres no
possible way in the code for variable session to become an instance of
Deferred.
The code is approximately like this:
def logout(self):
self.on_logout(self.do_logout)
for session in self.service.client_sessions: ß self.service.client_sessions
is a dictionary of custom Session class instances
if self.nick in session.sellers: ß error here, it claims that session is a
Deferred or sometimes an instance of a completely unrelated class
#Some statements
The function self.on_logout sometimes (rarely) dispatches a deferred (it has
maybeDeferred in it), such that its argument self.on_logout is called when
that deferred returns (otherwise its called synchronously).
My question is, is it possible to muck up deferred in such a way, that after
dispatching a deferred chain inside self.on_logout(self.do_logout) , the
remaining statements would be executed in some unexpected local scope, such
that I would see these weird errors where my for loop iterator would
suddenly go over Deferred instances, instead of the dictionary of Session
instances that I define in the code ??
Also, why did these errors lead to segfault of the whole twisted process
after 8 minutes?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: </pipermail/twisted-python/attachments/20090321/825facf2/attachment.html>
More information about the Twisted-Python
mailing list