[Twisted-Python] epoll reactor problems
Thomas Hervé
therve at free.fr
Wed Apr 11 05:47:11 MDT 2007
Quoting Alec Matusis <matusis at matusis.com>:
>> That's old (debian stable ? :)). I don't say that'll solve your problem, but
>> you could try with 2.4.4 (warning, not 2.4.3).
>
> It's SuSE stable ;-) Our stuff on that machine is pretty convoluted now, so
> we will probably have a chance to test with 2.4.4 only in a week, when we
> add a brand new server with 2.4.4.
OK. That is just another thing to try; I don't see any obvious reason why it
would work better on 2.4.4, but...
> I noticed a difference between this from the 99.9% CPU server:
>
> epoll_wait(4, {{EPOLLERR|EPOLLHUP, {u32=423, u64=12304606485815493031}},
> {EPOLLIN|EPOLLPRI|EPOLLOUT|EPOLLRDNORM|EPOLLRDBAND|EPOLLWRNORM|EPOLLWRBAND|EPOLLMSG|EPOLLERR|EPOLLHUP|0x7820,
> {u32=5529648, u64=5529648}},
> {EPOLLIN|EPOLLPRI|EPOLLRDNORM|EPOLLRDBAND|EPOLLMSG|0x1000, {u32=0,
> u64=22827751178240}}, {0, {u32=0, u64=0}},
> {EPOLLOUT|EPOLLERR|EPOLLONESHOT|EPOLLET|0x3fffa820, {u32=32767,
> u64=18097643565645823}}}, 1432, 68) = 5
>
> and this from a normal server running at 5% CPU:
>
> epoll_wait(4, {{EPOLLIN, {u32=1769, u64=12304606485815494377}}, {0,
> {u32=4294944684, u64=140737488332716}}}, 1728, 17) = 2
>
> What does this mean?
The flags set on your sockets are generally EPOLLIN or EPOLLOUT: data to read,
or room available to write. I don't know much about the other flags. EPOLLERR
is set when an error condition is pending on the fd, for example after the
connection has been closed. EPOLLET is *highly* suspect, because it should only
be there if set in the user code. The documentation of the other flags is
really terse...
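To make the strace masks easier to read, here is a small sketch that splits a raw event mask into flag names. The bit values are the ones from the Linux <sys/epoll.h> header; the table only covers the flags strace named in the output quoted above, and the names EPOLL_FLAGS and decode_events are made up for this example, not part of Twisted.

```python
# Bit values from the Linux <sys/epoll.h> header. Only the flags that strace
# named in the quoted output are listed, in the order strace prints them.
EPOLL_FLAGS = [
    ("EPOLLIN", 0x001),
    ("EPOLLPRI", 0x002),
    ("EPOLLOUT", 0x004),
    ("EPOLLRDNORM", 0x040),
    ("EPOLLRDBAND", 0x080),
    ("EPOLLWRNORM", 0x100),
    ("EPOLLWRBAND", 0x200),
    ("EPOLLMSG", 0x400),
    ("EPOLLERR", 0x008),
    ("EPOLLHUP", 0x010),
    ("EPOLLONESHOT", 1 << 30),
    ("EPOLLET", 1 << 31),
]


def decode_events(mask):
    """Return the flag names set in mask, plus any leftover bits as hex."""
    names = [name for name, bit in EPOLL_FLAGS if mask & bit]
    known = 0
    for _, bit in EPOLL_FLAGS:
        known |= bit
    leftover = mask & ~known
    if leftover:
        names.append(hex(leftover))
    return names


if __name__ == "__main__":
    # The last event of the 99.9% CPU trace, rebuilt as a single integer.
    print("|".join(decode_events(0xFFFFA82C)))
```

A mask where nearly every bit is set, including bits with no name, looks more like uninitialised memory than like something the kernel reported for a real socket.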
>> What's the global state of the process? Memory, number of opened fd ?
>
> We immediately reverted to poll, so I do not have it in front of me. The RSS
> size was 45MB, and the number of open fd I do not know: it should have been
> about 1500, but I did not check.
Hum... it may come from running out of file descriptors, so you'd better check
your settings for this.
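A quick way to check this from inside the process, as a sketch only: it assumes a Unix system for the resource module and a Linux-style /proc for the count. The helper names fd_limits and count_open_fds are invented for the example.

```python
import os
import resource  # Unix only


def fd_limits():
    """Return the (soft, hard) limit on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def count_open_fds(proc_fd_dir="/proc/self/fd"):
    """Count the fds open in this process, or None if /proc is unavailable.

    The count includes the descriptor used to read the directory itself.
    """
    try:
        return len(os.listdir(proc_fd_dir))
    except OSError:
        return None


if __name__ == "__main__":
    soft, hard = fd_limits()
    print("fd limit: soft=%s hard=%s, open now: %s"
          % (soft, hard, count_open_fds()))
```

If the open count is close to the soft limit (about 1500 fds against a default limit of 1024 would be a problem, for instance), that is worth fixing before looking any further.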
> I can do another test run with epoll in about 20hrs, since I do not want to
> upset users too much.
Of course :).
> If you have some specific data I should get from the
> test run, please let me know now.
Any information would be useful. The most useful thing would be to know when it
begins to act strangely, and whether something happened at that moment.
Otherwise: number of fds, memory, netstat output, strace output...
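To pin down when the CPU usage jumps, one option is to log the process's own CPU time at regular intervals and compare it with wall-clock time. This is a minimal stdlib sketch; snapshot and cpu_fraction are hypothetical helper names.

```python
import os
import time


def snapshot():
    """Return (wall_clock_seconds, cpu_seconds_used_by_this_process)."""
    t = os.times()
    return time.time(), t.user + t.system


def cpu_fraction(before, after):
    """Fraction of one CPU used between two snapshots (0.0 if no time passed)."""
    wall = after[0] - before[0]
    cpu = after[1] - before[1]
    return cpu / wall if wall > 0 else 0.0


if __name__ == "__main__":
    previous = snapshot()
    time.sleep(1)
    current = snapshot()
    print("%s cpu=%.1f%%" % (time.ctime(current[0]),
                             100 * cpu_fraction(previous, current)))
```

In a Twisted process the same two functions could be called from a twisted.internet.task.LoopingCall every few seconds, so that the log shows the moment the value goes from around 5% to nearly 100%.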
--
Thomas