[Twisted-Python] How do I debug this network problem?

exarkun at twistedmatrix.com exarkun at twistedmatrix.com
Fri Nov 14 13:36:07 MST 2014


On 05:38 pm, peter.westlake at pobox.com wrote:
>
>Here is some highly suspicious-looking output from strace:
>
>sendto(5, "\0\4_ask\0\004867c\0\10_command\0\tLogOutpu"..., 343, 0,
>NULL, 0) = 343 recvfrom(5,
>"\0\7_answer\0\004867c\0\6status\0\4True\0\0", 65536, 0, NULL, NULL) =
>31 sendto(5, "\0\4_ask\0\004867d\0\10_command\0\tLogOutpu"..., 183, 0,
>NULL, 0) = 183 sendto(5, "\0\4_ask\0\004867e\0\10_command\0\4Ping\0\0",
>30, 0, NULL, 0) = 30 recvfrom(5, 0x7fec3005c6d4, 65536, 0, 0, 0) = -1
>EAGAIN (Resource temporarily unavailable) sendto(5,
>"\0\4_ask\0\004867f\0\10_command\0\4Ping\0\0", 30, 0, NULL, 0) = 30
>recvfrom(5, "\0\7_answer\0\004867f\0\3ack\0\4True\0\0", 65536, 0, NULL,
>NULL) = 28
>
>Requests 867d and 867e never receive their callbacks.

But in this trace, it looks like those two calls also never received a 
response from the server - so it's correct that the callbacks never 
fire.

Keep in mind this is a totally valid AMP interaction.  The reason these 
things have unique identifiers is so responses can arrive out of order.

Earlier you mentioned you saw tcpdump reporting that the response data 
corresponding to the unfired Deferreds was arriving.  Can you double 
check that?  From this `strace` output I would guess that it isn't.  The 
alternative explanation is a terrible, data-losing bug in the kernel's 
TCP stack...
>I see that twisted.internet.tcp.Connection.doRead() does check for
>EWOULDBLOCK (=EAGAIN), and returns None. Shouldn't it try to read 
>again?

It will try again - later, when select or epoll or whatever tells it 
that there's a better chance of some data coming out.

Jean-Paul




More information about the Twisted-Python mailing list