[Twisted-Python] Twisted receiving buffers swamped?

Tobias Oberstein tobias.oberstein at tavendo.de
Sat Jan 10 01:33:38 MST 2015


Hi glyph,

>>I get strange results.
>>Sluggish performance:

>Did you ever diagnose this further?  This seems like the sort of thing that we should start having a performance test for.

Not yet. I didn't reply again since you gave me enough homework already:

- Run the producer/consumer variant on Linux (bisecting BSD/kqueue)
- Do the memory profiling with non-producer/consumer (tracking down _where_ memory runs away)

Other stuff interrupted me again, and my impression is that it might take significant effort to really track this down. No surprise there: really pushing things often means "issues" pop up.

I absolutely agree: we should have repeatable, comparable, standard performance tests.

Like what we have with trial/buildbot, but for performance rather than functional tests.

FWIW, here are my thoughts on this: 


1)
A simple Twisted-based "TCP echo server" (maybe in non-producer/consumer and producer/consumer variants) as a testee will already allow us to do a _lot_.
We can come up with more testees later (e.g. Twisted Web serving a static resource, ...).
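
To make this concrete, here is a minimal sketch of the non-producer/consumer echo testee (the port number is an arbitrary choice; the producer/consumer variant would add registerProducer/pauseProducing on top of this):

    from twisted.internet import protocol, reactor

    class Echo(protocol.Protocol):
        # write every received chunk straight back to the peer
        def dataReceived(self, data):
            self.transport.write(data)

    class EchoFactory(protocol.Factory):
        def buildProtocol(self, addr):
            return Echo()

    reactor.listenTCP(9000, EchoFactory())
    reactor.run()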

2)
It might be wise to use a non-Twisted, standard load generator like netperf, instead of a Twisted-based one:
- having the load generator written in Twisted creates a cyclic dependency (e.g. regarding interpreting results)
- it allows comparing results to non-Twisted setups and lets others repeat the tests against their own stacks
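
Purely as an illustration of how such runs could be scripted (the actual traffic still comes from netperf, the Python here is only glue), a tiny hypothetical driver: the host address and run duration are assumptions, only netperf's standard -H/-t/-l options are used, and netperf of course expects its netserver counterpart running on the remote box:

    import subprocess

    TESTEE_HOST = "10.0.0.2"   # assumed address of the testee box
    DURATION = 60              # seconds per run, arbitrary choice

    def run_netperf(test="TCP_STREAM"):
        # -H: remote host, -t: test type, -l: test length in seconds
        cmd = ["netperf", "-H", TESTEE_HOST, "-t", test, "-l", str(DURATION)]
        return subprocess.check_output(cmd, universal_newlines=True)

    if __name__ == "__main__":
        print(run_netperf())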

3)
We should include at least 2 operating systems (FreeBSD / Linux).
This lets us quickly bisect OS- or Twisted-reactor-specific issues.
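
For bisecting, it helps if the testee pins its reactor explicitly rather than relying on the default; a sketch, using Twisted's kqreactor/epollreactor install functions:

    import sys

    if sys.platform.startswith("freebsd"):
        from twisted.internet import kqreactor
        kqreactor.install()
    else:
        from twisted.internet import epollreactor
        epollreactor.install()

    # must be imported only after the install() call above
    from twisted.internet import reactor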

4)
We should run this on real, physical, non-virtualized, dedicated hardware and networking gear.
I can't stress enough how important this is in my experience:
Any form of virtualization brings a whole extra dimension of factors and variability into the game.
Testing in VMs on a shared hypervisor in a public cloud: you never really know what else is running, and you can never really reproduce results.
Repeatability is absolutely crucial.

5)
The load generator and the testee should run on 2 separate boxes, connected via a real network (e.g. switched Ethernet).
Testing via loopback is often misleading, and often irrelevant in practice (too far away from production deployments).

6)
We should test on both CPython and PyPy.
That is where the stuff actually runs later in production, and it lets us bisect Python-implementation-specific issues.

7)
It should be automated.

8)
The results should be stored in a long-term archive (a database) so we can compare results across time and setups.
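
Just to illustrate one possible shape of such an archive, a sketch using sqlite3 (the table layout here is purely an assumption):

    import sqlite3, time

    conn = sqlite3.connect("perf_results.db")
    conn.execute("""
        CREATE TABLE IF NOT EXISTS runs (
            run_id     INTEGER PRIMARY KEY AUTOINCREMENT,
            timestamp  REAL,   -- time.time() at start of run
            testee     TEXT,   -- e.g. 'tcp-echo-nonproducer'
            os         TEXT,   -- e.g. 'FreeBSD 10.1', 'Linux 3.18'
            python     TEXT,   -- e.g. 'CPython 2.7.9', 'PyPy 2.4'
            reactor    TEXT,   -- e.g. 'kqueue', 'epoll'
            throughput REAL,   -- as reported by the load generator
            cpu_load   REAL    -- average CPU % on the testee box
        )
    """)

    def store(testee, os_, python, reactor_, throughput, cpu_load):
        conn.execute(
            "INSERT INTO runs (timestamp, testee, os, python, reactor, "
            "throughput, cpu_load) VALUES (?, ?, ?, ?, ?, ?, ?)",
            (time.time(), testee, os_, python, reactor_, throughput, cpu_load))
        conn.commit()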

9)
We should collect monitoring parameters (CPU load, ...) on both the load generator and testee boxes during test runs.
E.g. to spot cases like "same network performance, but one setup triggers double the CPU load".

===

Because of 3), 4) and 5), this requires 4 boxes to begin with (a load generator and a testee box for each of the two operating systems). Those should be absolutely _identical_.

Currently, we (Tavendo) have a setup dedicated to performance tests, consisting of 2 boxes with dual-port 10GbE NICs and an 8-port 10GbE switch.

Buying 2 more identical boxes and adding those would be technically possible. Implementing 7), 8) and 9) and setting all of this up is real work, though.

I would need to somehow justify/book these investments. I have "ideas" about that, but step by step: what do you think about the above?

/Tobias




