[Twisted-Python] Questions about the very nice AMP protocol

Brian Granger ellisonbg.net at gmail.com
Thu Nov 16 23:23:43 MST 2006


Hi,

I currently use Perspective Broker for a number of projects.  As time
has gone by I have really come to appreciate having a full two-way
network protocol.  But my company has lots of Java programmers, and
they do lots of "serious" (read, pain in the ass) web services and
grid services stuff.  I would like to be able to get the many Python
things I have playing nicely with the many Java things floating around
here.

Thus, AMP is extremely attractive.  There is one problem that I have,
though.  We do high-performance scientific computing and deal with
extremely large (tera/petabyte) data sets, so we need network
protocols that can send large amounts of data around.  AMP's focus on
small messages thus presents a problem.  There are really two use
cases that I have in mind:

1.  Sending larger (maybe hundreds of MB) objects around that do fit
in memory.  These can be serialized easily (without creating a big
pickle), but I need to make sure that Twisted doesn't make extra
copies of them during the transfer.

2.  Sending even bigger things that don't fit into memory.
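
To make the two cases concrete, here is a rough sketch in plain Python
(no Twisted or AMP code involved).  It assumes that an AMP value is
length-prefixed with 16 bits, so a single value holds at most 65535
bytes.  The names MAX_VALUE_LENGTH, iter_memory_chunks and
iter_file_chunks are made up for illustration.  The first helper slices
an in-memory buffer without copying it; the second reads a file piece
by piece so the whole thing never has to fit in memory.

```python
import io

# Assumption: AMP length-prefixes each value with 16 bits, so one value
# can carry at most 65535 bytes.
MAX_VALUE_LENGTH = 0xFFFF


def iter_memory_chunks(data, chunk_size=MAX_VALUE_LENGTH):
    """Yield zero-copy slices of an in-memory, bytes-like object."""
    # Slicing a memoryview creates a new view, not a copy of the bytes.
    view = memoryview(data).cast("B")
    for start in range(0, len(view), chunk_size):
        yield view[start:start + chunk_size]


def iter_file_chunks(fileobj, chunk_size=MAX_VALUE_LENGTH):
    """Yield chunks from a binary file object without loading all of it."""
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk


if __name__ == "__main__":
    payload = bytearray(200000)
    print([len(c) for c in iter_memory_chunks(payload)])
    print([len(c) for c in iter_file_chunks(io.BytesIO(b"x" * 70000))])
```

Whether the transport then copies each chunk when it is written is a
separate question that this sketch does not answer.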

Any thoughts on the best way to address these questions using AMP?
Here are my thoughts:

1.  Use a multi-connection approach like FTP does: AMP for control and
a second connection for the binary data.  It would be easy to use
producers/consumers on the data connection to handle the large-data
problems above.  I don't like this because I often need to tunnel the
protocol through firewalls over ssh, and two connections make that
unpleasant.

2.  Use AMP's inner protocol to run two protocols simultaneously.  My
understanding is that AMP doesn't support switching back and forth
between AMP and its inner protocol.  Would it be crazy to try this
approach?

3.  Try to modify AMP itself to handle the large objects by
registering Producers with the underlying transport.
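
As a rough picture of what option 3 might involve, here is a sketch of
a pull-style producer in plain Python.  The method names
(resumeProducing, stopProducing, registerProducer, unregisterProducer,
write) follow Twisted's producer/consumer interfaces, but nothing here
imports Twisted or touches AMP.  ChunkProducer and RecordingConsumer
are hypothetical names, and RecordingConsumer is only a stand-in for a
real transport.

```python
import io


class ChunkProducer:
    """Pull-style producer: writes one chunk each time it is resumed.

    Method names mirror Twisted's IPullProducer, but this class does
    not depend on Twisted.
    """

    def __init__(self, fileobj, consumer, chunk_size=0xFFFF):
        self.fileobj = fileobj
        self.consumer = consumer
        self.chunk_size = chunk_size

    def start(self):
        # streaming=False asks the consumer to pull: it calls
        # resumeProducing() whenever it is ready for more data.
        self.consumer.registerProducer(self, False)

    def resumeProducing(self):
        chunk = self.fileobj.read(self.chunk_size)
        if chunk:
            self.consumer.write(chunk)
        else:
            # End of data: detach so the consumer stops asking.
            self.consumer.unregisterProducer()

    def stopProducing(self):
        # Called if the consumer goes away before the data runs out.
        self.fileobj.close()


class RecordingConsumer:
    """Hypothetical stand-in for a transport; records what it receives."""

    def __init__(self):
        self.producer = None
        self.received = []

    def registerProducer(self, producer, streaming):
        self.producer = producer

    def unregisterProducer(self):
        self.producer = None

    def write(self, data):
        self.received.append(data)

    def drain(self):
        # A real transport would call resumeProducing() only when its
        # send buffer has room; here we simply loop until detached.
        while self.producer is not None:
            self.producer.resumeProducing()


if __name__ == "__main__":
    consumer = RecordingConsumer()
    ChunkProducer(io.BytesIO(b"a" * 10), consumer, chunk_size=4).start()
    consumer.drain()
    print(consumer.received)
```

The point of the pull style is that only one chunk is in flight at a
time, so memory use stays bounded no matter how large the source is.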

It may sound like I just want something like FTP, but I also need to
send lots of application-specific control messages as well, and these
really need to be two-way.

Any thoughts would be greatly appreciated.

Brian



