[Twisted-Python] Re: Twisted and the Posh Module

Bob Ippolito bob at redivi.com
Mon Mar 14 13:42:55 MST 2005


On Mar 14, 2005, at 3:24 PM, Ken Kinder wrote:

> Ed Suominen wrote:
>
>> I think having some sort of process pool management in Twisted is a 
>> great idea, especially with multi-core CPUs emerging on the scene. I 
>> have access to a dual-core Pentium Prescott CPU and it would be 
>> great to have that available to keep both cores humming on a certain 
>> CPU-intensive project I'm considering.
>>
>> However, I'm not sure whether the best way to go about it would be with the 
>> posh module, pypar (see http://datamining.anu.edu.au/~ole/pypar/), 
>> or just using Perspective Broker as an underlying message-passing 
>> mechanism with UNIX sockets and/or TCP. One thought might be to have 
>> a single master process start up and act as a PB server and process 
>> pool manager. Subsidiary processes could then make authenticated PB 
>> connections to the server to "volunteer" for work in the process 
>> pool.
>> Note that pypar lets you easily find out how many CPUs you have 
>> under kernel control, with pypar.size(). Thus, the main process 
>> could start the process pool by spawning a subsidiary "volunteer" 
>> process for each CPU core present.
>>
> My goal is somewhat similar in that I have multiple CPUs that aren't 
> being used very much, but the other problem is that certain libraries 
> I use (PIL) grab the GIL for long-running operations like image 
> modifications. I can thread them off, but it still liberally acquires 
> the GIL and makes the server unresponsive for the duration of the 
> operation.
>
> I too had looked at using Perspective Broker to communicate with 
> separate "worker" processes, and the only reason I'm not excited about 
> that option is that the bandwidth between the master and worker 
> processes involves a lot of large binary strings. Shared memory 
> seemed more efficient than PB for transferring those strings.
>
> So for that reason, finding out how many CPUs I have isn't that 
> important, because I'll still want more worker processes than I have 
> CPUs.

Ideally we'd have a message-passing system that could use multiple 
backends (e.g. shared memory, mmap, or sockets).  Using sockets is 
probably a better solution for now -- you're likely to do a lot of 
copying anyway, because it's Python and PIL :)

With sockets, you can scale right up to multiple computers; with shared 
memory, you're stuck on a single box.  The API that POSH exposes 
(proxied non-blocking objects) can't scale well to multiple machines, 
where a socket-based API could be scaled down to actually use an 
efficient shared memory implementation at some point.
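
To make the socket-based idea concrete, here is a minimal sketch of shipping
large binary strings between a master and a worker over a socket pair. None of
it comes from Twisted, PB, or POSH: the 4-byte length prefix, the helper names
(send_msg, recv_msg, run_job), and the byte-reversing "worker" are all
assumptions made for illustration. The worker runs in a thread only to keep the
example self-contained; in practice it would be a separate process, possibly on
another machine.

```python
import socket
import struct
import threading

# Assumed framing: each message is a 4-byte big-endian length, then the payload.
HEADER = struct.Struct("!I")


def send_msg(sock, payload):
    """Send one length-prefixed binary message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes, raising if the peer closes early."""
    chunks = []
    while n:
        chunk = sock.recv(min(n, 65536))
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Receive one length-prefixed binary message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def worker(sock):
    """Hypothetical worker: reverses the bytes, standing in for image work."""
    data = recv_msg(sock)
    send_msg(sock, data[::-1])
    sock.close()


def run_job(payload):
    """Ship payload to a worker over a socket pair and return its reply."""
    master_end, worker_end = socket.socketpair()
    t = threading.Thread(target=worker, args=(worker_end,))
    t.start()
    try:
        send_msg(master_end, payload)
        return recv_msg(master_end)
    finally:
        master_end.close()
        t.join()
```

Because callers only ever see send_msg and recv_msg, the same interface could
later sit on top of a TCP connection to another box, or on a shared memory
segment on a single box, without changing the code that submits work.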

-bob





